Open sven-lange opened 8 months ago
Maybe there is more at work here... adding the output flag gets rid of the following error, but no final model is written?
Traceback (most recent call last):
File "/programs/x86_64-linux/topaz/0.2.5_cu11.2/bin/topaz", line 33, in
I experience the exact same issue.
I also thought this would be an easy fix, but apparently there is more to it. Training works well, but somehow the path variable is never properly defined.
This is annoying since it seems so minor: training runs through, just the weights are not saved...
I have just checked back on it.
Turns out the models (.sav files) are still written despite the error thrown after the last epoch has run through. The error is odd: it also searches the system-wide python library instead of being limited to the conda environment, at least when run in a Jupyter notebook via !command. When run in a terminal with the environment active, it throws the same error but stays within the conda environment, which I had expected for the notebook as well, since the kernel is connected to the environment.
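A quick way to check which installation is actually being picked up (my own diagnostic suggestion, not from the thread) is to compare the resolved executable and module path in the notebook (prefixed with !) and in the activated terminal:

```shell
which topaz                                            # which topaz executable is on PATH
python -c "import sys, topaz; print(sys.executable, topaz.__file__)"   # which interpreter/module resolves
```

If the notebook and terminal print different paths, the kernel is not using the conda environment you think it is.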
Anyways, despite the errors, the models are written out and can be used as input for subsequent denoising.
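For anyone else hitting this: the saved .sav checkpoint can then be fed back into topaz denoise for inference. A minimal sketch, assuming the -m/--model option accepts a path to a saved model; the file and directory names here are placeholders:

```shell
# Load the trained checkpoint (file name is a placeholder) and denoise micrographs
topaz denoise \
    -m trained_models/model_epoch100.sav \
    -o denoised/ \
    micrographs/*.mrc
```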
The lack of a response within 3 weeks is unusual; as far as I have seen, the Topaz developers are usually quite active over here...
Can you share the exact command you ran and your python environment (e.g., conda list)?
Here is the command that I run that produces the error:
!topaz denoise \
    -a /Data/erc-3/ddannecker/noise_analysis/output/idpc/denoised_movies/topaz/even_set/ \
    -b /Data/erc-3/ddannecker/noise_analysis/output/idpc/denoised_movies/topaz/odd_set/ \
    -d 1 \
    --save-prefix /Data/erc-3/ddannecker/noise_analysis/output/idpc/denoised_movies/topaz/trained_models/ \
    --preload
The command is run inside the Jupyter notebook you provide for the tutorial on Noise2Noise (n2n) style denoising.
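For what it's worth (my own suggestion, untested): since the first comment notes that adding the output flag suppresses the error, the call could be extended with an explicit output directory. I am assuming -o/--output is the relevant flag in this topaz version, and the denoised_out directory name is hypothetical:

```shell
!topaz denoise \
    -a /Data/erc-3/ddannecker/noise_analysis/output/idpc/denoised_movies/topaz/even_set/ \
    -b /Data/erc-3/ddannecker/noise_analysis/output/idpc/denoised_movies/topaz/odd_set/ \
    -d 1 \
    --save-prefix /Data/erc-3/ddannecker/noise_analysis/output/idpc/denoised_movies/topaz/trained_models/ \
    --preload \
    -o /Data/erc-3/ddannecker/noise_analysis/output/idpc/denoised_movies/topaz/denoised_out/
```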
Output:
using device=1 with cuda=True
training with 30 image pairs
validating on 3 image pairs
epoch loss_train loss_val
1 0.06858793869614602 0.03378116339445114
2 0.033914549400409055 0.03444984555244446
3 0.03185074913005035 0.028351619839668274
4 0.03078408936659495 0.02774738147854805
5 0.029715640718738237 0.027392005547881126
6 0.029072204977273943 0.028292417526245117
7 0.029457507282495497 0.026968898251652718
8 0.029775951181848843 0.026812613010406494
9 0.028571538378794985 0.026883115991950035
10 0.028614555423458417 0.026838932186365
...
Traceback (most recent call last):
File "/programs/x86_64-linux/topaz/0.2.5_cu11/bin/topaz", line 11, in
The environment I am using is the one suggested in the installation instructions for topaz.
Here is the output of conda list (after activating the respective topaz environment):
packages in environment at /Data/erc-3/ddannecker/miniconda3/envs/topaz:
Name Version Build Channel
_libgcc_mutex 0.1 main
_openmp_mutex 5.1 1_gnu
backcall 0.2.0 pyhd3eb1b0_0
blas 1.0 mkl
bzip2 1.0.8 h7b6447c_0
ca-certificates 2023.12.12 h06a4308_0
certifi 2021.5.30 py36h06a4308_0
cudatoolkit 11.3.1 h2bc3f7f_2
cycler 0.11.0 pyhd3eb1b0_0
dataclasses 0.8 pyh4f3eec9_6
dbus 1.13.18 hb2f20db_0
decorator 5.1.1 pyhd3eb1b0_0
entrypoints 0.3 py36_0
expat 2.5.0 h6a678d5_0
ffmpeg 4.3 hf484d3e_0 pytorch
fontconfig 2.14.1 h52c9d5c_1
freetype 2.12.1 h4a9f257_0
future 0.18.2 py36_1
giflib 5.2.1 h5eee18b_3
glib 2.69.1 h4ff587b_1
gmp 6.2.1 h295c915_3
gnutls 3.6.15 he1e5248_0
gst-plugins-base 1.14.1 h6a678d5_1
gstreamer 1.14.1 h5eee18b_1
icu 58.2 he6710b0_3
intel-openmp 2022.1.0 h9e868ea_3769
ipykernel 5.3.4 py36h5ca1d4c_0
ipython 7.16.1 py36h5ca1d4c_0
ipython_genutils 0.2.0 pyhd3eb1b0_1
jedi 0.17.2 py36h06a4308_1
joblib 1.0.1 pyhd3eb1b0_0
jpeg 9e h5eee18b_1
jupyter_client 7.1.2 pyhd3eb1b0_0
jupyter_core 4.8.1 py36h06a4308_0
kiwisolver 1.3.1 py36h2531618_0
lame 3.100 h7b6447c_0
lcms2 2.12 h3be6417_0
ld_impl_linux-64 2.38 h1181459_1
lerc 3.0 h295c915_0
libdeflate 1.17 h5eee18b_1
libffi 3.3 he6710b0_2
libgcc-ng 11.2.0 h1234567_1
libgfortran-ng 7.5.0 ha8ba4b0_17
libgfortran4 7.5.0 ha8ba4b0_17
libgomp 11.2.0 h1234567_1
libiconv 1.16 h7f8727e_2
libidn2 2.3.4 h5eee18b_0
libpng 1.6.39 h5eee18b_0
libsodium 1.0.18 h7b6447c_0
libstdcxx-ng 11.2.0 h1234567_1
libtasn1 4.19.0 h5eee18b_0
libtiff 4.5.1 h6a678d5_0
libunistring 0.9.10 h27cfd23_0
libuuid 1.41.5 h5eee18b_0
libuv 1.44.2 h5eee18b_0
libwebp 1.2.4 h11a3e52_1
libwebp-base 1.2.4 h5eee18b_1
libxcb 1.15 h7f8727e_0
libxml2 2.9.14 h74e7548_0
lz4-c 1.9.4 h6a678d5_0
matplotlib 3.3.4 py36h06a4308_0
matplotlib-base 3.3.4 py36h62a2d02_0
mkl 2020.2 256
mkl-service 2.3.0 py36he8ac12f_0
mkl_fft 1.3.0 py36h54f3939_0
mkl_random 1.1.1 py36h0573a6f_0
ncurses 6.4 h6a678d5_0
nest-asyncio 1.5.1 pyhd3eb1b0_0
nettle 3.7.3 hbbd107a_1
numpy 1.19.2 py36h54aff64_0
numpy-base 1.19.2 py36hfa32c7d_0
olefile 0.46 pyhd3eb1b0_0
openh264 2.1.1 h4ff587b_0
openssl 1.1.1w h7f8727e_0
pandas 1.1.5 py36ha9443f7_0
parso 0.7.0 py_0
pcre 8.45 h295c915_0
pexpect 4.8.0 pyhd3eb1b0_3
pickleshare 0.7.5 pyhd3eb1b0_1003
pillow 8.3.1 py36h5aabda8_0
pip 21.2.2 py36h06a4308_0
prompt-toolkit 3.0.20 pyhd3eb1b0_0
ptyprocess 0.7.0 pyhd3eb1b0_2
pygments 2.11.2 pyhd3eb1b0_0
pyparsing 3.0.4 pyhd3eb1b0_0
pyqt 5.9.2 py36h05f1152_2
python 3.6.13 h12debd9_1
python-dateutil 2.8.2 pyhd3eb1b0_0
pytorch 1.10.2 py3.6_cuda11.3_cudnn8.2.0_0 pytorch
pytorch-mutex 1.0 cuda pytorch
pytz 2021.3 pyhd3eb1b0_0
pyzmq 22.2.1 py36h295c915_1
qt 5.9.7 h5867ecd_1
readline 8.2 h5eee18b_0
scikit-learn 0.24.2 py36ha9443f7_0
scipy 1.5.2 py36h0b6359f_0
setuptools 58.0.4 py36h06a4308_0
sip 4.19.8 py36hf484d3e_0
six 1.16.0 pyhd3eb1b0_1
sqlite 3.41.2 h5eee18b_0
threadpoolctl 2.2.0 pyh0d69192_0
tk 8.6.12 h1ccaba5_0
topaz 0.2.5 py_0 tbepler
torchvision 0.11.3 py36_cu113 pytorch
tornado 6.1 py36h27cfd23_0
traitlets 4.3.3 py36h06a4308_0
typing_extensions 4.1.1 pyh06a4308_0
wcwidth 0.2.5 pyhd3eb1b0_0
wheel 0.37.1 pyhd3eb1b0_0
xz 5.4.5 h5eee18b_0
zeromq 4.3.4 h2531618_0
zlib 1.2.13 h5eee18b_0
zstd 1.5.5 hc292b87_0
Again, the models are ultimately saved and ready to use, but somehow the path variable is not defined.
I believe this is already fixed on main, but we haven't pushed that out in a new version on conda/pip. Have you tried installing from source?
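For reference, installing from source is typically a one-liner (standard pip-from-git; assuming the usual repository location for Topaz):

```shell
# Install the current main branch into the active conda environment
pip install git+https://github.com/tbepler/topaz.git
```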
Ah, okay.
No, I haven't, but I can certainly try. However, since it still works and by now I have my denoised micrographs, I am good in that regard. This was mainly to bring the issue to your attention. But yes, I shall try it and get back to you. Thanks for looking into this.
Hi - just a minor thing, but the --output flag is missing in the Topaz GUI for training new denoise models (there is only a --save_prefix flag). It's easy enough to fix, just a bit annoying, as training runs through all epochs and only fails to save the output at the very end. BW Sven