tbepler / topaz

Pipeline for particle picking in cryo-electron microscopy images using convolutional neural networks trained from positive and unlabeled examples. Also featuring micrograph and tomogram denoising with DNNs.
GNU General Public License v3.0

Topaz training error in multiprocessing data loader #173

Open tbepler opened 1 year ago

tbepler commented 1 year ago

Discussed in https://github.com/tbepler/topaz/discussions/172

Originally posted by **wuhucryoem** July 24, 2023

When I run Topaz training, it fails with an error saying that a file or directory does not exist, but it does not say which file or directory. The traceback is:

Traceback (most recent call last):
  File "/home/amax/miniconda3/envs/topaz/bin/topaz", line 33, in <module>
    sys.exit(load_entry_point('topaz-em==0.2.5', 'console_scripts', 'topaz')())
  File "/home/amax/miniconda3/envs/topaz/lib/python3.6/site-packages/topaz/main.py", line 148, in main
    args.func(args)
  File "/home/amax/miniconda3/envs/topaz/lib/python3.6/site-packages/topaz/commands/train.py", line 695, in main
    , save_prefix=save_prefix, use_cuda=use_cuda, output=output)
  File "/home/amax/miniconda3/envs/topaz/lib/python3.6/site-packages/topaz/commands/train.py", line 577, in fit_epochs
    , use_cuda=use_cuda, output=output)
  File "/home/amax/miniconda3/envs/topaz/lib/python3.6/site-packages/topaz/commands/train.py", line 552, in fit_epoch
    for X,Y in data_iterator:
  File "/home/amax/miniconda3/envs/topaz/lib/python3.6/site-packages/torch/utils/data/dataloader.py", line 521, in __next__
    data = self._next_data()
  File "/home/amax/miniconda3/envs/topaz/lib/python3.6/site-packages/torch/utils/data/dataloader.py", line 1186, in _next_data
    idx, data = self._get_data()
  File "/home/amax/miniconda3/envs/topaz/lib/python3.6/site-packages/torch/utils/data/dataloader.py", line 1152, in _get_data
    success, data = self._try_get_data()
  File "/home/amax/miniconda3/envs/topaz/lib/python3.6/site-packages/torch/utils/data/dataloader.py", line 990, in _try_get_data
    data = self._data_queue.get(timeout=timeout)
  File "/home/amax/miniconda3/envs/topaz/lib/python3.6/multiprocessing/queues.py", line 113, in get
    return _ForkingPickler.loads(res)
  File "/home/amax/miniconda3/envs/topaz/lib/python3.6/site-packages/torch/multiprocessing/reductions.py", line 289, in rebuild_storage_fd
    fd = df.detach()
  File "/home/amax/miniconda3/envs/topaz/lib/python3.6/multiprocessing/resource_sharer.py", line 57, in detach
    with _resource_sharer.get_connection(self._id) as conn:
  File "/home/amax/miniconda3/envs/topaz/lib/python3.6/multiprocessing/resource_sharer.py", line 87, in get_connection
    c = Client(address, authkey=process.current_process().authkey)
  File "/home/amax/miniconda3/envs/topaz/lib/python3.6/multiprocessing/connection.py", line 487, in Client
    c = SocketClient(address)
  File "/home/amax/miniconda3/envs/topaz/lib/python3.6/multiprocessing/connection.py", line 614, in SocketClient
    s.connect(address)
**FileNotFoundError: [Errno 2] No such file or directory**
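
For anyone reading the traceback: the last frames are in `multiprocessing.connection`, where `Client(address)` ends in `s.connect(address)`. On Linux that address is normally a Unix-domain socket path, so the "file" in the error may well be a socket that a data loader worker process was listening on, not a micrograph or a particle file. Below is a minimal standard-library sketch of that mechanism, not a reproduction of the Topaz failure. The path name `listener-gone` and the helper `connect_to_missing_socket` are made up for illustration, and the sketch assumes a platform with `AF_UNIX` sockets.

```python
import os
import tempfile
from multiprocessing.connection import Client


def connect_to_missing_socket():
    """Try to connect to a Unix-domain socket path that does not exist.

    Returns the exception that was raised, or None if none was raised.
    """
    with tempfile.TemporaryDirectory() as tmp:
        # Hypothetical socket path: nothing ever listens here, which
        # stands in for a listener process that has already gone away.
        address = os.path.join(tmp, "listener-gone")
        try:
            # A plain string address is treated as an AF_UNIX socket path,
            # so this ends in socket.connect(address), as in the traceback.
            Client(address, family="AF_UNIX")
        except FileNotFoundError as err:
            return err
    return None


err = connect_to_missing_socket()
print(type(err).__name__, err)
```

If this reading is right, the error points at inter-process communication between the training process and its data loader workers rather than at an input file. It does not by itself say why the socket was missing in the reported run.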