Open DerLorenz opened 1 year ago
Hey Lorenz, can you try this again with the latest version? Also, can you share the analyze_landscape_full command you were using when you got this error?
Hey Michal,
Thanks for the response. I will try it with the newest version by tomorrow and post an update here. I submitted using the following parameters (according to the .out file):
(INFO) (analyze_landscape_full.py) (29-Aug-23 12:21:33) Loaded configuration: {'cmd': ['/path/to/Anaconda3/envs/cryodrgn_v23/bin/cryodrgn', 'train_vae', 'particles.128.ft.txt', '--preprocessed', '--poses', 'pose.pkl', '--ctf', 'ctf.pkl', '--zdim', '8', '-n', '50', '-o', 'vae_128_8', '--multigpu'],
I submitted the job to our HPC cluster using SLURM, allocating a single core and a single GPU with more than enough memory. Could --multigpu be causing the issue? I did not specify it, though, and I do not even see this option when checking
cryodrgn analyze_landscape --help
Thanks for looking into this!
Hi Michal,
Sorry for being unresponsive for so long. I redid the analysis using cryoDRGN 3.x. I submitted the following command to our HPC cluster using SLURM:
cryodrgn analyze_landscape_full vae_128_8 49 --landscape-dir landscape_masked.49 -o landscape_masked.49/landscape_full
The following batch parameters were set:
sbatch -p g --mem=35G --gres=gpu:1 --time=08:00:00
After generating the volume embeddings, the job failed with the following error:
Traceback (most recent call last):
  File "/groups/haselbach/software/Anaconda3/envs/cryodrgn_V3x/bin/cryodrgn", line 8, in <module>
    sys.exit(main())
  File "/groups/haselbach/software/Anaconda3/envs/cryodrgn_V3x/lib/python3.9/site-packages/cryodrgn/__main__.py", line 72, in main
    args.func(args)
  File "/groups/haselbach/software/Anaconda3/envs/cryodrgn_V3x/lib/python3.9/site-packages/cryodrgn/commands/analyze_landscape_full.py", line 333, in main
    embeddings_all = train_model(z, embeddings, outdir, zfile, args)
  File "/groups/haselbach/software/Anaconda3/envs/cryodrgn_V3x/lib/python3.9/site-packages/cryodrgn/commands/analyze_landscape_full.py", line 268, in train_model
    train(args, model, device, train_loader, optimizer, epoch)
  File "/groups/haselbach/software/Anaconda3/envs/cryodrgn_V3x/lib/python3.9/site-packages/cryodrgn/commands/analyze_landscape_full.py", line 114, in train
    for batch_idx, (data, target) in enumerate(train_loader):
  File "/groups/haselbach/software/Anaconda3/envs/cryodrgn_V3x/lib/python3.9/site-packages/torch/utils/data/dataloader.py", line 438, in __iter__
    return self._get_iterator()
  File "/groups/haselbach/software/Anaconda3/envs/cryodrgn_V3x/lib/python3.9/site-packages/torch/utils/data/dataloader.py", line 386, in _get_iterator
    return _MultiProcessingDataLoaderIter(self)
  File "/groups/haselbach/software/Anaconda3/envs/cryodrgn_V3x/lib/python3.9/site-packages/torch/utils/data/dataloader.py", line 1084, in __init__
    self._reset(loader, first_iter=True)
  File "/groups/haselbach/software/Anaconda3/envs/cryodrgn_V3x/lib/python3.9/site-packages/torch/utils/data/dataloader.py", line 1117, in _reset
    self._try_put_index()
  File "/groups/haselbach/software/Anaconda3/envs/cryodrgn_V3x/lib/python3.9/site-packages/torch/utils/data/dataloader.py", line 1351, in _try_put_index
    index = self._next_index()
  File "/groups/haselbach/software/Anaconda3/envs/cryodrgn_V3x/lib/python3.9/site-packages/torch/utils/data/dataloader.py", line 620, in _next_index
    return next(self._sampler_iter)  # may raise StopIteration
  File "/groups/haselbach/software/Anaconda3/envs/cryodrgn_V3x/lib/python3.9/site-packages/torch/utils/data/sampler.py", line 282, in __iter__
    for idx in self.sampler:
  File "/groups/haselbach/software/Anaconda3/envs/cryodrgn_V3x/lib/python3.9/site-packages/torch/utils/data/sampler.py", line 164, in __iter__
    yield from torch.randperm(n, generator=generator).tolist()
RuntimeError: Expected a 'cuda' device type for generator but found 'cpu'
Again, PyTorch seems to be working fine.
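For context, a commonly reported trigger for this RuntimeError in PyTorch is a shuffling DataLoader that builds its random permutation with a CPU generator while new tensors default to a CUDA device. Whether that is what happens inside analyze_landscape_full is an assumption and has not been confirmed in this thread. The sketch below shows the usual workaround on toy data: create the shuffling generator on the same device that new tensors default to. The helper name make_shuffled_loader and the toy tensors are hypothetical and are not part of cryoDRGN.

```python
import torch
from torch.utils.data import DataLoader, TensorDataset


def make_shuffled_loader(dataset, batch_size):
    """Hypothetical helper: shuffling DataLoader whose generator matches
    the device on which PyTorch creates new tensors by default."""
    # "cpu" unless the default tensor type/device was switched to CUDA.
    default_device = torch.empty(0).device
    # RandomSampler calls torch.randperm(n, generator=...); the generator
    # must live on the same device type as the tensor randperm creates.
    generator = torch.Generator(device=default_device)
    return DataLoader(
        dataset, batch_size=batch_size, shuffle=True, generator=generator
    )


# Toy data standing in for (embedding, target) pairs.
data = torch.arange(10, dtype=torch.float32).unsqueeze(1)
target = torch.arange(10)
loader = make_shuffled_loader(TensorDataset(data, target), batch_size=4)

# Every sample should be visited exactly once per epoch, in shuffled order.
seen = sorted(int(t) for _, batch_target in loader for t in batch_target)
```

On a machine where the default device is the CPU this behaves like an ordinary shuffled loader, so the same code can be used in both settings.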
Thanks for the update, let me try to replicate this error and I'll get back to you!
Just asking if there is some update here. I still would like to try this tool.
I haven't been able to replicate this error yet, but we are presently working on a new refactored version of this tool, which will hopefully help us resolve issues such as this. We should have a further update by the end of the month!
Any updates?
Still haven't seen this error on our side — have you tried again with the latest version (v3.4.2)?
Hi,
I am using cryodrgn v23 on an HPC cluster. So far, I have run the complete 'standard' cryodrgn pipeline successfully, as many times before. With my new dataset I was eager to try the landscape analysis. Everything works nicely until I try to run
analyze_landscape_full
After volume generation I get the following error: RuntimeError: Expected a 'cuda' device type for generator but found 'cpu'
I checked my PyTorch installation and everything seems fine. Also, the previous cryodrgn jobs I ran used CUDA/GPUs without any issue. I am not sure what could be causing this, and any help would be very much appreciated.
Best, Lorenz
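The output of the PyTorch check mentioned above is not preserved in this thread. For anyone trying to reproduce the problem, a minimal sketch of a comparable check follows. It reports whether CUDA is visible and on which devices new tensors and the default random generator live, since a mismatch between those two devices is one plausible source of the error. The function name describe_torch_env is hypothetical.

```python
import importlib.util


def describe_torch_env():
    """Hypothetical helper: collect facts relevant to a generator/device
    mismatch. Returns a dict; only reports installation status if the
    torch package cannot be found."""
    info = {"torch_installed": importlib.util.find_spec("torch") is not None}
    if not info["torch_installed"]:
        return info

    import torch

    info["torch_version"] = torch.__version__
    info["cuda_available"] = torch.cuda.is_available()
    # Device on which freshly created tensors are placed by default.
    info["default_tensor_device"] = str(torch.empty(0).device)
    # Device of the global default random number generator.
    info["default_generator_device"] = str(torch.default_generator.device)
    return info


if __name__ == "__main__":
    for key, value in describe_torch_env().items():
        print(f"{key}: {value}")
```

Running this inside the same SLURM allocation as the failing job, rather than on a login node, gives the most relevant picture, because GPU visibility can differ between the two.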