Open walidabualafia opened 9 months ago
I have a user who ran a very long job, which exited with a "failed to create cufft plan" error. I am not sure what is causing this issue. Most functionality and behavior are correct; this error only came up while the user was running RELION.
Does this always happen for this particular user? What happens if the user continues the failed job?
Box size: 720 px Pixel size: 0.6485 Å/px
Unless the resolution is near 1.3 Å, down-sample the particles. At this pixel size you are wasting storage and processing power.
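For context on the 1.3 Å figure: the finest resolution a sampled image can represent (the Nyquist limit) is twice the pixel size. A minimal sketch of that arithmetic follows; the function name is only illustrative and is not part of RELION.

```python
def nyquist_resolution(pixel_size_angstrom: float) -> float:
    """Return the Nyquist limit in Å for a given pixel size in Å/px."""
    # The smallest resolvable period spans two pixels.
    return 2.0 * pixel_size_angstrom


pixel_size = 0.6485  # Å/px, as reported in this issue
print(f"Nyquist limit: {nyquist_resolution(pixel_size):.3f} Å")
```

If the map is not expected to reach a resolution close to this limit, the particles are sampled more finely than the data can use.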
I have not had any other users report this issue. I also asked around, and no users have seen it either.
This user encountered the error on 7 different jobs, which do not all contain the same particles. Whenever she hit the error, her batch job was preempted and exited. I'm not sure she is able to continue running the job. She did not encounter the error when she ran her job with version 4.0.1-commit-7809a7.
Considering that the A100 has a large amount of VRAM, it is not very likely that the program ran out of memory. Nonetheless, it is worth trying down-sampled particles. I am sure the user does not need 0.6485 Å/px. With a more reasonable pixel size, the box size would be smaller, using less memory and leading to faster processing.
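As a rough illustration of how much a smaller box saves, the sketch below rescales the 720 px box to a hypothetical target pixel size of 1.0 Å/px (an assumed value, not one suggested in this thread) and compares the size of one single-precision cubic volume at each box size. This is only back-of-the-envelope arithmetic: RELION's actual GPU memory use also depends on padding, oversampling, and other buffers, and the helper names here are made up for the example.

```python
def rescaled_box(box_px: int, pixel_size: float, target_pixel_size: float) -> int:
    """Box size (rounded up to an even number) that keeps the same field of view."""
    new_box = round(box_px * pixel_size / target_pixel_size)
    return new_box + (new_box % 2)  # bump odd sizes to the next even size


def volume_bytes(box_px: int, bytes_per_voxel: int = 4) -> int:
    """Size of one cubic volume, assuming 4-byte (float32) voxels."""
    return box_px ** 3 * bytes_per_voxel


box, apix = 720, 0.6485      # values reported in this issue
target_apix = 1.0            # hypothetical target pixel size in Å/px

new_box = rescaled_box(box, apix, target_apix)
effective_apix = box * apix / new_box  # pixel size actually obtained after rescaling

print(f"rescaled box: {new_box} px at {effective_apix:.4f} Å/px")
print(f"original volume: {volume_bytes(box) / 2**30:.2f} GiB")
print(f"rescaled volume: {volume_bytes(new_box) / 2**30:.2f} GiB")
```

Because the voxel count grows with the cube of the box size, even a modest reduction in box size cuts the per-volume footprint substantially.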
I have been using the relion/5.0-beta for a while now. I have a user who ran a very long job, which exits with a "failed to create cufft plan". I am not sure what is causing this issue. Most functionality and behavior is correct, and this error just came up while the user was running relion.