Open subhrob15 opened 1 year ago
Please respect our issue template.
You didn't provide essential information (e.g. the version of RELION, the full command line, etc.), and you showed the error messages only as a screenshot instead of as text.
I am really sorry, I wasn't aware of this.
Job options:
RELION version: 3.1.3 and 4.0.0
Type of job: Bayesian Polishing
Number of MPI processes: 3, 5 or 6 (all gave the same issue)
Number of threads: 1
The command used: which relion_motion_refine_mpi
--i run_data.star --f PostProcess/job001/postprocess.star --corr_mic corrected_micrographs.star --first_frame 1 --last_frame -1 --o Polish/job016/ --params_file Polish/job002/opt_params_all_groups.txt --combine_frames --bfac_minfreq 20 --bfac_maxfreq -1 --only_do_unfinished --j 12 --pipeline_control Polish/job016
This error message only appears when the job is run with MPI processes. With the following non-MPI command the job works, but it takes a lot of time.
which relion_motion_refine
--i Refine3D/job052/run_data_inverted.star --f PostProcess/job061/postprocess.star --corr_mic MotionCorr/job028/corrected_micrographs.star --first_frame 1 --last_frame -1 --o Polish/job154/ --float16 --params_file Polish/job150/opt_params_all_groups.txt --combine_frames --bfac_minfreq 20 --bfac_maxfreq -1 --only_do_unfinished --j 9 --pipeline_control Polish/job154/
Thanks
if I run it locally, this issue does not occur
Does it go to completion albeit slower?
Yes, it does go to completion, but it takes more than a week, which is very slow considering that there are only 1200 micrographs.
My initial guess was that one or more movies had corrupted motion STAR files. But that would kill non-MPI jobs as well. So this hypothesis is not correct.
Does this happen on all datasets you process on this machine? Was the dataset motion-corrected by RELION's implementation or UCSF MotionCor2?
The dataset was motion-corrected using RELION's implementation only. I forgot to mention that the dataset was initially processed in cryoSPARC and converted to RELION format using csparc2star.py. After that, I also ran a re-extraction job in RELION to check that the coordinates are correct, and I performed the polishing using the re-extracted particles. This worked when I ran it locally (not on the cluster), and the map quality also improved. But when I ran it with MPI processes, it gave this error.
What happens if you Polish only one movie? Does it occur on any movie?
Hey, I tried it with only one movie and it did work with MPI.
Can you find the offending movie?
Sorry, I did not understand what you mean by "offending".
Because one movie was fine, probably not all movies are bad. Please find which movie(s) cause the crash.
Okay, but could you suggest a better way to go through all the movies than checking them one by one?
If there is only one problematic movie, you can use binary search. That is, split the dataset in half. If the first half runs successfully, the second half contains the bad movie; otherwise the first half does. Split the half that contains the bad movie in two and repeat the procedure.
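To make the binary search concrete, here is a minimal Python sketch. It assumes exactly one movie causes the crash. The function name `find_bad_movie` and the callback `job_succeeds` are hypothetical, not part of RELION: `job_succeeds(subset)` stands for whatever you do to run polishing on a subset of movies and check whether it finished (for example, writing a STAR file with only those micrographs and running the job by hand).

```python
def find_bad_movie(movies, job_succeeds):
    """Return the single movie that makes the job fail.

    movies       -- list of movie/micrograph names (assumed to contain exactly one bad entry)
    job_succeeds -- hypothetical callback: takes a list of movies, returns True
                    if the polishing job on that subset finished without error
    """
    if not movies:
        raise ValueError("movie list is empty")

    lo, hi = 0, len(movies)  # the bad movie is somewhere in movies[lo:hi]
    while hi - lo > 1:
        mid = (lo + hi) // 2
        if job_succeeds(movies[lo:mid]):
            lo = mid  # first half is fine, so the bad movie is in the second half
        else:
            hi = mid  # first half failed, so the bad movie is in it
    return movies[lo]


# Toy demonstration with a fake check instead of a real polishing run.
movies = [f"movie_{i:04d}.tiff" for i in range(1200)]
bad = "movie_0777.tiff"
found = find_bad_movie(movies, lambda subset: bad not in subset)
```

With 1200 micrographs this needs about 11 test runs instead of up to 1200. If more than one movie is bad, the sketch still returns one of them; remove it and repeat.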
Hey, whenever I try to run the polish part of Bayesian polishing, the job exits with an error saying incomplete third order polynomial values. I have attached a screenshot of the error. This only happens when I run it with MPI processes; if I run it locally, this issue does not occur, but then the whole job takes a lot of time. Could someone indicate how to fix this issue? Thanks