Closed: martinpacesa closed this issue 3 years ago.
Here is the output of error_dump_pdf_offset:
0 8.50363e-05 6.07781e-05 4.24327e-05 7.99523e-05 6.39454e-05 0.000185883 0.000140132 0.000138112 0.000232692 0.000219286 0.000261309 0.000203861 0.000212719 0.000240111 0.000138939 0.000170298 0.000211036 0.000221938 0.000254487 0.000191029 0.000278778 0.000313391 0.000248341 0.000203394 0.000288252 0.000290926 0.00033905 0.000304973 0.000313837 0.000370047 0.000349922 0.000342475 0.000363883 0.000444821 0.000379072 0.000430447
Try Class2D on shiny particles; sometimes detector artifacts concealed during initial motion correction (e.g. hot pixels and dead lines) come back after Polish.
Box size: 384 px; pixel size: 0.68 Å/px
Although this is not the cause of your problem, working at such a small pixel size is a waste of storage and processing time unless you are expecting 1.4 Å resolution. Down-sample to a reasonable pixel size during Extraction and Polish.
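The arithmetic behind that advice can be sketched as follows. The 384 px box and 0.68 Å/px come from the post above; the 1.0 Å/px target and 256 px box are illustrative choices, not RELION defaults:

```python
# Minimal sketch of the pixel-size arithmetic behind down-sampling.
# 384 px / 0.68 Å/px are the poster's values; 1.0 Å/px is an example target.

def nyquist(pixel_size):
    """Finest resolution (Å) representable at a given pixel size (Å/px)."""
    return 2.0 * pixel_size

def rescaled_box(box, pixel_size, target_pixel_size):
    """Box size (px) covering the same field of view at a coarser pixel."""
    return round(box * pixel_size / target_pixel_size)

print(nyquist(0.68))                 # 1.36 Å: far beyond a ~3.7 Å map
print(rescaled_box(384, 0.68, 1.0))  # 261 px; in practice pick an even,
                                     # FFT-friendly size such as 256
```

In other words, at 0.68 Å/px the data can in principle describe detail down to 1.36 Å, which a 3.7 Å reconstruction never uses.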
Thank you for your answer. I apologise; I googled for a while to find a similar issue but didn't come across the FAQ. I will try both rerunning the Bayesian polishing with different values and 2D classification, and report back.
Reclassification solved the issue and improved the map to 3.3 Å! Thank you!
After this, I did another round of beam-tilt/aberration refinement, anisotropic magnification estimation, and per-particle CTF refinement. I then tried auto-refine and got the same problem: it renders the server GPUs unusable, and the machine has to be physically unplugged before it recognises them again. 2D reclassification fixes this. Is there a way to add a check for such corrupted data?
The GPU issue has nothing to do with RELION. It is more of a kernel and driver issue.
This issue persists with one particular dataset we have, even after doing 2D reclassification this time. Same error as before: the 3D auto-refinement hangs without any error message, but when I look at the processes on the command line I see that a UVM_GPU3_BH process appears, and the GPUs are no longer recognised by the system when I check nvidia-smi. I have tried this on 3 separate machines and it happens everywhere. Our current NVIDIA driver version is 450.80.02 with CUDA 11.0. I just noticed that the RELION version we run through SBGrid is 3.1.1_cu9.2 rather than the default 3.1.1; could this be causing the problem?
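Not a RELION fix, but a generic driver-level sketch that may restore the GPUs without unplugging the machine. All commands below are standard Linux/NVIDIA tools (nvidia-smi, dmesg, rmmod/modprobe); the module reload is an assumption that no surviving process is still holding the GPUs, and it needs root:

```shell
# Sketch: diagnose and reset a wedged NVIDIA UVM state (not a RELION command).
if command -v nvidia-smi >/dev/null 2>&1; then
    # If the driver is wedged, nvidia-smi itself will fail or hang.
    nvidia-smi || echo "driver not responding; trying a UVM module reload"
    # Xid messages in the kernel log usually identify the underlying fault:
    dmesg 2>/dev/null | grep -i 'xid' | tail -n 5
    # Reload the Unified Virtual Memory kernel module (requires root, and
    # that no process is still using the GPUs):
    sudo rmmod nvidia_uvm && sudo modprobe nvidia_uvm
else
    echo "nvidia-smi not found on this host"
fi
```

If the reload fails because the module is busy, the leftover UVM process has to be killed first; a full reboot remains the fallback.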
I forgot to mention: I have had this error on 3 separate datasets already.
Hello!
I am at the last step of 3D refinement of my map (170k-particle dataset, going to 3.7 Å). At this point, I have performed Bayesian polishing training with 40k particles (overkill) and got parameter values of 1.014 8130 1.92; training with 5k particles gives 1.128 7875 2.13. I proceeded with polishing using the 40k training values and then used the shiny.star output for 3D refinement, with a mask and model obtained from previous rounds of refinement.
I have tried the refinement three times already on 2 different processing machines, with either 2 GPUs / 3 MPI ranks / 4 threads or 4 GPUs / 5 MPI ranks / 10 threads, and each time the refinement gives me the following error during the 10th iteration. I have used the model and mask in previous refinements; the only differences this time are that I am using FSC solvent flattening and the shiny particles. Do you suggest rerunning the polishing with the 5k-trained values?
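For reference, the three trained values are the polishing sigmas for velocity, divergence, and acceleration. A hypothetical command-line sketch of how they would be passed to RELION 3.1's Bayesian polishing; the job paths are placeholders, and the flag names should be checked against `relion_motion_refine --help` for your build:

```shell
# Hypothetical sketch only: polishing with explicitly trained sigmas.
# All paths (job0XX/job0YY) are placeholders, not real job numbers.
mpirun -n 5 relion_motion_refine_mpi \
    --i CtfRefine/job0XX/particles_ctf_refine.star \
    --f PostProcess/job0XX/postprocess.star \
    --corr_mic MotionCorr/job0XX/corrected_micrographs.star \
    --o Polish/job0YY/ \
    --s_vel 1.014 --s_div 8130 --s_acc 1.92 \
    --combine_frames --j 10
```

Swapping in the 5k-trained values would mean `--s_vel 1.128 --s_div 7875 --s_acc 2.13` instead.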