thuem / THUNDER

A particle-filter framework for robust cryoEM 3D reconstruction
GNU General Public License v2.0

How to continue a crashed run #8

Open papaig opened 5 years ago

papaig commented 5 years ago

Dear developers, I'm running THUNDER on a CPU cluster. Despite using 8 nodes with 16 cores each, my run didn't finish in 14 days. Unfortunately, 14 days is the time limit for a run on this cluster, so the run was cancelled. I would like to continue it and I wonder how I should do that. If I put the last .thu file in the JSON as ".thu File Storing Paths and CTFs of Images", THUNDER seems to restart the run from the beginning. I chose a new output folder so as not to overwrite the files from the previous, crashed run. I would be grateful if you could tell me how to continue the run from where it crashed. Thank you, Gabor

thuem commented 5 years ago

I am so sorry for the delay; the e-mail system blocked the notification.

There are two situations.

First situation: the run ended during global search. In this case, just put the last .thu file as ".thu File Storing Paths and CTFs of Images", and change the initial model and initial resolution to the reference and resolution you achieved in the last round of the previous run, respectively. It should then work.

Second situation: the run ended after global search. In this case, in addition to the actions in the first situation, the "Global Search" option should be changed from true to false.
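The two situations above can be sketched as a small Python helper that edits a copy of the parameter JSON. This is only an illustration, not part of THUNDER: the option names ".thu File Storing Paths and CTFs of Images" and "Global Search" are the ones quoted in this thread, while the key names for the initial model and initial resolution, the "Basic" section and all file names are assumptions, so check them against your own JSON file.

```python
import copy

THU_KEY = ".thu File Storing Paths and CTFs of Images"  # option name quoted in this thread
GLOBAL_SEARCH_KEY = "Global Search"                      # option name quoted in this thread
MODEL_KEY = "Initial Model"            # ASSUMED key name; check your own JSON
RESOLUTION_KEY = "Initial Resolution"  # ASSUMED key name; check your own JSON


def set_option(config, key, value):
    """Set `key` wherever it occurs in a possibly nested dict; return the number of hits."""
    hits = 0
    for k, v in config.items():
        if k == key:
            config[k] = value
            hits += 1
        elif isinstance(v, dict):
            hits += set_option(v, key, value)
    return hits


def make_resume_config(config, last_thu, last_model, last_resolution, after_global_search):
    """Return a modified copy of `config` that continues from the previous run."""
    new = copy.deepcopy(config)
    updates = {
        THU_KEY: last_thu,
        MODEL_KEY: last_model,
        RESOLUTION_KEY: last_resolution,
    }
    if after_global_search:
        # Second situation: also switch global search off.
        updates[GLOBAL_SEARCH_KEY] = False
    for key, value in updates.items():
        if set_option(new, key, value) == 0:
            raise KeyError(f"option not found in config: {key!r}")
    return new


# Toy config with placeholder values; a real THUNDER JSON has many more options.
old = {
    "Basic": {
        THU_KEY: "particles.thu",
        MODEL_KEY: "initial_model.mrc",
        RESOLUTION_KEY: 40,
        GLOBAL_SEARCH_KEY: True,
    }
}
resumed = make_resume_config(
    old,
    last_thu="previous_run/last_round.thu",          # hypothetical path
    last_model="previous_run/last_reference.mrc",    # hypothetical path
    last_resolution=8.5,                             # hypothetical value
    after_global_search=True,
)
```

The helper raises a `KeyError` when an option is missing instead of silently adding it, so a wrong key name is noticed before the job is submitted. Writing the result out with `json.dump` to a new file keeps the JSON of the crashed run untouched.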

Moreover, THUNDER uses cluster resources differently from RELION. If you are using 8 nodes with 16 cores each, it is better to run 1 process with 16 threads on each node. Also, some job management systems, such as LSF, restrict the number of physical cores assigned to each process, so it is important to check the configuration of the job management system and make sure that each node runs one THUNDER process and that this process can use all of the node's CPU cores through threading. If this does not accelerate your job, please contact us and send us your job information, such as the number of images, box size and symmetry. We will compare it with our benchmark.
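As a rough illustration of the layout described above, the following Python sketch computes the process and thread counts for a given cluster allocation and reports how many cores the current process is actually allowed to use. It is not part of THUNDER; the function names are made up for this example, and `os.sched_getaffinity` is only available on Linux, so the sketch falls back to `os.cpu_count()` elsewhere.

```python
import os


def plan_layout(nodes, cores_per_node):
    """One process per node, with one thread per core on that node."""
    return {
        "processes": nodes,
        "processes_per_node": 1,
        "threads_per_process": cores_per_node,
    }


def usable_cores():
    """Number of cores this process may run on, as limited by the job manager."""
    if hasattr(os, "sched_getaffinity"):
        # Linux: size of the CPU affinity set of the current process.
        return len(os.sched_getaffinity(0))
    # Other platforms: total core count, which ignores affinity limits.
    return os.cpu_count() or 1


layout = plan_layout(nodes=8, cores_per_node=16)
print(layout)
print("cores usable by this process:", usable_cores())
```

Running the `usable_cores()` check inside a job submitted with the same settings as the THUNDER job shows whether the job manager pins each process to fewer cores than the planned thread count. If it reports fewer than 16 on a 16-core node, the threads would share a subset of the cores.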

Best regards.