PacificBiosciences / FALCON

FALCON: experimental PacBio diploid assembler -- Out-of-date -- Please use a binary release: https://github.com/PacificBiosciences/FALCON_unzip/wiki/Binaries

fc_run.py was killed in SLURM system occasionally #376

Open tangerzhang opened 8 years ago

tangerzhang commented 8 years ago

Hello, I was running FALCON on a SLURM system, but the main program, fc_run.py, was sometimes killed. I did not find any error message. Here is part of the output of fc_run.py:

[INFO]# of tasks in complete graph: 13
[INFO]tick: 1, #updatedTasks: 0, sleep_time=0.000000
[INFO]Running task from function task_run_consensus()
[INFO]Running task from function task_run_consensus()
[INFO](slurm) '/WORK/fafu_xtzhang_1/falcon_test/0-rawreads/preads/c_00001.sh'
[INFO]Running task from function task_run_consensus()
[INFO]Queued 'task://localhost/ct_00001' ...
[INFO]Queued 'task://localhost/ct_00002' ...
[INFO]Queued 'task://localhost/ct_00003' ...
[INFO]tick: 2, #updatedTasks: 3, sleep_time=0.000000
[INFO](slurm) '/WORK/fafu_xtzhang_1/falcon_test/0-rawreads/preads/c_00002.sh'
[INFO](slurm) '/WORK/fafu_xtzhang_1/falcon_test/0-rawreads/preads/c_00003.sh'
Submitted batch job 12724
Submitted batch job 12725
[INFO]tick: 4, #updatedTasks: 3, sleep_time=0.200000
Submitted batch job 12726
[INFO]tick: 8, #updatedTasks: 3, sleep_time=0.600000
[INFO]tick: 16, #updatedTasks: 3, sleep_time=1.000000
[INFO]tick: 32, #updatedTasks: 3, sleep_time=1.000000
[INFO]tick: 64, #updatedTasks: 3, sleep_time=1.000000
Killed 

Any suggestions? Thanks!

pb-cdunn commented 8 years ago

No idea. It's hard to know why a job is killed. You'd have to investigate at your end.

However, you can simply restart the workflow. Completed tasks will not be re-run.

If you have the latest FALCON, then you should have set use_tmpdir = true before the first run, so that partial results do not appear in your NFS directories. Then, to re-run, simply rm -rf mypwatcher/ (the state of the watcher that tracks running processes in the Grid) and run as before.
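
The restart described above could be sketched as a small shell function. This is only an illustration under assumptions: the config file name fc_run.cfg, the function name restart_workflow, and the FC_RUN override variable are hypothetical and not part of FALCON. Only removing mypwatcher/ and re-running fc_run.py come from the advice above.

```shell
# Sketch of restarting a killed FALCON run (assumed names, see note above).
# Assumes use_tmpdir = true was already set in the config before the first run.
restart_workflow() {
    run_dir=${1:-.}          # directory where fc_run.py was originally started
    cfg=${2:-fc_run.cfg}     # assumed config file name

    # Remove the stale process-watcher state left behind by the killed run.
    rm -rf "$run_dir/mypwatcher"

    # Relaunch from the same directory; completed tasks should not be re-run.
    # FC_RUN is a hypothetical hook so the launcher can be swapped out.
    (cd "$run_dir" && "${FC_RUN:-fc_run.py}" "$cfg")
}
```

It would be called from the shell as restart_workflow . fc_run.cfg, in the same directory and environment as the first run.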