PacificBiosciences / FALCON

FALCON: experimental PacBio diploid assembler -- Out-of-date -- Please use a binary release: https://github.com/PacificBiosciences/FALCON_unzip/wiki/Binaries

KeyError during falcon_kit.stats_preassembly job #483

Open yingzhang121 opened 7 years ago

yingzhang121 commented 7 years ago

Hi, Developer,

I had a very weird error.

When I ran Falcon on a set of 8 SMRT cells, the job finished without problems. However, when I ran Falcon on 25 SMRT cells (including the previous 8), the job was killed after the daligner step.

So I looked into the stderr and stdout files and found the following error message:

… … … /panfs/roc/scratch/kianians/falcon_unzip_test/0-rawreads/preads/cns_00102/cns_00102.fasta /panfs/roc/scratch/kianians/falcon_unzip_test/0-rawreads/preads/cns_00103/cns_00103.fasta /panfs/roc/scratch/kianians/falcon_unzip_test/0-rawreads/preads/cns_00104/cns_00104.fasta /panfs/roc/scratch/kianians/falcon_unzip_test/0-rawreads/preads/cns_00105/cns_00105.fasta >
[4077]$ DBdump -h /panfs/roc/scratch/kianians/falcon_unzip_test/0-rawreads/raw_reads.db >
ERROR:falcon_kit.stats_preassembly:Using arbitrary truncation metric: -1.0
Traceback (most recent call last):
  File "/panfs/roc/itascasoft/pacificbiosciences-falcon/FALCON-integrate/0.7/FALCON-integrate/FALCON/falcon_kit/stats_preassembly.py", line 210, in calc_dict
    trunc = metric_truncation(i_raw_reads_db_fn, i_preads_fofn_fn)
  File "/panfs/roc/itascasoft/pacificbiosciences-falcon/FALCON-integrate/0.7/FALCON-integrate/FALCON/falcon_kit/stats_preassembly.py", line 133, in metric_truncation
    return functional.calc_metric_truncation(dbdump_output, length_pairs_output)
  File "/panfs/roc/itascasoft/pacificbiosciences-falcon/FALCON-integrate/0.7/FALCON-integrate/FALCON/falcon_kit/functional.py", line 307, in calc_metric_truncation
    avg = -average_difference(pread_lengths, orig_lengths)
  File "/panfs/roc/itascasoft/pacificbiosciences-falcon/FALCON-integrate/0.7/FALCON-integrate/FALCON/falcon_kit/functional.py", line 292, in average_difference
    vb = dictB[k]
KeyError: 0
[4077]$ DBdump -h /panfs/roc/scratch/kianians/falcon_unzip_test/0-rawreads/raw_reads.db >
INFO:falcon_kit.stats_preassembly:stats for raw reads: FastaStats(nreads=3186311, total=20917694606, n50=8450, p95=13368)
INFO:falcon_kit.stats_preassembly:stats for seed reads: FastaStats(nreads=823088, total=9600369423, n50=11212, p95=17825)
... ...

Then I went back to check the job log from my cluster. It turned out that the preassembly report job ran for only 1 min 51 seconds.

PBS Job Id: 2678307.mesabim3.msi.umn.edu
Job Name: Jb73e8dacb75e47
Exec host: cn0659/6
Aborted by PBS Server
Job exceeded its walltime limit. Job was aborted
See Administrator for help
Exit_status=-11
resources_used.cput=00:00:27
resources_used.energy_used=0
resources_used.mem=80252kb
resources_used.vmem=955868kb
resources_used.walltime=00:01:51

My colleagues and I guess there might be a "clock" set in the Falcon script that only requests 1 minute for the report job, while the "preassembly report" job needs more time when there are more SMRT cells. But we could be wrong.

Could you look into this issue?

Best, Ying

pb-cdunn commented 7 years ago

> My colleagues and I guess there might be a "clock" set in the Falcon script that only requests 1 minute for the report job.

Not specific to that step, but a recent change makes all steps run in the job queue. No work is done in the main process anymore. And sge_option is the default for every step that lacks a specific setting such as sge_option_da. So maybe you need to set sge_option.
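The fallback described above can be pictured with a small Python sketch. This is not FALCON's own code: the `[General]` section name, the `report` step suffix, and the walltime strings are assumptions chosen for illustration, and `resolve_sge_option` is a hypothetical helper. Only `sge_option` and `sge_option_da` come from this thread.

```python
import configparser

# Hypothetical fc_run.cfg fragment. The option strings are illustrative
# values, not recommendations for any particular cluster.
EXAMPLE_CFG = """
[General]
sge_option = -l walltime=04:00:00
sge_option_da = -l walltime=24:00:00
"""


def resolve_sge_option(cfg_text, step_suffix):
    """Return the step-specific option if set, else the generic sge_option."""
    parser = configparser.ConfigParser()
    parser.read_string(cfg_text)
    general = parser["General"]
    default = general.get("sge_option", fallback="")
    # A step without its own sge_option_<suffix> inherits the default.
    return general.get("sge_option_" + step_suffix, fallback=default)


# "da" has its own setting; the hypothetical "report" step does not.
print(resolve_sge_option(EXAMPLE_CFG, "da"))
print(resolve_sge_option(EXAMPLE_CFG, "report"))
```

Under this reading, a step with no dedicated setting would be submitted with whatever limits `sge_option` carries, which may matter on a scheduler that enforces a walltime.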

Check your logs. You should be able to find the qsub command for the report, and you should be able to submit it yourself and see whether you can repeat this problem.
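One way to locate the submission command is a plain substring search over the log. The sketch below assumes nothing about the real log format beyond the word `qsub` appearing on the relevant line; the sample lines are made up, and `find_qsub_lines` is a hypothetical helper.

```python
def find_qsub_lines(log_lines):
    """Return every log line that mentions qsub, stripped of whitespace."""
    return [line.strip() for line in log_lines if "qsub" in line]


# Made-up log lines for illustration; a real log will look different.
sample_log = [
    "INFO: starting task\n",
    "INFO: submitting: qsub -N example_job run.sh\n",
    "INFO: done\n",
]

for line in find_qsub_lines(sample_log):
    print(line)
```

To search an actual file, the same function can be given an open file object, since iterating over it yields lines.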

yingzhang121 commented 7 years ago

This makes sense, because the major difference between SGE and Torque is how job time limits are handled. I'll set sge_option in the cfg file.

yingzhang121 commented 7 years ago

OK, after I added sge_option to the fc_run.cfg file, the entire process got stuck at the "preassembly report" step. I had to kill the falcon process manually and push the preassembly report through before I could continue to the next step.

So the issue seems to be more complicated than the sge_option setting.

pb-cdunn commented 7 years ago

Is this on the latest FALCON-integrate?

It should be possible to find error output that explains the problem you have in the report, maybe in pwatcher.dir/stderr in the report's run directory. It's difficult to diagnose from here.

The KeyError in the report happens when there are no preads. I have no idea what could cause it to hang. Torque job-submission problems?
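The traceback earlier in this thread ends at `vb = dictB[k]` inside `average_difference(pread_lengths, orig_lengths)`. The sketch below is a reconstruction for illustration, not the falcon_kit source: it assumes the function walks the keys of its first argument and looks each one up in the second, which is enough to reproduce a `KeyError: 0`. The `missing_keys` helper is hypothetical and only shows how one might list the ids that fail the lookup. It does not explain why a mapping ends up incomplete.

```python
def average_difference(dictA, dictB):
    """Average of dictA[k] - dictB[k] over the keys of dictA.

    Assumed behaviour: raises KeyError when a key of dictA is absent
    from dictB, as in the traceback above.
    """
    total = 0.0
    for k, va in dictA.items():
        vb = dictB[k]  # the lookup named in the traceback
        total += va - vb
    return total / len(dictA)


def missing_keys(dictA, dictB):
    """Hypothetical diagnostic: keys of dictA that dictB lacks."""
    return sorted(k for k in dictA if k not in dictB)


# Toy lengths keyed by read id; the numbers are invented.
pread_lengths = {0: 90, 1: 80}
orig_lengths = {0: 100, 1: 100}
print(average_difference(pread_lengths, orig_lengths))

# With an empty second mapping, the first lookup fails on key 0.
try:
    average_difference(pread_lengths, {})
except KeyError as err:
    print("KeyError:", err)
print(missing_keys(pread_lengths, {}))
```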