biocorecrg / MOP2

Master of Pores 2
https://biocorecrg.github.io/MOP2/docs/
MIT License

compore_polish_flow:mean_per_pos terminates with error exit status 1 #15

Open peaceben opened 2 years ago

peaceben commented 2 years ago

Hi,

I'm running MOP2 on awsbatch with a fairly minimal input setup for testing purposes. I have matching wildtype and in vitro DRS data (1 replicate each), and I defined a reference for only one gene (which is also covered by the in vitro transcribed data).

I get some output for epinano_flow, but the pipeline then terminates with the error mentioned in the title. Here's the statement from the log file:

Error executing process > 'compore_polish_flow:mean_per_pos (invitro)'

Caused by: Process compore_polish_flow:mean_per_pos (invitro) terminated with an error exit status (1)

Command executed:

mean_per_pos.py -i invitro_AIE407_pass_720e7cf3_16.fast5_event_align.tsv.gz -o `basename invitro_AIE407_pass_720e7cf3_16.fast5_event_align.tsv.gz .fast5_event_align.tsv.gz`

gzip *_processed_perpos_median.tsv

Command exit status: 1

Command output: (empty)

Command error:
thread '<unnamed>' panicked at 'no lines in the file', /github/workspace/polars/polars-io/src/csv_core/parser.rs:108:51
note: run with RUST_BACKTRACE=1 environment variable to display a backtrace
Traceback (most recent call last):
  File "/project/nextflow-bin/mean_per_pos.py", line 145, in <module>
    main()
  File "/project/nextflow-bin/mean_per_pos.py", line 127, in main
    raw_import = parse_input(a.input)
  File "/project/nextflow-bin/mean_per_pos.py", line 34, in parse_input
    "event_level_mean": pl.Float32 })
  File "/usr/local/python/versions/3.6.3/lib/python3.6/site-packages/polars/io.py", line 405, in read_csv
    rechunk=rechunk,
  File "/usr/local/python/versions/3.6.3/lib/python3.6/site-packages/polars/internals/frame.py", line 511, in read_csv
    parse_dates,
pyo3_runtime.PanicException: no lines in the file

I've also attached the command.log and .err files from the corresponding work directory, as well as the complete log file. When looking at the previous steps and the fast5_event_align.tsv.gz input file, I noticed that this file is empty.
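The empty input can be confirmed without running mean_per_pos.py itself. Below is a minimal sketch, assuming the event-align file is a gzipped TSV whose first line is a header; `has_data_rows` is a hypothetical helper and not part of MOP2, and the file names and column layout in the demo are illustrative only.

```python
import gzip
import os
import tempfile


def has_data_rows(path):
    """Return True if a gzipped TSV has at least one line after its header.

    Assumption: the first line, if any, is a header line.
    """
    with gzip.open(path, "rt") as handle:
        next(handle, None)  # skip the header line if the file has one
        return any(line.strip() for line in handle)


if __name__ == "__main__":
    # Throwaway demo files; names and columns are illustrative only.
    with tempfile.TemporaryDirectory() as tmp:
        empty = os.path.join(tmp, "empty.tsv.gz")
        filled = os.path.join(tmp, "filled.tsv.gz")
        with gzip.open(empty, "wt"):
            pass  # write nothing: mimics the empty event-align file
        with gzip.open(filled, "wt") as out:
            out.write("contig\tposition\tevent_level_mean\n")
            out.write("gene1\t10\t85.3\n")
        print(empty, has_data_rows(empty))
        print(filled, has_data_rows(filled))
```

A check like this only shows that the file reaching mean_per_pos is empty; why the upstream eventalign step produced no rows still has to be read from that step's own logs.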

Could you give me any clues as to where the error might stem from? This is the first time I'm using awsbatch, and one thing I should mention is that I haven't configured a queue with a GPU-based CE yet, even though the queue is stated in the awsbatch profile.

Any tips would help.

Best, Ben

logs.zip

lucacozzuto commented 2 years ago

Dear @peaceben, can you please try with the embedded test dataset? I also just fixed a problem in mop_mod (some processes were commented out), so can you do a git pull?

best, Luca

peaceben commented 2 years ago

Dear Luca,

I updated the repository yesterday and gave the test data a go. Preprocessing works flawlessly. mop_mod exits with the following statement:

Error executing process > 'compore_polish_flow:NANOPOLISH_EVENTALIGN:eventalignCollapse (mod)'

Caused by: Process compore_polish_flow:NANOPOLISH_EVENTALIGN:eventalignCollapse (mod) terminated with an error exit status (1)

Command executed:

zcat mod_batch_0.fast5_event_align.tsv.gz | awk '!(/^contig/ && NR>1)' | tee >(pigz -p 2 -9 - > mod_combined.eventalign.tsv.gz) | NanopolishComp Eventalign_collapse -t 2 -o mod_collapsed_align_events

Command exit status: 1

Command output: (empty)

Command error:
Checking arguments
Traceback (most recent call last):
  File "/usr/local/python/versions/3.6.3/bin/NanopolishComp", line 8, in <module>
    sys.exit(main())
  File "/usr/local/python/versions/3.6.3/lib/python3.6/site-packages/NanopolishComp/__main__.py", line 65, in main
    args.func(args)
  File "/usr/local/python/versions/3.6.3/lib/python3.6/site-packages/NanopolishComp/__main__.py", line 80, in Eventalign_collapse_main
    quiet = args.quiet)
  File "/usr/local/python/versions/3.6.3/lib/python3.6/site-packages/NanopolishComp/Eventaligncollapse.py", line 106, in __init__
    raise ValueError ("At least 3 threads required")
ValueError: At least 3 threads required

Additionally, bedGraphToWig_msc (mod---wt) and bedGraphToWig_lsc (mod---wt) each fail for 2 of their 4 tasks, but the errors are ignored. I think this is because there is no data on the negative strand.

I've attached the logs again.

Best, Ben

logs_2.zip

lucacozzuto commented 2 years ago

Hi, this is because NanopolishComp Eventalign_collapse (the preprocessing step for nanocompore) needs at least 3 threads, and it was called with -t 2:

ValueError: At least 3 threads required

I increased the number of CPUs in awsbatch.config.

Can you give it another try after a git pull?
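The thread requirement can also be checked before launching a run. The following is only a sketch: the minimum of 3 is taken from the error message above, while `pick_threads` and `MIN_THREADS` are hypothetical names and not part of MOP2 or NanopolishComp.

```python
import os

# Taken from the "At least 3 threads required" message in the traceback above.
MIN_THREADS = 3


def pick_threads(requested, available=None):
    """Return a thread count that satisfies the minimum, or raise ValueError.

    `available` defaults to the CPU count reported by the operating system.
    """
    if available is None:
        available = os.cpu_count() or 1
    if available < MIN_THREADS:
        raise ValueError(
            f"need at least {MIN_THREADS} CPUs, only {available} available"
        )
    # Raise the request to the minimum, but never exceed what the host has.
    return min(max(requested, MIN_THREADS), available)


if __name__ == "__main__":
    # A request of 2 threads (as in the failing command) on an 8-CPU host.
    print(pick_threads(2, available=8))
```

The value returned here corresponds to what would be passed to the -t option; in the pipeline itself that number comes from the CPUs assigned to the process in the profile's config file.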

peaceben commented 2 years ago

Hi Luca,

After the git pull, mop_mod finishes successfully, but without Tombo results (and without output_book_mod):

[9d/cd437b] process > checkRef (Checking yeast_rRNA_ref.fa.gz) [100%] 1 of 1 ✔
[7c/4823df] process > epinano_flow:splitReference (Splitting of reference.fa) [100%] 1 of 1 ✔
[8e/7069a5] process > epinano_flow:splitBams (Splitting of mod_s.bam on pieces00.fa) [100%] 2 of 2 ✔
[54/ac7ff3] process > epinano_flow:indexReference (Indexing pieces00.fa) [100%] 1 of 1 ✔
[6e/3639c3] process > epinano_flow:EPINANO_CALC_VAR_FREQUENCIES (mod_pieces00_s.bam on mod) [100%] 2 of 2 ✔
[fb/159e11] process > epinano_flow:joinEpinanoRes (joining on mod) [100%] 2 of 2 ✔
[d2/6edf2b] process > epinano_flow:makeEpinanoPlots_ins (mod--wt ins) [100%] 1 of 1 ✔
[eb/545189] process > epinano_flow:makeEpinanoPlots_mis (mod--wt mis) [100%] 1 of 1 ✔
[9c/0e7b3e] process > epinano_flow:makeEpinanoPlots_del (mod--wt del) [100%] 1 of 1 ✔
[95/1661e7] process > compore_polish_flow:getChromInfo (reference.fa) [100%] 1 of 1 ✔
[fe/169b5b] process > compore_polish_flow:NANOPOLISH_EVENTALIGN:index (mod) [100%] 2 of 2 ✔
[bf/08e8a6] process > compore_polish_flow:NANOPOLISH_EVENTALIGN:eventalign (wt--batch_0.fast5) [100%] 2 of 2 ✔
[12/93cc9d] process > compore_polish_flow:NANOPOLISH_EVENTALIGN:eventalignCollapse (wt) [100%] 2 of 2 ✔
[0d/7d92d5] process > compore_polish_flow:mean_per_pos (wt) [100%] 2 of 2 ✔
[4a/fce05e] process > compore_polish_flow:concat_mean_per_pos (mod on chrom.1.sizes) [100%] 2 of 2 ✔
[1a/5c95c3] process > compore_polish_flow:concat_csv_files (mod) [100%] 2 of 2 ✔
[1c/87df9d] process > compore_polish_flow:NANOCOMPORE_SAMPLE_COMPARE:sampleCompare (mod vs wt) [100%] 1 of 1 ✔
[6a/774974] process > tombo_common_flow:multiToSingleFast5 (mod___batch_0) [100%] 2 of 2 ✔
[09/8671a2] process > tombo_common_flow:TOMBO_RESQUIGGLE_RNA:resquigglerna (modbatch_0) [100%] 2 of 2 ✔
[9f/f32c49] process > getChromInfo (reference.fa) [100%] 1 of 1 ✔
[99/dd4f61] process > tombo_msc_flow:TOMBO_GET_MODIFICATION_MSC:getModificationsWithModelSampleCompare (mod vs wt) [100%] 1 of 1 ✔
[23/e796b4] process > bedGraphToWig_msc (mod---wt) [100%] 4 of 4, failed: 2 ✔
[1a/802ccb] process > tombo_lsc_flow:TOMBO_GET_MODIFICATION_LSC:getModificationsWithLevelSampleCompare (mod vs wt) [100%] 1 of 1 ✔
[7f/c93b3f] process > bedGraphToWig_lsc (mod---wt) [100%] 4 of 4, failed: 2 ✔
[73/379c0c] process > wigToBigWig (mod---wt_lsc) [100%] 4 of 4 ✔
[- ] process > mergeTomboWigsPlus -
[- ] process > mergeTomboWigsMinus -
[5c/f627b7] process > EPINANO_VER:getVersion [100%] 1 of 1 ✔
[69/843bca] process > NANOPOLISH_VER:getVersion [100%] 1 of 1 ✔
[b4/f6121c] process > NANOCOMPORE_VER:getVersion [100%] 1 of 1 ✔
[42/18cc4f] process > TOMBO_VER:getVersion [100%] 1 of 1 ✔
Completed at: 29-März-2022 18:02:19
Duration : 14m 17s
CPU hours : 0.5 (1,7% failed)
Succeeded : 45
Ignored : 4
Failed : 4

Best, Ben

lucacozzuto commented 2 years ago

Dear Ben, I tried mop_mod on AWS and did get an error, but with nanopolish indexing. I'm trying to understand where it comes from; it seems to be AWS-related in some way. For the moment I recommend not using mop_mod on AWS. Sorry about this, I'll try to figure out what is happening.

peaceben commented 2 years ago

Hey Luca,

Thanks for checking on this. I ran the pipeline again on the test data, this time in local mode, and got the following output (similar to what I noted above for the test data on awsbatch).

executor > local (51)
[18/70e124] process > checkRef (Checking yeast_rRNA_ref.fa.gz) [100%] 1 of 1 ✔
[dd/a950b2] process > epinano_flow:splitReference (Splitting of reference.fa) [100%] 1 of 1 ✔
[5b/9de7fc] process > epinano_flow:splitBams (Splitting of mod_s.bam on pieces00.fa) [100%] 2 of 2 ✔
[fc/31a067] process > epinano_flow:indexReference (Indexing pieces00.fa) [100%] 1 of 1 ✔
[9b/dc81bd] process > epinano_flow:EPINANO_CALC_VAR_FREQUENCIES (mod_pieces00_s.bam on mod) [100%] 2 of 2 ✔
[5f/3e5660] process > epinano_flow:joinEpinanoRes (joining on wt) [100%] 2 of 2 ✔
[78/d171be] process > epinano_flow:makeEpinanoPlots_ins (mod--wt ins) [100%] 1 of 1 ✔
[1b/ddcf68] process > epinano_flow:makeEpinanoPlots_mis (mod--wt mis) [100%] 1 of 1 ✔
[b1/ba08c2] process > epinano_flow:makeEpinanoPlots_del (mod--wt del) [100%] 1 of 1 ✔
[67/75a61f] process > compore_polish_flow:getChromInfo (reference.fa) [100%] 1 of 1 ✔
[91/1f71b7] process > compore_polish_flow:NANOPOLISH_EVENTALIGN:index (wt) [100%] 2 of 2 ✔
[9a/f76ec7] process > compore_polish_flow:NANOPOLISH_EVENTALIGN:eventalign (wt--batch_0.fast5) [100%] 2 of 2 ✔
[54/b921f9] process > compore_polish_flow:NANOPOLISH_EVENTALIGN:eventalignCollapse (mod) [100%] 2 of 2 ✔
[1b/d3b16c] process > compore_polish_flow:mean_per_pos (wt) [100%] 2 of 2 ✔
[f0/c8b4c2] process > compore_polish_flow:concat_mean_per_pos (wt on chrom.1.sizes) [100%] 2 of 2 ✔
[6a/9ba0fb] process > compore_polish_flow:concat_csv_files (wt) [100%] 2 of 2 ✔
[3d/2714e7] process > compore_polish_flow:NANOCOMPORE_SAMPLE_COMPARE:sampleCompare (mod vs wt) [100%] 1 of 1 ✔
[b2/ca3b1e] process > tombo_common_flow:multiToSingleFast5 (wt___batch_0) [100%] 2 of 2 ✔
[09/3aac9d] process > tombo_common_flow:TOMBO_RESQUIGGLE_RNA:resquigglerna (wtbatch_0) [100%] 2 of 2 ✔
[fa/2a2051] process > getChromInfo (reference.fa) [100%] 1 of 1 ✔
[2f/330e29] process > tombo_msc_flow:TOMBO_GET_MODIFICATION_MSC:getModificationsWithModelSampleCompare (mod vs wt) [100%] 1 of 1 ✔
[8e/233345] process > bedGraphToWig_msc (mod---wt) [100%] 4 of 4, failed: 2 ✔
[ca/8b0651] process > tombo_lsc_flow:TOMBO_GET_MODIFICATION_LSC:getModificationsWithLevelSampleCompare (mod vs wt) [100%] 1 of 1 ✔
[6c/ceb97b] process > bedGraphToWig_lsc (mod---wt) [100%] 4 of 4, failed: 2 ✔
[3c/eeb70f] process > wigToBigWig (mod---wt_msc) [100%] 4 of 4 ✔
[15/6dc317] process > mergeTomboWigsPlus (mod---wt_msc) [100%] 2 of 2 ✔
[- ] process > mergeTomboWigsMinus -
[a5/9a9246] process > EPINANO_VER:getVersion [100%] 1 of 1 ✔
[98/b7c6e4] process > NANOPOLISH_VER:getVersion [100%] 1 of 1 ✔
[66/517954] process > NANOCOMPORE_VER:getVersion [100%] 1 of 1 ✔
[38/2970f0] process > TOMBO_VER:getVersion [100%] 1 of 1 ✔
Completed at: 31-März-2022 11:17:33
Duration : 5m 35s
CPU hours : 0.5 (0,5% failed)
Succeeded : 47
Ignored : 4
Failed : 4

Are the ignored/failed Tombo processes expected for the test data? I wanted to ask before running the pipeline on my own data locally.
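For longer summaries, the processes that report failed tasks can be pulled out of the progress output with a small script. This is a rough sketch that assumes the line layout shown in the logs above; `failed_processes` and `LINE_RE` are hypothetical names, and the pattern may not cover every form of Nextflow progress line.

```python
import re

# Matches entries such as:
#   process > bedGraphToWig_msc (mod---wt) [100%] 4 of 4, failed: 2
# The tag in parentheses and the "failed" part are both optional.
LINE_RE = re.compile(
    r"process > (?P<name>\S+)(?: \((?P<tag>.*?)\))? "
    r"\[\s*\d+%\] (?P<done>\d+) of (?P<total>\d+)"
    r"(?:, failed: (?P<failed>\d+))?"
)


def failed_processes(log_text):
    """Return {process name: failed task count} for entries reporting failures."""
    result = {}
    for match in LINE_RE.finditer(log_text):
        if match.group("failed"):
            result[match.group("name")] = int(match.group("failed"))
    return result


if __name__ == "__main__":
    sample = (
        "[8e/233345] process > bedGraphToWig_msc (mod---wt) [100%] 4 of 4, failed: 2 ✔\n"
        "[6c/ceb97b] process > bedGraphToWig_lsc (mod---wt) [100%] 4 of 4, failed: 2 ✔\n"
        "[3c/eeb70f] process > wigToBigWig (mod---wt_msc) [100%] 4 of 4 ✔\n"
        "[- ] process > mergeTomboWigsMinus -\n"
    )
    print(failed_processes(sample))
```

Applied to the two summaries in this thread, only bedGraphToWig_msc and bedGraphToWig_lsc would be reported, each with 2 failed tasks.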

Best, Ben

lucacozzuto commented 2 years ago

Hi, yes, some of them fail because they are on the wrong strand, so the files are empty. For local runs I think you need to raise the number of CPUs to 3, since the Eventalign_collapse step needs at least 3 threads to work.

lucacozzuto commented 2 years ago

Hi, I fixed the problem of nanopolish / nanocompore but not the Tombo one. I tried several different things but I'm quite stuck here. I'll keep this open.

peaceben commented 2 years ago

Hey Luca,

cheers! I'll git pull and check with my data again.