morispi / CONSENT

Scalable long read self-correction and assembly polishing with multiple sequence alignment
https://doi.org/10.1038/s41598-020-80757-5
GNU Affero General Public License v3.0

terminate called after throwing an instance of 'std::out_of_range' #7

Closed: novikk closed this issue 5 years ago

novikk commented 5 years ago

Getting this error on the "correcting the long reads" step:

[M::mm_idx_gen::22.928*1.52] collected minimizers
[M::mm_idx_gen::25.269*2.72] sorted minimizers
[M::main::25.270*2.72] loaded/built the index for 456779 target sequence(s)
[M::mm_mapopt_update::28.654*2.52] mid_occ = 283
[M::mm_idx_stat] kmer size: 15; skip: 5; is_hpc: 0; #seq: 456779
[M::mm_idx_stat::30.533*2.43] distinct minimizers: 73328105 (64.06% are singletons); average occurrences: 2.303; average spacing: 2.960
[M::worker_pipeline::294.408*7.22] mapped 456772 sequences
[M::worker_pipeline::458.370*4.73] mapped 293826 sequences
[M::mm_idx_gen::474.209*4.63] collected minimizers
[M::mm_idx_gen::475.271*4.67] sorted minimizers
[M::main::475.272*4.67] loaded/built the index for 293819 target sequence(s)
[M::mm_mapopt_update::475.272*4.67] mid_occ = 283
[M::mm_idx_stat] kmer size: 15; skip: 5; is_hpc: 0; #seq: 293819
[M::mm_idx_stat::475.851*4.67] distinct minimizers: 48724457 (70.21% are singletons); average occurrences: 2.138; average spacing: 2.960
[M::worker_pipeline::749.425*5.49] mapped 456772 sequences
[M::worker_pipeline::926.878*4.49] mapped 293826 sequences
[M::main] Version: 2.14-r894-dirty
[M::main] CMD: /genomics/users/irubia/tools/CONSENT/minimap2/minimap2 -k15 -w5 -m100 -g10000 -r2000 --max-chain-skip 25 --dual=yes -PD --no-long-join -t24 -I500M /genomics/users/irubia/runs/Hopkins_1_new_iso/Hopkins1.gte500.renamed.fa /genomics/users/irubia/runs/Hopkins_1_new_iso/Hopkins1.gte500.renamed.fa
[M::main] Real time: 928.346 sec; CPU: 4164.746 sec; Peak RSS: 10.984 GB
[Tue Feb 19 16:39:57 CET 2019] Correcting the long reads
terminate called after throwing an instance of 'std::out_of_range'
  what():  stoi
/genomics/users/irubia/tools/CONSENT/CONSENT-correct: line 181:  4891 Aborted                 $LRSCf/bin/CONSENT -i $tmpdir/"$PAFIndex" -a $tmpdir/"$alignments" -s "$minSupport" -S "$maxSupport" -l "$windowSize" -k "$merSize" -c "$commonKMers" -A "$minAnchors" -f "$solid" -m "$windowOverlap" -j "$nproc" -r "$reads" -M "$maxMSA" -p "$LRSCf" >> "$out"

The dataset is cDNA from the Nanopore WGS Consortium (https://github.com/nanopore-wgs-consortium/NA12878). I filtered it so that the shortest read is 500 bp. I tried running with default parameters, with --windowSize 500, and with --windowSize 499.
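For context on the message itself: `std::out_of_range` with `what(): stoi` is what the C++ standard library raises when `std::stoi` is handed a numeric string whose value does not fit in an `int`. Nothing in this thread shows which field CONSENT was parsing, so the snippet below is only a minimal sketch of the failure mode and of a defensive wrapper. The helper name `parse_int_field` is hypothetical and is not taken from the CONSENT source.

```cpp
#include <stdexcept>
#include <string>

// Hypothetical helper, not taken from CONSENT: parse a decimal field into an
// int, reporting failure through the return value instead of an uncaught throw.
bool parse_int_field(const std::string& field, int& out) {
    try {
        out = std::stoi(field);
        return true;
    } catch (const std::out_of_range&) {
        return false;  // digits were parsed, but the value does not fit in an int
    } catch (const std::invalid_argument&) {
        return false;  // no leading digits to convert
    }
}
```

Called on a string such as "99999999999999999999", the wrapper returns false, whereas a bare `std::stoi` call with no surrounding `try` would terminate the program with the abort shown in the log above.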

morispi commented 5 years ago

Hi,

I'm downloading the data and will run it ASAP. My internet connection at home is pretty slow, so I can't promise I'll have time to do it tonight. It should be okay by tomorrow.

I'll keep you updated.

Are you running on the two datasets at once or only one of them though?

Pierre

novikk commented 5 years ago

Hi,

I'm running it only on Hopkins Run1 cDNA pass data, but it seems like they've merged all the runs together.

May I send you an email with the dataset?

Ivan

morispi commented 5 years ago

Hi,

You sure can. My email is: pierre[dot]morisse2[at]univ-rouen[dot]fr

Pierre

novikk commented 5 years ago

I just sent the file via WeTransfer.

morispi commented 5 years ago

Fixed

novikk commented 5 years ago

Worked perfectly, thanks!