bcgsc / NanoSim

Nanopore sequence read simulator

simulator.py gets stuck (training done on NGMLR bam file) #49

Closed sdjebali closed 3 years ago

sdjebali commented 5 years ago

Hello,

I wanted to compare the results of an SV detection program when given minimap2 mappings (minisv) or NGMLR mappings (ngmlrsv) as input, using reads simulated by NanoSim.

So as not to favour minisv (minimap2 being NanoSim's default mapper), I wanted to run NanoSim with an NGMLR mapping as input.

Although read_analysis.py works fine with an NGMLR bam file, when I run simulator.py asking for 100,000 simulated reads, it gets stuck, each time at a different number of generated reads, usually above 20,000.

I am using NanoSim 2.2.0 with the following command lines (the second step requests 80 GB of RAM):

read_analysis.py -i sub10k.fa -m sub10kreads.genome.aln.sam -r Perca_flavescens.PFLA1.1.dna.toplevel.okseqids.fa.gz -t 4 > read_analysis.out 2> read_analysis.log

simulator.py linear -r Perca_flavescens.PFLA1.1.dna.toplevel.okseqids.fa -n 100000 -c training -o simulating > simulated.out 2> simulated.err
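To see how far the simulation got before stalling, one option is to count the FASTA records written so far. This is only a minimal sketch: the helper `count_fasta_records` is hypothetical, and the file name `simulating_reads.fasta` is an assumption based on the `-o simulating` prefix, so it may need adjusting to whatever file simulator.py actually writes.

```python
from pathlib import Path


def count_fasta_records(path):
    """Count FASTA records by counting header lines that start with '>'."""
    count = 0
    with open(path, "r") as handle:
        for line in handle:
            if line.startswith(">"):
                count += 1
    return count


if __name__ == "__main__":
    # Assumed output name derived from the "-o simulating" prefix (not confirmed).
    reads_file = Path("simulating_reads.fasta")
    if reads_file.exists():
        print(f"{count_fasta_records(reads_file)} reads written so far")
    else:
        print(f"{reads_file} not found")
```

Running it a few times while simulator.py is active would show whether the read count is still increasing or has stopped.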

The input files needed for the second step are available here: http://genoweb.toulouse.inra.fr/~sdjebali/issues/nanosim/tosend.tar.gz Let me know if you need anything else.

Best, Sarah

cheny19 commented 4 years ago

I'm so sorry that we missed your issue somehow. If you still have this problem, please try our new release and see if it helps.