Hello,
I wanted to compare the results of an SV detection program when given minimap2 mappings (minisv) or NGMLR mappings (ngmlrsv) as input, on reads simulated by NanoSim.
In order not to favour minisv (minimap2 being NanoSim's default mapper), I wanted to train NanoSim on an NGMLR mapping instead.
read_analysis.py works fine with an NGMLR bam file, but when I run simulator.py asking for 100,000 simulated reads, it gets stuck at a different number of generated reads each time, usually above 20,000.
I am using NanoSim 2.2.0 with the following command lines (the second step was run with 80 GB of RAM requested):
read_analysis.py -i sub10k.fa -m sub10kreads.genome.aln.sam -r Perca_flavescens.PFLA1.1.dna.toplevel.okseqids.fa.gz -t 4 > read_analysis.out 2> read_analysis.log
simulator.py linear -r Perca_flavescens.PFLA1.1.dna.toplevel.okseqids.fa -n 100000 -c training -o simulating > simulated.out 2> simulated.err
I have put the input files needed for the second step here: http://genoweb.toulouse.inra.fr/~sdjebali/issues/nanosim/tosend.tar.gz
Let me know if you need anything else.
Best, Sarah