broadinstitute / pilon

Pilon is an automated genome assembly improvement and variant detection tool
GNU General Public License v2.0

Pilon appearing to make no corrections or confirmations of bases #129

Open Rob-murphys opened 4 years ago

Rob-murphys commented 4 years ago

I have generated a hybrid assembly with Pilon using the following:

pilon -Xmx30G --genome $long_assembly --frags $short_assembly_sorted --output $prefix --outdir $path/pilon_outputs --changes --vcf

However, the changes file appears empty after the run finishes (the run itself is oddly fast). Looking into the logs, I see line after line similar to this:

tig00000063:1-3674 log:
Finished processing tig00000063:1-3674
Processing tig00000064:1-3596
frags /home/lamma/faststorage/kasun_amyco/bwa-output/PSU4_ISF1A_Q30alignShortOnLong_sorted.bam: coverage 0
Total Reads: 0, Coverage: 0, minDepth: 5
Confirmed 0 of 3596 bases (0.00%)
Corrected 0 snps; 0 ambiguous bases; corrected 0 small insertions totaling 0 bases, 0 small deletions totaling 0 bases

This seems to indicate that Pilon is making no corrections and that no bases are being confirmed. Am I using the tool wrong?
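
To see whether every contig reports zero reads or only some of them, the log can be tallied. The following is a minimal sketch, assuming the Pilon output was captured to a file and that each contig produces a line of the form `Total Reads: N, Coverage: N, minDepth: N` as in the excerpt above. The function name `summarize_pilon_log` and the file name `pilon.log` are hypothetical.

```shell
# Count how many contigs Pilon processed and how many of them had zero reads.
# Reads a Pilon log on stdin; relies only on the "Total Reads: N," line format
# shown in the log excerpt above.
summarize_pilon_log() {
  awk '
    /^Total Reads: /   { total++ }   # one such line per processed contig
    /^Total Reads: 0,/ { zero++ }    # contigs for which no reads were seen
    END { printf "contigs=%d zero_reads=%d\n", total, zero }
  '
}

# Usage (pilon.log is a placeholder for wherever the log was saved):
# summarize_pilon_log < pilon.log
```

If `zero_reads` equals `contigs`, the problem most likely lies with the BAM as a whole rather than with individual contigs.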

nhartwic commented 4 years ago

What is actually happening here is that Pilon thinks your coverage in that range is 0. I don't know why this is happening to you. Most likely, you haven't specified the input correctly. I got behavior very similar to yours when I was polishing with a paired-end read set and accidentally mapped the R1 reads twice. My mapping commands basically looked like...

# this is incorrectly mapping the R1 reads twice and confuses pilon
minimap2 -ax sr draft.fasta reads_r1.fastq reads_r1.fastq
# This is what I meant to do
minimap2 -ax sr draft.fasta reads_r1.fastq reads_r2.fastq

...So maybe you did the same thing by accident, or made some other error with mapping. Or maybe there is a bug in Pilon.
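
One way to check the mapping before rerunning Pilon is to look at per-contig mapped read counts in the BAM. This is a rough sketch, not a definitive diagnostic: it assumes `samtools` is installed and the BAM is sorted and indexed, and it relies on the tab-separated output of `samtools idxstats` (reference name, length, mapped read-segments, unmapped read-segments, with a final `*` row for unplaced reads). The function name `summarize_idxstats` and the file name `reads_sorted.bam` are placeholders.

```shell
# Summarise `samtools idxstats` output read from stdin.
# Columns: reference name, length, mapped read-segments, unmapped read-segments.
summarize_idxstats() {
  awk -F'\t' '
    $1 != "*" { contigs++; mapped += $3; if ($3 == 0) empty++ }  # per-contig rows
    $1 == "*" { unplaced = $4 }                                  # reads with no placement
    END {
      printf "contigs=%d zero_mapped=%d mapped_reads=%d unplaced_unmapped=%d\n",
             contigs, empty, mapped, unplaced
    }
  '
}

# Usage (requires samtools and an indexed BAM; the path is a placeholder):
# samtools idxstats reads_sorted.bam | summarize_idxstats
```

If `mapped_reads` is 0, or the contig names in the BAM do not match those in the FASTA passed to `--genome`, Pilon reporting zero coverage would be the expected result.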