broadinstitute / pilon

Pilon is an automated genome assembly improvement and variant detection tool
GNU General Public License v2.0
340 stars 60 forks source link

Coverage is always 0 #63

Closed CVan19 closed 5 years ago

CVan19 commented 6 years ago

Hi, I have just assemble a human genome using canu and I'm going to use pilon to polish the draft assembly with the following command:

java -Xmx60G -jar pilon-1.22.jar \
     --genome test.fasta \
     --bam $in_dir/bwa_sort.bam \
     --fix "bases" \
     --threads 15 \
     --output test \
     --outdir $out_dir

However , I checked the result and found the squence didn't change at all. I even tried to polish single contig but I got the same result, here is the log of pilon:

Pilon version 1.22 Wed Mar 15 16:38:30 2017 -0400
Genome: test.fasta
Fixing snps, indels
Input genome size: 1022840
Processing tig00000488|arrow:1-779472
Processing tig00000361|arrow:1-243368
tig00000361|arrow:1-243368 log:
unpaired /home/xiatian/polish/Illumina_correct_pacbio/2_samtools_output/bwa_sort.bam: coverage 0
Total Reads: 199348, Coverage: 0, minDepth: 5
Confirmed 0 of 243368 bases (0.00%)
Corrected 0 snps; 0 ambiguous bases; corrected 0 small insertions totaling 0 bases, 0 small deletions totaling 0 bases
Finished processing tig00000361|arrow:1-243368
tig00000488|arrow:1-779472 log:
unpaired /home/xiatian/polish/Illumina_correct_pacbio/2_samtools_output/bwa_sort.bam: coverage 0
Total Reads: 765929, Coverage: 0, minDepth: 5
Confirmed 0 of 779472 bases (0.00%)
Corrected 0 snps; 0 ambiguous bases; corrected 0 small insertions totaling 0 bases, 0 small deletions totaling 0 bases
Finished processing tig00000488|arrow:1-779472
Writing updated tig00000361|arrow|pilon to /home/xiatian/polish/Illumina_correct_pacbio/4_pilon_output/test.fasta
Writing updated tig00000488|arrow|pilon to /home/xiatian/polish/Illumina_correct_pacbio/4_pilon_output/test.fasta
Mean unpaired coverage: 0
Mean total coverage: 0

It shows thar coverage is always 0, but I can't figure out why this happened. Can you help me ?

CVan19 commented 6 years ago

In addition , I have just run "samtools flagstat" on my bam file , the output is shown below:

2542264173 + 0 in total (QC-passed reads + QC-failed reads)
9669786 + 0 secondary
0 + 0 supplementary
0 + 0 duplicates
2535108804 + 0 mapped (99.72% : N/A)
2532594387 + 0 paired in sequencing
1264739823 + 0 read1
1267854564 + 0 read2
0 + 0 properly paired (0.00% : N/A)
2522453600 + 0 with itself and mate mapped
2985418 + 0 singletons (0.12% : N/A)
162547448 + 0 with mate mapped to a different chr
28767110 + 0 with mate mapped to a different chr (mapQ>=5)

And the way I mapped Illumina data to my draft assembly is :

bwa mem -M -P -t 15 -R '@RG\tID:normal\tSM:normal\tLB:normalLib\tPU:runname\tCN:GenePlus\tPL:illumina' \
ren.arrow.fasta \
${in_dir}/HAP1_264_DHG15556-S_1.fq.gz ${in_dir}/HAP1_264_DHG15556-S_2.fq.gz \
> ${out_dir}/bwa.sam

time samtools view -@ 15 -bS $in_dir/bwa.sam > $out_dir/bwa.bam
time samtools sort -@ 15 -m 1536M $out_dir/bwa.bam -o $out_dir/bwa_sort.bam
time samtools index $out_dir/bwa_sort.bam $out_dir/bwa_sort.bam.bai

java -jar picard.jar MarkDuplicates \
      I=$in_dir/bwa_sort.bam \
      O=$out_dir/bwa_sort_markdup.bam \
      M=$out_dir/bwa_sort_markdup.bam.metrics \
      ASO=coordinate \
      VALIDATION_STRINGENCY=LENIENT \
      REMOVE_DUPLICATES=true

time samtools index $out_dir/bwa_sort_markdup.bam $out_dir/bwa_sort_markdup.bam.bai

I wonder whether there is a mistake in my way to map?

melop commented 6 years ago

I had the same problem before with mate-pair reads. BWA mem marks them as "improperly paired" because the insert size is > 1kb. Is that the case for your library ? I had to write a script to put the flag in the bam, then pilon worked.

conchoecia commented 6 years ago

See this BI.SO page for reference on how to use unpaired or low-coverage reads during pilon correction.