broadinstitute / pilon

Pilon is an automated genome assembly improvement and variant detection tool
GNU General Public License v2.0

Overwriting corrected.changes each contig processed? #19

Closed tseemann closed 8 years ago

tseemann commented 8 years ago

I usually use Pilon with a single sequence, but this time we are correcting 3 contigs (chromosome and 2 plasmids) and I am getting unusual output:

pilon --genome all_contigs.fa --frags aln.bam --outdir /tmp/pilon --output corrected --fix bases --changes --threads 32 --verbose --debug

gives

<snip>
Finished processing contig1:1-2725222
Writing contig1:1-2725222 changes to /tmp/pilon/corrected.changes
Writing updated contig1_pilon to /tmp/pilon/corrected.fasta
Writing contig2:1-24885 changes to /tmp/pilon/corrected.changes
Writing updated contig2_pilon to /tmp/pilon/corrected.fasta
Writing contig3b:1-2252 changes to /tmp/pilon/corrected.changes
Writing updated contig3b_pilon to /tmp/pilon/corrected.fasta
Mean frags coverage: 138

But the output files are wrong:

-rw-r--r--. 1 tseemann domain^users       0 Sep  8 21:31 corrected.changes
-rw-r--r--. 1 tseemann domain^users 2786812 Sep  8 21:31 corrected.fasta

The changes file is empty, and the corrected fasta hasn't got any changes.

contig1_pilon   dna     2725222
contig2_pilon   dna     24885
contig3b_pilon  dna     2252

contig1 dna     2725222
contig2 dna     24885
contig3b        dna     2252

There should be changes: running Pilon on a single contig alone does seem to produce them.

The BAM looks OK too:

2983184 + 0 in total (QC-passed reads + QC-failed reads)
0 + 0 secondary
2214 + 0 supplementary
0 + 0 duplicates
2975585 + 0 mapped (99.75% : N/A)
2980970 + 0 paired in sequencing
1490485 + 0 read1
1490485 + 0 read2
2962950 + 0 properly paired (99.40% : N/A)
2969200 + 0 with itself and mate mapped
4171 + 0 singletons (0.14% : N/A)
28 + 0 with mate mapped to a different chr
28 + 0 with mate mapped to a different chr (mapQ>=5)

The aligner was BWA MEM, and the RAM for pilon was 16 GB.

Any ideas?

@AnnaSyme

tseemann commented 8 years ago

There might be something else going on, I'll do more investigation and re-open if I find it.

Sorry to bother you.

CVan19 commented 6 years ago

Hello, I ran into the same problem. The corrected FASTA has no changes, and the Pilon log shows the coverage is 0. Have you solved your problem?

Pilon version 1.22 Wed Mar 15 16:38:30 2017 -0400
Genome: test.fasta
Fixing snps, indels
Input genome size: 1022840
Processing tig00000488|arrow:1-779472
Processing tig00000361|arrow:1-243368
tig00000361|arrow:1-243368 log:
unpaired /home/xiatian/polish/Illumina_correct_pacbio/2_samtools_output/bwa_sort.bam: coverage 0
Total Reads: 199348, Coverage: 0, minDepth: 5
Confirmed 0 of 243368 bases (0.00%)
Corrected 0 snps; 0 ambiguous bases; corrected 0 small insertions totaling 0 bases, 0 small deletions totaling 0 bases
Finished processing tig00000361|arrow:1-243368
tig00000488|arrow:1-779472 log:
unpaired /home/xiatian/polish/Illumina_correct_pacbio/2_samtools_output/bwa_sort.bam: coverage 0
Total Reads: 765929, Coverage: 0, minDepth: 5
Confirmed 0 of 779472 bases (0.00%)
Corrected 0 snps; 0 ambiguous bases; corrected 0 small insertions totaling 0 bases, 0 small deletions totaling 0 bases
Finished processing tig00000488|arrow:1-779472
Writing updated tig00000361|arrow|pilon to /home/xiatian/polish/Illumina_correct_pacbio/4_pilon_output/test.fasta
Writing updated tig00000488|arrow|pilon to /home/xiatian/polish/Illumina_correct_pacbio/4_pilon_output/test.fasta
Mean unpaired coverage: 0
Mean total coverage: 0

tseemann commented 6 years ago

@CVan19 I think it was because I had | in my fasta IDs.
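
If the `|` in the FASTA IDs is indeed the cause, one possible workaround is to rename the contigs before aligning and polishing. The following is only a minimal sketch, not anything Pilon itself provides: the function name `sanitize_fasta_ids` and the choice of `_` as the replacement character are assumptions made for illustration. It rewrites only the ID part of each header line and leaves the sequence lines untouched.

```python
import io


def sanitize_fasta_ids(src, dst, bad="|", replacement="_"):
    """Copy a FASTA stream from src to dst, replacing `bad` in sequence IDs.

    Hypothetical helper: only the ID (the first whitespace-delimited token
    of each header) is modified. Returns a dict of old ID -> new ID for
    every header that was changed.
    """
    renamed = {}
    for line in src:
        if line.startswith(">"):
            # Split the header into the ID and an optional description.
            parts = line[1:].strip().split(None, 1)
            old_id = parts[0] if parts else ""
            new_id = old_id.replace(bad, replacement)
            if new_id != old_id:
                renamed[old_id] = new_id
            rest = " " + parts[1] if len(parts) > 1 else ""
            dst.write(">" + new_id + rest + "\n")
        else:
            # Sequence lines are copied through unchanged.
            dst.write(line)
    return renamed


# Small in-memory demonstration using an ID of the form seen in the log above.
demo = io.StringIO(">tig00000488|arrow\nACGT\n>contig2 plasmid\nGGCC\n")
out = io.StringIO()
mapping = sanitize_fasta_ids(demo, out)
print(mapping)
print(out.getvalue(), end="")
```

With real files, the same function could be called with two handles from `open(...)`. The reads would presumably need to be re-aligned against the renamed FASTA so that the reference names in the BAM match the new IDs.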