broadinstitute / pilon

Pilon is an automated genome assembly improvement and variant detection tool
GNU General Public License v2.0

Fix Ambiguous Bases - finding, possibly fixing, but not writing changes #69

Roli-Wilhelm closed this issue 5 years ago

Roli-Wilhelm commented 6 years ago

Hi There,

I'm using your fix ambiguous bases option, and your laudable software appears to run smoothly (i.e. it finds and corrects ambiguous bases), but the output is identical to the input. Am I missing something, or is there perhaps a bug in what is written to file? I've had the same result running it with "amb" alone and with both "amb" and "snps", though in the former case the line "Corrected X snps; X ambiguous bases..." is not printed to screen.

Thanks in advance!

Example Command:

java -Xmx50G -jar /home/user/Software/pilon/pilon-1.22.jar --genome bin.160.contigs.fa --frags miseq.sorted.bam --frags F6.sorted.bam --frags F7.sorted.bam --output bin.160.final.contigs.fa --outdir pilon.output --fix "amb","snps" --threads 20

Example Output:

"Warning: experimental fix option amb
Genome: metabat.160.contigs.fa
Fixing amb, snps
Input genome size: 8667087
...

Processing Contig_320233:1-3309
Contig_214818:1-6519 log:
frags F13.sorted.bam: coverage 0
frags F12.sorted.bam: coverage 1
frags F11.sorted.bam: coverage 2
frags F10.sorted.bam: coverage 2
frags F9.sorted.bam: coverage 1
frags F8.sorted.bam: coverage 2
frags F7.sorted.bam: coverage 4
frags F6.sorted.bam: coverage 2
frags miseq.sorted.bam: coverage 0
Total Reads: 882, Coverage: 14, minDepth: 5
Confirmed 5805 of 6506 bases (89.23%)
Corrected 9 snps; 47 ambiguous bases; found 0 small insertions totaling 0 bases, 0 small deletions totaling 0 bases
Finished processing Contig_214818:1-6519"

Roli-Wilhelm commented 6 years ago

I have just run the same command with the "--changes" flag, and the output 'changes' file is empty. This supports the idea that ambiguous bases are being identified, and perhaps called, but not written anywhere.
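For anyone wanting to repeat this check, here is a minimal shell sketch. The function name check_pilon_run and its three arguments are hypothetical and not part of Pilon. It assumes the contigs appear in the same order in both FASTA files, and it compares sequence content only, since header lines and line wrapping may differ between input and output.

```shell
# Checksum of the sequence content only: drop header lines and line breaks.
seq_sum() {
    grep -v '^>' "$1" | tr -d '\n\r' | cksum
}

# check_pilon_run INPUT_FASTA OUTPUT_FASTA CHANGES_FILE  (hypothetical helper)
# Reports whether the sequences differ and whether the changes file has content.
check_pilon_run() {
    if [ "$(seq_sum "$1")" = "$(seq_sum "$2")" ]; then
        echo "sequence: unchanged"
    else
        echo "sequence: differs"
    fi
    if [ -s "$3" ]; then
        echo "changes: non-empty"
    else
        echo "changes: empty or missing"
    fi
}
```

Call it with the input genome, the FASTA that Pilon wrote, and the changes file from the same run; the paths depend on the --output and --outdir values used.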

ammaraziz commented 6 years ago

I am having a similar issue. From trial and error, it occurs when specifying multiple fix types with the --fix option. The documentation is not specific and does not provide an example. It says the flag takes "a comma-separated list of categories of issues to try to fix". Does this mean:

  1. "bases", "amb"
  2. " "bases", "amb" "
  3. "bases, amb"

I've tried the above three and still get the same issue. How do you specify which fixes to apply?

w1bw commented 5 years ago

Sorry, I'm finally catching up on some ancient Pilon support tickets.

The correct syntax is "--fix bases,amb". I'll look into whether there is something wrong with the amb option.
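To see why the quoting variants tried above behave differently, it can help to look at what the shell actually passes to the program. The sketch below uses a hypothetical helper, show_args, that prints each argument it receives in brackets; it does not call Pilon, and how Pilon itself treats a value with an embedded space is not shown here.

```shell
# Print each argument on its own line, wrapped in brackets,
# to show how the shell splits and unquotes a command line.
show_args() {
    for arg in "$@"; do
        printf '[%s]\n' "$arg"
    done
}

show_args --fix bases,amb        # two arguments: --fix and bases,amb
show_args --fix "bases","amb"    # quotes are removed: --fix and bases,amb
show_args --fix "bases", "amb"   # the space splits the list: --fix, bases, and amb
show_args --fix "bases, amb"     # one value with an embedded space: bases, amb
```

The first two forms reach the program as the same single value, bases,amb. In the third, the program receives "bases," and "amb" as separate arguments, so the list is cut at the space.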

w1bw commented 5 years ago

I wasn't able to reproduce this. I ran examples with "--fix snps" and "--fix snps,amb", and the latter included ambiguous base changes which weren't in the former.