gconcepcion closed 7 years ago
My guess is that it's a bug in Pilon's handling of the PacBio read alignment CIGAR string. I can try to track it down from the stack trace, but it might take an example to find.
However, I have recommended against using raw aligned PacBio reads as the majority of Pilon's input, because Pilon isn't specifically aware of the PacBio error model. There will likely be spurious indel corrections, especially in homopolymer runs. Generally people use Pilon to correct PacBio assemblies with Illumina data, or with error-corrected or circular consensus PacBio reads.
Someday I may try to add specific PacBio awareness so that it will do a better job with raw reads.
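For anyone trying to narrow down such an example, here is a minimal sketch of a sanity check on the alignments themselves. It is not Pilon's code and makes no claim about where the bug actually is; it only relies on the SAM rule that the CIGAR operations M, I, S, = and X together must account for every base in SEQ. The function names are made up for illustration.

```python
import re

# CIGAR operations that consume bases of the read (per the SAM specification).
QUERY_CONSUMING = set("MIS=X")
CIGAR_OP = re.compile(r"(\d+)([MIDNSHP=X])")


def cigar_query_length(cigar):
    """Return the number of read bases a CIGAR string accounts for."""
    ops = CIGAR_OP.findall(cigar)
    # If re-joining the parsed operations does not reproduce the input,
    # the string contained something that is not a valid CIGAR operation.
    if "".join(n + op for n, op in ops) != cigar:
        raise ValueError(f"malformed CIGAR: {cigar!r}")
    return sum(int(n) for n, op in ops if op in QUERY_CONSUMING)


def find_inconsistent_records(sam_lines):
    """Yield (read_name, reason) for records whose CIGAR and SEQ disagree."""
    for line in sam_lines:
        if line.startswith("@"):
            continue  # header line
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 11:
            continue  # not a complete alignment record
        name, cigar, seq = fields[0], fields[5], fields[9]
        if cigar == "*" or seq == "*":
            continue  # unaligned read or sequence not stored
        try:
            expected = cigar_query_length(cigar)
        except ValueError as err:
            yield name, str(err)
            continue
        if expected != len(seq):
            yield name, f"CIGAR covers {expected} bases, SEQ has {len(seq)}"
```

Feeding it the text lines produced by `samtools view` for a failing contig would list any reads whose CIGAR and sequence lengths disagree. An empty result does not rule out a Pilon bug; it only rules out this one kind of malformed input.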
Thanks for the insight. I got much better results (and no run-time errors) when I mapped the corrected reads back to the consensus sequence.
If I come up with a small test-case example with data I can share, I'll send it your way.
Hey, I got a similar error when trying to use Illumina short reads to polish a genome assembly (1956 contigs, genome size 965 Mbp) generated from Nanopore long reads. The cause reported is:
Caused by: java.lang.ArrayIndexOutOfBoundsException: -110
What does this mean please?
I'm using Pilon to attempt correction of a PacBio genome assembly (contig by contig) using raw PacBio reads aligned to a consensus sequence. Total genome size is roughly 130 Mb, split among 47 contigs.
Here is an example of one such command:
Pilon completes successfully for the first and largest 14Mb contig, fails for the next 3 contigs (sorted by size), and then works for the rest of the dataset. The error message I get is:
Any idea what's happening? Thanks!