rikuu / Gap2Seq

Gap2Seq is a gap filling and insertion genotyping tool.
GNU Affero General Public License v3.0

Gap2Seq 3.1 inserting new deletions #8

Open vrohnie opened 5 years ago

vrohnie commented 5 years ago

Hey,

I've been testing Gap2Seq's insertion genotyping ability by randomly inserting "gaps" into the FASTA reference (replacing nucleotides).
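For anyone wanting to reproduce this kind of test, the gap insertion step could look roughly like the sketch below. It is only an illustration, not the script actually used here: the function name `mask_random_gaps`, the fixed gap length, and the choice of `N` as the gap character are all assumptions.

```python
import random

def mask_random_gaps(seq, n_gaps, gap_len, seed=0):
    """Replace n_gaps randomly placed windows of gap_len bases with 'N'.

    Hypothetical helper: returns the masked sequence and the sorted gap
    start positions. Windows may touch or overlap; this sketch does not
    try to prevent that.
    """
    if gap_len > len(seq):
        raise ValueError("gap_len is longer than the sequence")
    rng = random.Random(seed)  # seeded so the test is reproducible
    bases = list(seq)
    starts = sorted(
        rng.randrange(0, len(seq) - gap_len + 1) for _ in range(n_gaps)
    )
    for start in starts:
        # Overwrite the window in place; the sequence length is unchanged.
        bases[start:start + gap_len] = "N" * gap_len
    return "".join(bases), starts
```

Keeping the returned start positions makes it possible to compare each filled gap against the original bases afterwards.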

  1. The results were rather impressive: it can correctly fill gaps of up to 1000 bp.

  2. I noticed what seems to be a rather serious bug: Gap2Seq inserts a 10 bp deletion after each gap it has filled. The deletion occurs about one k-mer length after the end of the gap (I've tested this with two different k-mer sizes). My guess is therefore that the bug lies in recombining the assembled sequence with the rest of the reference, although I must admit I have not looked at your code. This also only seems to happen when using the --library option.
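A check for the deletion described in point 2 could be sketched as follows, comparing a window of the original (ungapped) reference with the same region of the filled output. This is a minimal sketch using Python's standard `difflib`, not the procedure actually used for this report; the helper name `find_deletions` is hypothetical, and `difflib` can be slow on long sequences, so a proper aligner would be the better choice for whole genomes.

```python
from difflib import SequenceMatcher

def find_deletions(original, filled):
    """Return (position, length) for each stretch of `original` that is
    missing from `filled`.

    Hypothetical helper: positions are 0-based offsets into `original`.
    Only pure deletions are reported; substitutions and insertions are
    ignored.
    """
    matcher = SequenceMatcher(None, original, filled, autojunk=False)
    deletions = []
    for tag, i1, i2, _j1, _j2 in matcher.get_opcodes():
        if tag == "delete":
            deletions.append((i1, i2 - i1))
    return deletions

# Toy example: 10 bases removed from the middle of a 30-base sequence.
original = "AAAAACCCCC" + "GGGGGGGGGG" + "TTTTTCACAC"
filled = "AAAAACCCCC" + "TTTTTCACAC"
print(find_deletions(original, filled))
```

Run on a window that starts at the end of a filled gap, the reported position would show how far downstream the deletion sits, which can then be compared with the k-mer size.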

Could you please check this behavior? As mentioned in the title, I used Gap2Seq v3.1.

Command used:

python3 Gap2Seq --scaffold MyGappedReference.fasta --filled MyFilledReference.fasta --library MyLibraryFile.txt -k 73 -t 8 --max-mem 10

MyLibraryFile.txt:

MyBamFile.bam\t450\t70\t150

Best regards, Veronika