Open robfenton opened 2 years ago
Hi @robfenton! Thank you for opening an issue and also for detailing the version, command, and OS. Just to clarify, the only difference I am seeing is that sequence 09578-19-1.fasta
seems to be missing all of the variants in the first figure.
In order to to track this bug now, we'll first need to determine if it came from parsnp, harvest-tools, or gingr. Would you be able to share the .xmfa
output of your analysis? You're welcome to email the output to brycekille@gmail.com as opposed to posting it here if you'd prefer.
That's not quite it. 1956.fasta
and 1956_Chromosome.gb.fna
show different sequences, even though they're supposed to be exactly the same.
I'll double check with my boss on Monday and send both .xmfa
output files if I can.
Thanks!
We have 41 complete, closed, Salmonella sequences (40 unique). For a reference to get annotations, we exported one (1956) as both a genbank file, and a fasta file. Looking at the output in gingr (see screenshots) there have been several nucleotides changed from what is in the genbank and fasta.
After verifying the sequences involved, we re-ran the analysis with no changes, and received a slightly different output (see screenshots, isolate 09578-19-1).
1956_Chromosome.gb contains the GenBank record for the chromosome, and 1956.fasta contains both the chromosome and plasmid sequences. The chromosome sequences from both files is identical.
Using 1.7.2 installed via conda Ubuntu 20.04 LTS in VirtualBox on Windows 10. ginger 1.3 from the Linux64-v1.3 tarball GenBank and fasta files all produced by Geneious 2022.1.1 Annotated with PGAP (2021-01-11.build5132)
The only options used were -p 8, -g genbankfile.gb, and -d ./oursequences