Closed rgladstone closed 8 months ago
Hi Rebecca, could you provide an example of one of these genes that ggCaller is missing?
Thanks, here are the two annotations: one from Prokka, which also contains the sequence, and one from ggCaller. I've added a .txt extension as GitHub wouldn't accept .gff. It's the first gene (positions 1-777) that isn't being captured by ggCaller.
The missing gene sequence may have been truncated or elongated incorrectly by ggCaller. Would you be able to send across the full ggCaller output folder, please? I can take a look in more detail.
Sure, I've zipped it up here
Hi Rebecca, I've now implemented a change that should sort the issue in commit bf39e1c.
Closed as inactive.
I ran ggCaller on 66 capsular loci extracted from references. In each locus, the first base of the sequence is the first base of the first gene, and the last base is the last base of the last gene.
ggcaller --refs db_fasta.txt --aligner ref --alignment pan --clean-mode sensitive --annotation sensitive --save --threads 32 --merge-paralogs --search-radius 30000 --max-orf-orf-distance 30000
For all 66 references, the first gene is not annotated by ggCaller, even though it starts with the start codon 'atg' (methionine). Prokka does annotate these first genes, and ggCaller has no issue at the other end, where the last base of the stop codon is the last base of the contig. As a result, ggCaller reports one gene fewer than Prokka + Panaroo (using the same settings).