Closed AnRibo closed 3 years ago
I think the option --proteins just annotates each gene after it has been located. I mean, Prokka first looks for nucleotide regions that could correspond to a gene, and then it uses the annotations of the reference proteins to describe (annotate) that gene. I think the option --proteins does nothing to help identify your large gene.
@AnRibo what is your organism, and what is "very large"?
--proteins (as alluded to by @ireneortega) is used to lift annotations from one genome to another after the CDS regions have been found.
prokka uses prodigal as its main CDS finding engine. You can try to supply prokka with a training file for your genome, which includes the gene:
https://github.com/hyattpd/prodigal/wiki/Gene-Prediction-Modes#training-mode
And then specify the training file in prokka with --prodigaltf:
https://github.com/tseemann/prokka#command-line-options
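The two steps above could look roughly like the sketch below, which only builds and prints the command lines rather than running them. The file names (genome.fna, genome.trn), the output directory and the prefix are placeholders, not values from this thread. It assumes Prodigal 2.6.x, where -t writes a training file if the named file does not exist yet; check `prodigal -h` for your version before relying on it.

```python
import shlex


def prodigal_train_cmd(genome_fasta, training_file):
    """Command that asks Prodigal to train on the genome.

    Assumption: Prodigal 2.6.x, where -t writes a training file
    when the named file does not exist yet.
    """
    return ["prodigal", "-i", genome_fasta, "-t", training_file]


def prokka_cmd(genome_fasta, training_file, outdir, prefix):
    """Command that runs Prokka with the Prodigal training file."""
    return [
        "prokka",
        "--prodigaltf", training_file,
        "--outdir", outdir,
        "--prefix", prefix,
        genome_fasta,
    ]


if __name__ == "__main__":
    # Placeholder file names, for illustration only.
    steps = [
        prodigal_train_cmd("genome.fna", "genome.trn"),
        prokka_cmd("genome.fna", "genome.trn", "prokka_out", "mygenome"),
    ]
    for cmd in steps:
        print(shlex.join(cmd))
```

Each list can be passed to subprocess.run(cmd, check=True) once both tools are on the PATH. Note that a training file changes how Prodigal scores candidate genes; it does not guarantee that a particular gene will be called.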
And:
Thank you very much @ireneortega and @andersgs !
@AnRibo did it work?
I have a very large gene that does not get annotated. Prokka instead annotates a hypothetical gene on the opposite strand, even though there are no start codons there. When I BLAST this hypothetical gene I find nothing. Other software finds my gene with no issues. Adding the gene with --proteins does not help, although --proteins does work on other genes. The problem could be that there are several thousand bp with no stop codons on the reverse strand, but shouldn't a gene added with --proteins take priority?
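To check the "several thousand bp with no stop codons" observation, a small script along these lines could measure the longest stop-free stretch in each reverse-strand frame. It is only a sketch: it assumes the standard stop codons (TAA, TAG, TGA), plain A/C/G/T sequence, and hypothetical function names. It does not reproduce how Prodigal scores genes.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}  # assumption: standard stop codons
_COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence."""
    return seq.translate(_COMPLEMENT)[::-1]


def longest_stop_free_run(seq, frame):
    """Longest run (in bp) of consecutive non-stop codons in one frame (0, 1 or 2)."""
    seq = seq.upper()
    best = current = 0
    for i in range(frame, len(seq) - 2, 3):
        if seq[i:i + 3] in STOP_CODONS:
            current = 0
        else:
            current += 3
            best = max(best, current)
    return best


def reverse_strand_report(seq):
    """Map each reverse-strand frame to its longest stop-free run in bp."""
    rc = reverse_complement(seq)
    return {frame: longest_stop_free_run(rc, frame) for frame in range(3)}
```

Calling reverse_strand_report on the region covering the large gene gives one length per reverse frame; a value of several thousand bp would be consistent with the long open stretch described above.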