BFL-lab / Mfannot

MFannot is a program for the annotation of mitochondrial and plastid genomes
GNU General Public License v3.0

ORF prediction with a non-standard genetic code is sometimes wrong. #16

Open cgjosephlee opened 5 years ago

cgjosephlee commented 5 years ago

I'm using the latest MFannot via the Docker image.

I'm annotating a fungal mt genome with genetic code 4, and some of the predicted ORFs are problematic. I cross-checked them with NCBI ORFfinder; most of the problematic ones are ORFs with a non-ATG start codon.

e.g.

 42090  TAAATTATGATTGTTGGGGTAAATATTATAAAATATCCGCTTATTCATTTGGTATTAA
; G-orf782 ==> start ;; contain dpo
 42148  AACATTACTTTTAACAAAAATATTGCTTTTATTCAAGTGGATAAAGGAAATGATAAAAAT
...
 44476  AAACCTTTAAAC
;;     G-dpo_2 ==> end
 44488  TTTATTGTT
; G-orf782 ==> end
 44497  TAACCCTTTGGCTTTACACTACTTTGTCTTATCTTTTTAGTTCGGCTAATCTTAGTGGCA

The start codon should be ATT (42151 bp) and the stop codon should be TAA (44497 bp). The protein sequence in the .sqn file is ncbieaa "-ITFNKNIAFI....
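The coordinate claim above can be checked by hand with a short script. This is only a minimal sketch, not MFannot's own logic: it assumes the number printed at the left of each sequence line is the 1-based coordinate of that line's first base, and it hard-codes the start codons and amino-acid string of NCBI translation table 4. The names `codon_at` and `translate` are hypothetical helpers introduced for illustration.

```python
# Sketch only: check the codon at a given coordinate against NCBI
# translation table 4 (assumed hard-coded values, not read from MFannot).
BASES = "TCAG"
AAS_TABLE4 = "FFLLSSSSYY**CCWWLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
START_CODONS_TABLE4 = {"TTA", "TTG", "CTG", "ATT", "ATC", "ATA", "ATG", "GTG"}

# Codons in NCBI order: first base outermost, third base innermost.
CODONS = [a + b + c for a in BASES for b in BASES for c in BASES]
CODON_TO_AA = dict(zip(CODONS, AAS_TABLE4))

# Sequence line copied from the listing above; assumed to start at 42148.
LINE_START = 42148
LINE_SEQ = "AACATTACTTTTAACAAAAATATTGCTTTTATTCAAGTGGATAAAGGAAATGATAAAAAT"


def codon_at(coord, line_start=LINE_START, seq=LINE_SEQ):
    """Return the three bases beginning at 1-based coordinate coord."""
    offset = coord - line_start
    return seq[offset:offset + 3]


def translate(seq):
    """Translate the complete codons of seq with table 4, codon by codon."""
    usable = len(seq) - len(seq) % 3
    return "".join(CODON_TO_AA[seq[i:i + 3]] for i in range(0, usable, 3))


print(codon_at(42148), codon_at(42148) in START_CODONS_TABLE4)  # AAC False
print(codon_at(42151), codon_at(42151) in START_CODONS_TABLE4)  # ATT True
print(translate(LINE_SEQ[3:33]))  # ITFNKNIAFI
```

Translating from 42151 reproduces the residues that follow the leading "-" in the .sqn protein, which is consistent with the annotated start sitting one codon too far upstream.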

 56519  TTGGGGGAGTTAACAAATAATAAAATAATAAAATAATAAAAT
;     G-orf510 ==> start /note=LAGLIDADG ;; evalue:1.6e-28
 56561  AATAAAATAATAAATACAAATATGAGAATCTTAATACAAATATTTACAAATTCTGACTTA
...
 58061  ATAAAATCTAACATGAATATGAATAGAAGTTAA
;     G-orf510 ==> end
 58094  TATAATTTCATATGGTTTGCTAGTTAACCCCGTTCAAAATCAGACCAACTACTAATACAA

AAT is not a valid start codon; the start should be ATA (56567 bp). The protein sequence in the .sqn file is ncbieaa "-KIINTNMRI....
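One way to find where the start would have to move is to scan in frame from the annotated start until the first codon that table 4 accepts as an initiator. The sketch below is an illustration under assumptions, not a description of how MFannot or ORFfinder chooses a start: the start-codon set is hard-coded from NCBI translation table 4, the printed number 56561 is taken as the 1-based coordinate of the first base on the line, and `first_in_frame_start` is a hypothetical helper.

```python
# Sketch only: scan in frame for the first table 4 start codon.
START_CODONS_TABLE4 = {"TTA", "TTG", "CTG", "ATT", "ATC", "ATA", "ATG", "GTG"}

# Sequence line copied from the listing above; assumed to start at 56561.
SEQ = "AATAAAATAATAAATACAAATATGAGAATCTTAATACAAATATTTACAAATTCTGACTTA"


def first_in_frame_start(seq, first_coord, starts=START_CODONS_TABLE4):
    """Return (coordinate, codon) of the first in-frame start codon, or None."""
    for i in range(0, len(seq) - 2, 3):
        codon = seq[i:i + 3]
        if codon in starts:
            return first_coord + i, codon
    return None


print(first_in_frame_start(SEQ, 56561))  # (56567, 'ATA')
```

The scan skips AAT and AAA and stops at ATA, two codons downstream of the annotated start, which agrees with the coordinate given above.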

I also got an error message:

...
7) Annotate genes with introns...
Use of uninitialized value $pb in string eq at /usr/local/bin/mfannot line 1610.
8) Identify gene fusions...
...

That line seems to deal with frame-shift annotations; I don't know whether this is related to the problem.

natacha-beck commented 5 years ago

@cgjosephlee, thanks for reporting this issue. Could I have the full sequence to work on it? It would be greatly appreciated.

cgjosephlee commented 5 years ago

Try this: https://www.ncbi.nlm.nih.gov/nuccore/CM008263.1

natacha-beck commented 5 years ago

Hi @cgjosephlee,

The issue_16 branch should fix this issue. At this point I only have a Singularity container here containing the new code.

I have some other work to do before merging the code into the main branch, but if you want to test with the Singularity version, it would be appreciated.

Thanks for your comments and your help.