Hikoyu / FATE

Framework for Annotating Translatable Exons
MIT License

criteria for FATE biotype #1

Open maedat opened 6 years ago

maedat commented 6 years ago

Hi

How does FATE decide the biotype of each hit? I could not find a description of the criteria for "blue/yellow/red" in the 9th column.

Hikoyu commented 6 years ago

It is a little bit complex. Briefly, if the predicted gene structure clearly indicates pseudogenization (e.g. an internal stop codon, a frameshift, or a too-short CDS), 'red' is assigned. However, if the too-short CDS results from a bad assembly, 'yellow' is assigned. If none of the above applies, 'blue' is assigned.
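The rules above can be read as a small decision procedure. The following Python sketch only illustrates that reading and is not FATE's actual code: the function name, its parameters, and the order in which the conditions are checked (in particular, that an internal stop codon or frameshift takes precedence over the bad-assembly case) are assumptions.

```python
def assign_biotype(internal_stop: bool,
                   frameshift: bool,
                   cds_too_short: bool,
                   short_from_bad_assembly: bool = False) -> str:
    """Hypothetical helper mirroring the blue/yellow/red description.

    This is an illustration of the rules stated in the thread,
    not the implementation used by FATE.
    """
    # Clear signs of pseudogenization: internal stop codon or frameshift.
    if internal_stop or frameshift:
        return "red"
    # A too-short CDS is 'yellow' when it is attributed to a bad assembly,
    # and 'red' otherwise (assumed precedence).
    if cds_too_short:
        return "yellow" if short_from_bad_assembly else "red"
    # None of the conditions above apply.
    return "blue"


if __name__ == "__main__":
    print(assign_biotype(False, False, False))
    print(assign_biotype(False, False, True, short_from_bad_assembly=True))
    print(assign_biotype(True, False, False))
```

How FATE actually detects a bad assembly is not described in this thread, so the sketch takes it as a ready-made flag.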

maedat commented 6 years ago

Great! Now I can agree that "red" genes have serious problems as active genes. Can I ask for more details? What are the criteria for a "too short" CDS?

Hikoyu commented 6 years ago

The criterion is set by the -c option. By default, a CDS shorter than 85% of the query sequence length is regarded as a pseudogene. This criterion is applied to each end of the sequence (see the following figure). In addition, the -l option defines the minimum length for a CDS to be regarded as complete.

[Figure: fate_fig]
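As a rough illustration of the default 85% threshold, the Python sketch below compares a CDS length with the query length. It is not taken from FATE: the function name and parameters are hypothetical, the strict "less than" comparison follows the wording above, and it assumes both lengths are in the same unit. The per-end application of the criterion and the -l minimum length are not modelled, because their exact definitions are not given in this thread.

```python
def is_too_short(cds_length: int,
                 query_length: int,
                 min_fraction: float = 0.85) -> bool:
    """Hypothetical check: is the CDS shorter than min_fraction of the query?

    min_fraction stands in for the value set by the -c option
    (85% by default, according to the thread).
    """
    if query_length <= 0:
        raise ValueError("query_length must be positive")
    return cds_length / query_length < min_fraction


if __name__ == "__main__":
    # A 160-residue CDS against a 200-residue query covers 80% of the query.
    print(is_too_short(160, 200))
    # A 180-residue CDS against the same query covers 90%.
    print(is_too_short(180, 200))
```

With the default threshold, the first call reports a too-short CDS (80% is below 85%) and the second does not (90% is above it).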