nigyta / dfast_core

DDBJ Fast Annotation and Submission Tool

pseudogene prediction #37

Open eddykay310 opened 3 years ago

eddykay310 commented 3 years ago

Hi

Does DFAST handle replicate pseudogenes, i.e. cases where internal stop codons fragment a gene and therefore generate multiple hits for a single gene?

Thanks

nigyta commented 2 years ago

Sorry for the late response. Yes, a CDS is fragmented when it contains an internal stop codon. DFAST first tries to find such fragmented CDSs based on their coverage against the reference sequence. They are then re-aligned to the same reference protein after extending both the 5' and 3' ends of the CDSs. When a stop codon or frameshift is found in the extended region, the CDS is annotated as a pseudogene.
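To make the steps concrete, here is a rough sketch of the idea in Python. It is not DFAST's actual code: the function names, the `min_coverage=0.85` threshold, and the `flank_codons=100` extension are all hypothetical, and it is simplified to forward-strand CDSs with an ungapped comparison in place of a real re-alignment, so it only catches in-frame stop codons and not frameshifts.

```python
# Standard genetic code, built from the conventional TCAG ordering.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}


def translate(nt):
    """Translate a nucleotide string in frame 0; stops become '*'."""
    nt = nt.upper()
    return "".join(CODON_TABLE.get(nt[i:i + 3], "X")
                   for i in range(0, len(nt) - 2, 3))


def extend_in_frame(genome_len, start, end, flank_codons):
    """Extend [start, end) by whole codons on both sides, clipped to the sequence."""
    up = min(flank_codons, start // 3)
    down = min(flank_codons, (genome_len - end) // 3)
    return start - 3 * up, end + 3 * down


def best_ungapped_offset(query, ref):
    """Offset in `query` where `ref` matches the most residues (no gaps)."""
    best_offset, best_score = 0, -1
    for offset in range(max(1, len(query) - len(ref) + 1)):
        score = sum(q == r for q, r in zip(query[offset:], ref))
        if score > best_score:
            best_offset, best_score = offset, score
    return best_offset


def classify_cds(genome, start, end, ref_protein, aligned_ref_len,
                 min_coverage=0.85, flank_codons=100):
    """Return 'intact', 'pseudogene', or 'partial' for a forward-strand CDS.

    aligned_ref_len: number of reference residues covered by the original hit.
    min_coverage and flank_codons are illustrative values, not DFAST defaults.
    """
    # Step 1: a hit covering enough of the reference is treated as intact.
    if aligned_ref_len / len(ref_protein) >= min_coverage:
        return "intact"
    # Step 2: extend both ends and compare again with the same reference.
    ext_start, ext_end = extend_in_frame(len(genome), start, end, flank_codons)
    protein = translate(genome[ext_start:ext_end])
    offset = best_ungapped_offset(protein, ref_protein)
    aligned = protein[offset:offset + len(ref_protein)]
    # Step 3: a stop codon inside the re-aligned region marks a pseudogene.
    return "pseudogene" if "*" in aligned else "partial"


# Toy example: the 4th codon (CAA) is mutated to a stop (TAA),
# so the predicted CDS ends early at position 12.
genome = "ATGAAACCCTAAGGGTTTTGA"
print(classify_cds(genome, 0, 12, "MKPQGF", aligned_ref_len=3))
```

In this toy case the fragment covers only half of the reference, so it is extended and compared again. The stop codon then falls inside the region matching the reference, and the fragment is reported as a pseudogene rather than as a separate gene.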