Description in the title. An ideal gene-model scoring system would be simple and easily understandable (akin to the 1-5 scale of the UniProt "Annotation Score"). Once a scoring system exists, it would be a good way to flag gene models that need follow-up for manual fixing.
Could also annotate whether a gene model has been manually reviewed or not.
Some things we can assess (entries represent True/False characteristics for a gene model):
1) 3' UTR present
1a) 3' UTR correct (assessable by presence/alignment of a poly-A tail in PASA/GMAP/BLAT alignments of de novo assembled transcripts)
2) 5' UTR present
2a) 5' UTR correct (hard to assess - Iso-Seq of cDNA produced with the TeloPrime kit is probably the best evidence)
3) Correct CDS C-terminus (maybe best manually assessed, but if not, trust de novo transcript DCGM the most)
4) Correct CDS N-terminus (in terms of real empirical data, bottom-up proteomics can find this, but otherwise, assess Orthogroup characteristics, e.g. like OMGene?)
5) Correct number of exons (assessable from Orthogroup characteristics)
6) Manually reviewed (True/False)
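
As a starting point for discussion, the True/False characteristics above could be collapsed into a 1-5 score roughly as follows. This is only a sketch under assumptions: the field names, the equal weighting of every check, the linear mapping onto 1-5, and the follow-up threshold are all hypothetical and not something we have agreed on. Manual review is kept as a separate flag rather than counted toward the score.

```python
from dataclasses import dataclass, fields


@dataclass
class GeneModelChecks:
    """True/False characteristics of one gene model (hypothetical field names)."""
    utr3_present: bool = False
    utr3_correct: bool = False
    utr5_present: bool = False
    utr5_correct: bool = False
    cds_c_terminus_correct: bool = False
    cds_n_terminus_correct: bool = False
    exon_count_correct: bool = False
    manually_reviewed: bool = False  # tracked separately, not scored


def annotation_score(checks: GeneModelChecks) -> int:
    """Map the number of passed checks onto a 1-5 scale.

    Assumption: every check carries equal weight; real weights would
    need to be decided by the annotators.
    """
    scored = [
        getattr(checks, f.name)
        for f in fields(checks)
        if f.name != "manually_reviewed"
    ]
    passed = sum(scored)
    # Integer arithmetic: 0 passed gives 1, all passed gives 5.
    return 1 + (4 * passed) // len(scored)


def needs_follow_up(checks: GeneModelChecks, threshold: int = 3) -> bool:
    """Flag unreviewed models scoring at or below an (assumed) threshold."""
    return not checks.manually_reviewed and annotation_score(checks) <= threshold


if __name__ == "__main__":
    model = GeneModelChecks(utr3_present=True, utr5_present=True,
                            exon_count_correct=True)
    print(annotation_score(model), needs_follow_up(model))
```

A weighted version (e.g. trusting CDS termini more than UTR presence) would only require replacing the `sum` with a weighted sum; the equal-weight form is just the simplest thing to reason about.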