WormBase / genedesc_generator

Automated gene descriptions generator for model organism databases

Writing descriptions for information-poor C. elegans genes #17

Closed rankishore closed 5 years ago

rankishore commented 6 years ago

1. For those C. elegans genes that have orthology to human and no GO data: write a sentence using the human ortholog's GO Molecular Function (MF) information.
- Pick the best ortholog using the Alliance stringency and best filters.

Templates:
human < gene symbol > exhibits < MF term >
human < gene symbol > is predicted to have < MF term >
human < gene symbol > is a/an < MF term >
human < gene symbol > is predicted to be a/an < MF term >

(Follow all rules for MF terms in general)
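The four templates above could be rendered along these lines. This is only a minimal sketch: the function names, the `predicted` and `noun_like` flags, and the vowel-based a/an heuristic are assumptions made for illustration, and the issue does not say how the generator decides which template applies to a given term.

```python
VOWELS = {"a", "e", "i", "o", "u"}


def indefinite_article(term: str) -> str:
    """Return 'an' before a vowel letter, otherwise 'a' (rough heuristic)."""
    return "an" if term[:1].lower() in VOWELS else "a"


def ortholog_mf_sentence(symbol: str, mf_term: str,
                         predicted: bool = False,
                         noun_like: bool = False) -> str:
    """Fill one of the four human-ortholog MF templates.

    predicted: hypothetical flag, True when the annotation is a prediction.
    noun_like: hypothetical flag, True when the term reads as a noun
               ("is a/an ...") rather than as an activity ("exhibits ...").
    """
    if noun_like:
        verb = "is predicted to be" if predicted else "is"
        return f"human {symbol} {verb} {indefinite_article(mf_term)} {mf_term}"
    verb = "is predicted to have" if predicted else "exhibits"
    return f"human {symbol} {verb} {mf_term}"


# GENE1 is a placeholder symbol, not a real annotation.
print(ortholog_mf_sentence("GENE1", "protein kinase activity", predicted=True))
```

The general rules for MF terms mentioned above would still need to be applied to `mf_term` before it is passed in.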

2. For those genes with no GO data (neither for the focus gene nor GO MF for the human ortholog); these genes may or may not have orthology and/or tissue expression data:

Templates:
If only one domain: Predicted to encode a protein with the following domain: < protein domain1 >;

If more than one domain: Predicted to encode a protein with the following domains: < protein domain1 > and < protein domain2 >;

Store the InterPro IDs so that they can be included in the reference.
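A minimal sketch of the domain templates, returning the InterPro IDs alongside the sentence so they can be stored for the reference. The function name and the input shape (a list of ID/name pairs) are assumptions. The issue only gives templates for one and two domains, so the comma-separated form for three or more is also an assumption.

```python
from typing import List, Optional, Tuple


def domain_sentence(
    domains: List[Tuple[str, str]],
) -> Tuple[Optional[str], List[str]]:
    """Build the protein-domain sentence and collect the InterPro IDs.

    domains: hypothetical list of (interpro_id, domain_name) pairs.
    Returns (sentence, interpro_ids); sentence is None if there are no domains.
    """
    if not domains:
        return None, []
    ids = [interpro_id for interpro_id, _ in domains]
    names = [name for _, name in domains]
    if len(names) == 1:
        noun, listed = "domain", names[0]
    else:
        # Assumption: three or more domains are comma-separated before "and".
        noun = "domains"
        listed = ", ".join(names[:-1]) + " and " + names[-1]
    sentence = f"Predicted to encode a protein with the following {noun}: {listed};"
    return sentence, ids


# Placeholder IDs and names, not real InterPro entries.
sentence, ids = domain_sentence([("IPR_X1", "domain A"), ("IPR_X2", "domain B")])
print(sentence)
print(ids)
```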

3. For those genes with no orthology, GO, or tissue expression data for the focus gene:
- Add expression cluster data, including all three types (anatomy, gene regulation, and chemical regulation) where they exist for the gene.
- Add protein domain data.
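One possible reading of how the three cases fit together is sketched below. The `GeneData` fields, the section labels, and the order of the checks are all assumptions. In particular, the issue does not state whether protein domains should also be added when a human ortholog MF sentence is available, and this sketch assumes they are not.

```python
from dataclasses import dataclass, field
from typing import Dict, List

# The three expression cluster types named in case 3.
CLUSTER_TYPES = ("anatomy", "gene regulation", "chemical regulation")


@dataclass
class GeneData:
    """Hypothetical summary of the data available for one focus gene."""
    has_go: bool = False
    has_human_ortholog: bool = False
    ortholog_has_mf: bool = False
    has_tissue_expression: bool = False
    has_protein_domains: bool = False
    expression_clusters: Dict[str, List[str]] = field(default_factory=dict)


def fallback_sections(gene: GeneData) -> List[str]:
    """Return the fallback sentence types to write for an information-poor gene."""
    if gene.has_go:
        return []  # not information-poor; the regular GO sentences apply
    if gene.has_human_ortholog and gene.ortholog_has_mf:
        return ["human ortholog MF"]  # case 1
    sections: List[str] = []
    if not gene.has_human_ortholog and not gene.has_tissue_expression:
        # Case 3: add every expression cluster type that exists for the gene.
        for cluster_type in CLUSTER_TYPES:
            if gene.expression_clusters.get(cluster_type):
                sections.append(f"expression cluster: {cluster_type}")
    if gene.has_protein_domains:
        sections.append("protein domains")  # cases 2 and 3
    return sections


gene = GeneData(has_protein_domains=True,
                expression_clusters={"anatomy": ["cluster_1"]})
print(fallback_sections(gene))
```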