geneontology / gopreprocess

convert abstract and title text of GO_REF:0000096 to describe new preprocessing pipeline here instead of MGI curation process #5

Closed sierra-moxon closed 5 months ago

sierra-moxon commented 1 year ago
ukemi commented 1 year ago

Note that GO_REF:0000096 corresponds to both J:164563 (human ISO load) and J:155856 (rat ISO load).

The human abstract at MGI is currently: Experimental data from the human gene association files are downloaded from the GO Consortium website (http://www.geneontology.org/). Human annotations with experimental evidence codes (IDA, IMP IPI, IGI, and EXP) are automatically associated with the orthologous mouse genes with an evidence code of ISO (Inferred from Sequence Orthology; for more information about GO evidence codes, their meaning and how they are used, go to the GO Consortium evidence code documentation at http://www.geneontology.org/GO.evidence.shtml). The original reference PubMed ID that provides the source data for the human annotation is maintained in the MGI system. The laboratory mouse is used as models for the study of human diseases. Both the Mouse Genome Database (MGD) and HGNC have extensive procedures in place, overseen by expert curation, to establish orthology relationships between their genes. Both MGI and GO Annotation at EBI (UniProtKB-GOA) engage in curated Gene Ontology (GO) annotation using the experimental literature available for each organism and adhering rigorously to the guidelines set forth by the GOC.

The rat abstract at MGI is: Both the Mouse Genome Database (MGD) and the Rat Genome Database (RGD) have extensive procedures in place, overseen by expert curation, to establish orthology relationships between their genes. Furthermore, each database engages in curated Gene Ontology (GO) annotation using the experimental literature available for each organism, and adhering rigorously to the guidelines set forth by the GOC. Experimental data from the rat gene association files are downloaded from the GO Consortium website (http://www.geneontology.org/). Rat annotations with experimental evidence codes (IDA, IMP IPI, IGI or IEP) are automatically associated with the orthologous mouse genes with an evidence code of ISO (Inferred from Sequence Orthology; for more information about GO evidence codes, their meaning and how they are used, go to the GO Consortium evidence code documentation at http://www.geneontology.org/GO.evidence.shtml). The original reference PubMed ID that provides the source data for the rat annotation is maintained in the MGI system.

GO_REF:0000096 is: Automated transfer of experimentally-verified manual GO annotation data to close orthologs. Mouse Genome Database (MGD), The HUGO Gene Nomenclature Committee (HGNC), and Rat Genome Database (RGD) have extensive procedures in place, overseen by expert curation, to establish orthology relationships between their genes. The Experimentally based annotations annotated by each group (IDA, IMP IPI, IGI, and EXP) are used to provide annotations to the respective mouse and rat orthologs, and given the ISO evidence code and an entry in the inferred_from field to indicate the orthologous entity.

Note that the GO_REF covers both rat and human. I suggest that we retain the GO_REF and modify its text to reflect the new methodology using The Alliance Orthology calls and sequence assignments made in MGI.

ukemi commented 1 year ago

Draft new GO_REF:0000096

Automated transfer of experimentally-verified manual GO annotation data to close orthologs. Mouse Genome Database (MGD) and The Alliance of Genome Resources have extensive procedures in place to establish orthology relationships between genes. The experimentally-based annotations provided by model-organism groups (IDA, IMP, IPI, IGI, and EXP) are used to provide annotations to the respective orthologs, and are given the ISO evidence code and an entry in the inferred_from field to indicate the orthologous entity.
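
For readers who want the transfer rule in the draft spelled out, here is a minimal Python sketch. It is not the gopreprocess implementation: the `Annotation` class, the `transfer_to_orthologs` function, and the gene IDs in the usage note are hypothetical names made up for illustration, and the orthology map is assumed to be a plain dict built from whatever orthology calls are in use.

```python
from dataclasses import dataclass

# Experimental evidence codes named in the draft abstract.
EXPERIMENTAL_CODES = {"IDA", "IMP", "IPI", "IGI", "EXP"}


@dataclass(frozen=True)
class Annotation:
    """Hypothetical, simplified annotation record (not a full GAF/GPAD row)."""
    gene_id: str
    go_term: str
    evidence: str
    reference: str
    with_from: str = ""


def transfer_to_orthologs(annotations, orthologs, go_ref="GO_REF:0000096"):
    """Build ISO annotations for orthologs of experimentally annotated genes.

    `orthologs` is assumed to map a source gene ID to a list of ortholog IDs.
    Annotations without an experimental evidence code are skipped.
    """
    transferred = []
    for ann in annotations:
        if ann.evidence not in EXPERIMENTAL_CODES:
            continue
        for target in orthologs.get(ann.gene_id, []):
            transferred.append(
                Annotation(
                    gene_id=target,
                    go_term=ann.go_term,
                    evidence="ISO",
                    reference=go_ref,
                    # The orthologous source gene goes in the with/from field.
                    with_from=ann.gene_id,
                )
            )
    return transferred
```

Called with one made-up IDA annotation on `RGD:1` and an orthology map `{"RGD:1": ["MGI:1"]}`, the function returns a single ISO annotation on `MGI:1` whose `with_from` is `RGD:1`. An IEA annotation on the same gene would not be transferred.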

kltm commented 6 months ago

@LiNiMGI Is there a current state on this?

LiNiMGI commented 6 months ago

@sierra-moxon Would you please double check the abstracts for the GO_REFs that are being used to make sure they reflect how the pipeline is working? thanks, Li

sierra-moxon commented 6 months ago

@pkalita-lbl - what do I do here? :)

kltm commented 6 months ago

@pkalita-lbl For context, we want to make some changes to the GO_REFs for this project, but don't want to disrupt what you have going on. Would it be better to just stick to the old SOP, or do the old and new at the same time to save you the trouble of syncing later?

pkalita-lbl commented 6 months ago

With https://github.com/geneontology/go-site/pull/2253 being merged in, I'd appreciate it if, when you need to update an abstract, you update it in both the metadata/gorefs/goref-NNNNNNN.md and metadata/gorefs.yaml files. If you only update the Markdown file it’s not the end of the world, I’ll get them sync’d up later. We’re just in a bit of a multiple-sources-of-truth situation until this project wraps up.
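
While both files exist, a quick consistency check can flag GO_REFs whose abstracts have drifted apart. The sketch below is only an illustration under stated assumptions: it assumes the abstracts have already been read out of the Markdown files and the YAML file into two dicts keyed by GO_REF ID (the parsing itself is not shown, since it depends on the file layout), and `find_abstract_mismatches` is a hypothetical helper, not something that exists in go-site.

```python
def normalize(text):
    """Collapse runs of whitespace so line wrapping alone does not count as a difference."""
    return " ".join(text.split())


def find_abstract_mismatches(markdown_abstracts, yaml_abstracts):
    """Return the sorted GO_REF IDs whose abstracts differ between the two sources.

    Both arguments are assumed to be dicts mapping a GO_REF ID to its abstract.
    An ID missing from one source is compared against an empty abstract.
    """
    all_ids = set(markdown_abstracts) | set(yaml_abstracts)
    return sorted(
        ref_id
        for ref_id in all_ids
        if normalize(markdown_abstracts.get(ref_id, ""))
        != normalize(yaml_abstracts.get(ref_id, ""))
    )
```

An empty result means the two sources agree, up to whitespace, for every ID seen in either one.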

pgaudet commented 5 months ago

Updated the go-ref.md and new yaml file to:

GO_REF:0000096:
• title: Automated transfer of experimentally-verified manual GO annotation data to mouse-rat orthologs.
• author: The Gene Ontology Consortium
• external_accession: J:155856 and RGD:1624291
• id: GO_REF:0000096
• year: 2023
• layout: goref
• abstract: The Alliance of Genome Resources (https://www.alliancegenome.org/) has procedures in place to establish orthology relationships between genes. The experimentally-based annotations (IDA, IMP IPI, IGI, and EXP) annotated by Rat Genome Database (RGD) and The Mouse Genome Database (MGD) are used to provide annotations to the respective mouse and rat orthologs, and given the ISO evidence code and an entry in the "With (or) From" field to indicate the orthologous entity.

GO_REF:0000119:
• title: Automated transfer of experimentally-verified manual GO annotation data to mouse-human orthologs.
• author: The Gene Ontology Consortium
• external_accession: J:164563
• id: GO_REF:0000119
• year: 2023
• layout: goref
• abstract: The Alliance of Genome Resources (https://www.alliancegenome.org/) has procedures in place to establish orthology relationships between genes. The experimentally-based annotations (IDA, IMP IPI, IGI, and EXP) for human genes generated by the GOA pipeline are used to provide annotations to the respective mouse orthologs, and given the ISO evidence code and an entry in the "With (or) From" field to indicate the orthologous entity.
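
With the split into two references, a pipeline has to pick the right GO_REF for each transfer. A minimal sketch of that choice, read directly from the two abstracts above, might look like this. The `go_ref_for` function and the species labels are hypothetical, and this is not a claim about how gopreprocess actually selects the reference.

```python
# (source species, target species) -> reference, as described in the two abstracts.
GO_REF_BY_DIRECTION = {
    ("rat", "mouse"): "GO_REF:0000096",
    ("mouse", "rat"): "GO_REF:0000096",
    ("human", "mouse"): "GO_REF:0000119",
}


def go_ref_for(source_species, target_species):
    """Return the GO_REF for an ISO transfer, or raise if the pair is not covered."""
    try:
        return GO_REF_BY_DIRECTION[(source_species, target_species)]
    except KeyError:
        raise ValueError(
            f"no GO_REF listed for transfer {source_species} -> {target_species}"
        ) from None
```

Raising on an unlisted pair, rather than falling back to a default, keeps an unexpected species combination from silently receiving the wrong reference.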

pgaudet commented 5 months ago

@pkalita-lbl After you review the PR (and merge) for the yaml file, this ticket can be closed.

pkalita-lbl commented 5 months ago

I reviewed and merged https://github.com/geneontology/go-site/pull/2259