Closed dustine32 closed 1 year ago
The SGD pathway BioPAX file basename (and eventual model_id) is the MetaCyc ID that can be looked up in the GO ontology to find BP term(s). Example for SO4ASSIM-PWY.owl:
<!-- http://purl.obolibrary.org/obo/GO_0019379 -->
<owl:Class rdf:about="http://purl.obolibrary.org/obo/GO_0019379">
<rdfs:subClassOf rdf:resource="http://purl.obolibrary.org/obo/GO_0000103"/>
<rdfs:subClassOf rdf:resource="http://purl.obolibrary.org/obo/GO_0019419"/>
<obo1:IAO_0000115 rdf:datatype="http://www.w3.org/2001/XMLSchema#string">The pathway by which inorganic sulfate is processed and incorporated into sulfated compounds, where the phosphoadenylyl sulfate reduction step is catalyzed by the enzyme phosphoadenylyl-sulfate reductase (thioredoxin) (EC:1.8.4.8).</obo1:IAO_0000115>
<oboInOwl:hasDbXref rdf:datatype="http://www.w3.org/2001/XMLSchema#string">MetaCyc:SO4ASSIM-PWY</oboInOwl:hasDbXref>
So, GO:0019379 "sulfate assimilation, phosphoadenylyl sulfate reduction by phosphoadenylyl-sulfate reductase (thioredoxin)" should be the pathway BP.
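A minimal sketch of that lookup, assuming the GO release is available as RDF/XML and that the MetaCyc ID is simply the file basename without its extension. The function names `metacyc_to_go` and `pathway_id_from_filename` and the trimmed-down `SAMPLE_OWL` document are illustrative only, not taken from the actual conversion code:

```python
import xml.etree.ElementTree as ET
from collections import defaultdict
from pathlib import Path

# Namespaces used by the GO OWL (RDF/XML) serialization.
NS = {
    "owl": "http://www.w3.org/2002/07/owl#",
    "rdf": "http://www.w3.org/1999/02/22-rdf-syntax-ns#",
    "oboInOwl": "http://www.geneontology.org/formats/oboInOwl#",
}
RDF_ABOUT = "{http://www.w3.org/1999/02/22-rdf-syntax-ns#}about"
OBO_PREFIX = "http://purl.obolibrary.org/obo/"

# Trimmed-down stand-in for the GO OWL file (assumption: real file has the same shape).
SAMPLE_OWL = """<rdf:RDF xmlns:owl="http://www.w3.org/2002/07/owl#"
         xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
         xmlns:oboInOwl="http://www.geneontology.org/formats/oboInOwl#">
  <owl:Class rdf:about="http://purl.obolibrary.org/obo/GO_0019379">
    <oboInOwl:hasDbXref rdf:datatype="http://www.w3.org/2001/XMLSchema#string">MetaCyc:SO4ASSIM-PWY</oboInOwl:hasDbXref>
  </owl:Class>
</rdf:RDF>
"""


def metacyc_to_go(owl_text):
    """Map each MetaCyc ID found in a hasDbXref to the set of GO IDs carrying it."""
    root = ET.fromstring(owl_text)
    mapping = defaultdict(set)
    for cls in root.findall("owl:Class", NS):
        about = cls.get(RDF_ABOUT, "")
        if not about.startswith(OBO_PREFIX + "GO_"):
            continue  # skip anonymous classes and non-GO terms
        go_id = about[len(OBO_PREFIX):].replace("_", ":", 1)
        for xref in cls.findall("oboInOwl:hasDbXref", NS):
            text = (xref.text or "").strip()
            if text.startswith("MetaCyc:"):
                mapping[text[len("MetaCyc:"):]].add(go_id)
    return mapping


def pathway_id_from_filename(path):
    """Assumption: the BioPAX file basename (minus extension) is the MetaCyc ID."""
    return Path(path).stem


mapping = metacyc_to_go(SAMPLE_OWL)
pathway_id = pathway_id_from_filename("SO4ASSIM-PWY.owl")
print(pathway_id, sorted(mapping.get(pathway_id, [])))
```

Collecting GO IDs into a set per MetaCyc ID keeps the one-to-many case visible instead of silently overwriting an earlier match.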
Will try retrieving this for all pathways and report back which ones either have no mapping or more than one.
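One possible shape for that report, sketched under the assumption that a MetaCyc-to-GO mapping has already been built as a dict of sets. The `classify_pathways` helper is hypothetical, and `EXAMPLE-PWY` with its two GO IDs is a made-up placeholder used only to exercise the "more than one" branch:

```python
def classify_pathways(pathway_ids, mapping):
    """Split pathway IDs into unmapped, uniquely mapped, and ambiguously mapped."""
    report = {"unmapped": [], "unique": {}, "ambiguous": {}}
    for pathway_id in pathway_ids:
        go_ids = sorted(mapping.get(pathway_id, ()))
        if not go_ids:
            report["unmapped"].append(pathway_id)
        elif len(go_ids) == 1:
            report["unique"][pathway_id] = go_ids[0]
        else:
            report["ambiguous"][pathway_id] = go_ids
    return report


# Illustrative input only; the GO:99999xx IDs are placeholders, not real terms.
example_mapping = {
    "SO4ASSIM-PWY": {"GO:0019379"},
    "EXAMPLE-PWY": {"GO:9999998", "GO:9999999"},
}
example_pathways = ["SO4ASSIM-PWY", "YEAST-GALACT-METAB-PWY", "EXAMPLE-PWY"]

report = classify_pathways(example_pathways, example_mapping)
for bucket, content in report.items():
    print(bucket, content)
```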
Results!:
oh ok... it seems kinda hacky to get it from the filename but I guess if they don't actually put it in the biopax that's what we've got to do!
Looking at the unmapped
Some seem to be specific to yeastpathways and not in metacyc
YEAST-GALACT-METAB-PWY
=>
https://pathway.yeastgenome.org/YEAST/NEW-IMAGE?type=PATHWAY&object=YEAST-GALACT-METAB-PWY
at the top we have "galactose degradation"
if you can give me all these labels I can map them to GO
e.g. this one to:
id: GO:0019388
name: galactose catabolic process
namespace: biological_process
def: "The chemical reactions and pathways resulting in the breakdown of galactose, the aldohexose galacto-hexose." [ISBN:0198506732]
synonym: "galactose breakdown" EXACT []
synonym: "galactose catabolism" EXACT []
synonym: "galactose degradation" EXACT []
xref: MetaCyc:GALDEG-PWY
intersection_of: GO:0009056 ! catabolic process
intersection_of: has_primary_input CHEBI:28260 ! galactose
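For checking which MetaCyc xrefs a term like this already carries, here is a rough sketch that reads one OBO stanza with one tag per line. The `parse_obo_stanza` helper is hypothetical and deliberately simplistic (it ignores escapes and trailing modifiers); a real OBO parser would be more robust:

```python
SAMPLE_STANZA = """\
id: GO:0019388
name: galactose catabolic process
namespace: biological_process
synonym: "galactose degradation" EXACT []
xref: MetaCyc:GALDEG-PWY
intersection_of: GO:0009056 ! catabolic process
"""


def parse_obo_stanza(text):
    """Collect tag values from a single OBO stanza into a dict of lists."""
    tags = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("["):
            continue  # skip blank lines and stanza headers such as [Term]
        key, _, value = line.partition(": ")
        # Drop the trailing "! comment" that OBO allows after a value.
        tags.setdefault(key, []).append(value.split(" ! ")[0].strip())
    return tags


term = parse_obo_stanza(SAMPLE_STANZA)
metacyc_xrefs = [
    x[len("MetaCyc:"):] for x in term.get("xref", []) if x.startswith("MetaCyc:")
]
print(term["id"][0], metacyc_xrefs)
```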
Now, weirdly, that GO term is already mapped to a pathway that MetaCyc restricts to proteobacteria: https://metacyc.org/META/new-image?type=PATHWAY&object=GALDEG-PWY
I think that GO term should be mapped to the more generic:
https://metacyc.org/META/NEW-IMAGE?type=ECOCYC-CLASS&object=GALACTOSE-DEGRADATION
I think the MetaCyc xrefs in GO are a bit suspect, so I'd like to check your positive list too.
@cmungall Attached is the full list TSV of pathways with columns:
yeast_pathway_id_labels_gos.txt So this is the combined positive (if GO in col3) and negative list.
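A small sketch of splitting such a file into the positive and negative lists. The column layout (pathway ID, label, GO term(s) in column 3) is an assumption inferred from the file name and the "GO in col3" remark, and the sample rows are stand-ins, not the attachment's actual content:

```python
import csv
import io

# Stand-in rows; the first label is a placeholder, not the real pathway label.
SAMPLE_TSV = (
    "SO4ASSIM-PWY\t(label placeholder)\tGO:0019379\n"
    "YEAST-GALACT-METAB-PWY\tgalactose degradation\t\n"
)


def split_positive_negative(handle):
    """Return (positive, negative) row lists based on whether column 3 holds a GO ID."""
    positive, negative = [], []
    for row in csv.reader(handle, delimiter="\t"):
        if not row:
            continue
        row = row + [""] * (3 - len(row))  # pad short rows so col3 always exists
        pathway_id, label, gos = row[0], row[1], row[2].strip()
        (positive if gos else negative).append((pathway_id, label, gos))
    return positive, negative


positive, negative = split_positive_negative(io.StringIO(SAMPLE_TSV))
print("mapped:", [r[0] for r in positive])
print("unmapped:", [r[0] for r in negative])
```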
Tagging @thomaspd
Closing this for now since I just merged the code to pull these mappings from the upstream GO. Mappings can be added/improved in the GO itself.
To emit the correct GO BP term for each pathway, we should grab the MetaCyc ID from the BioPAX (might be something like BioCyc:...), then xref this to a GO term in the GO ontology.
Tagging @cmungall