geneontology / pathways2GO

Code for converting between BioPAX pathways and Gene Ontology Causal Activity Models (GO-CAM)

two REACTO classes are asserted to be in both E. coli and human #122

Closed: balhoff closed this issue 3 years ago

balhoff commented 3 years ago

Both labeled Flagellin:

Reactome species says E. coli:

This causes downstream build failures (I'm attempting to incorporate REACTO into GO graphstore).
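For anyone wanting to reproduce the check, here is a minimal sketch of how classes asserted in more than one taxon could be found, assuming the 'in taxon' assertions have already been extracted as (class, taxon) pairs. The function name and the sample pairs are hypothetical and only for illustration; they are not taken from the REACTO build.

```python
from collections import defaultdict


def classes_with_multiple_taxa(assertions):
    """Return {class_id: sorted taxa} for every class asserted in more than one taxon."""
    taxa_by_class = defaultdict(set)
    for class_id, taxon in assertions:
        taxa_by_class[class_id].add(taxon)
    return {
        class_id: sorted(taxa)
        for class_id, taxa in taxa_by_class.items()
        if len(taxa) > 1
    }


# Hypothetical input: illustrative (class, taxon) pairs, not real build output.
assertions = [
    ("R-ECO-1181207", "Escherichia coli"),
    ("R-ECO-1181207", "Homo sapiens"),
    ("R-HSA-975879", "Homo sapiens"),
]

conflicts = classes_with_multiple_taxa(assertions)
for class_id, taxa in conflicts.items():
    print(class_id, "is asserted in:", ", ".join(taxa))
```

A class that shows up in `conflicts` is a candidate for the kind of failure described here whenever the two taxa are disjoint in the downstream ontology.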

deustp01 commented 3 years ago

This may not be simply a bug. Take reaction R-HSA-975879 as an example. The biology we're trying to capture is the triggering of a normal human innate immune response as a result of the interaction between E. coli (bacterial) flagellin protein - R-ECO-1181207 - and human gene products to form the TLR5 and TLR10 signaling complexes. When this reaction proceeds normally, it prevents a bacterial infection from proceeding and thus, for us, counts as normal biology even though it involves a gene product expressed by a human pathogen.

Alternative reactions in which the innate immune signaling fails would count as disease pathways for Reactome and are removed for that reason from the body of material processed via REACTO into GO-CAM models.

The reaction itself, in both normal and disease versions, counts for Reactome as a human reaction because it is occurring in a human organism. We annotate the involvement of gene products from other species by annotating those species in the "relatedSpecies" attribute of the reaction instance. I don't know whether this attribute propagates into the BioPAX export from Reactome - if it does, it would at least be a flag to indicate an instance that can't be handled in the standard way.

ukemi commented 3 years ago

@deustp01 are there any other examples like this in Reactome? I can find bacterial carbohydrates and lipids, but can't quickly find a non-human gene product. If there are other cases, we should see why those don't fail.

ukemi commented 3 years ago

R-HSA-6807581 binds fungal proteins, for example R-CAL-6807575.

balhoff commented 3 years ago

Thanks for commenting @deustp01 and @ukemi! I'm not yet that familiar with Ben's code that generates REACTO. Perhaps it should avoid the 'in taxon' some human for these terms. We will have to see if we can model correctly to have the reaction occur in human but that protein not be in human.

deustp01 commented 3 years ago

> are there any other examples like this in Reactome?

I went to the pathway "Toll-like receptor cascades" (R-HSA-168898) in the pathway browser, opened the "molecules" detail tab, and scanned the list of proteins for ones whose names (= gene symbols) are not all caps. That yields two versions of E. coli flagellin fliC (R-ECO-1181207 and R-ECO-167919), one version of Chlamydia mip (R-CTR-9628834), and one version of Neisseria major outer membrane protein porB (R-NME-180815). Viral and bacterial RNAs and DNAs also trigger innate immune reactions, but we annotate these as chemicals (hence CHEBI IDs), not gene products.
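The manual scan above could be approximated in code. The sketch below is only a heuristic and rests on two assumptions: that the three-letter segment of a stable identifier such as R-ECO-1181207 is a species code with HSA meaning human, and that human gene symbols are written in all caps. The function names are made up for illustration and are not part of pathways2GO or any Reactome API.

```python
import re

# Assumed identifier shape: "R-", a three-letter species code, "-", digits.
STABLE_ID = re.compile(r"^R-([A-Z]{3})-(\d+)$")


def species_code(stable_id):
    """Return the three-letter species segment of a stable identifier."""
    match = STABLE_ID.match(stable_id)
    if match is None:
        raise ValueError(f"unrecognized stable identifier: {stable_id!r}")
    return match.group(1)


def non_human_ids(stable_ids):
    """Keep identifiers whose species segment is not HSA (assumed to mean human)."""
    return [sid for sid in stable_ids if species_code(sid) != "HSA"]


def symbol_not_all_caps(symbol):
    """Mimic the manual scan: flag names whose letters are not all upper case."""
    return not symbol.isupper()


ids = ["R-HSA-168898", "R-ECO-1181207", "R-ECO-167919",
       "R-CTR-9628834", "R-NME-180815"]
flagged_ids = non_human_ids(ids)

symbols = ["TLR5", "TLR10", "fliC", "mip", "porB"]
flagged_symbols = [s for s in symbols if symbol_not_all_caps(s)]

print(flagged_ids)
print(flagged_symbols)
```

Either list could serve as a flag for entities that should not receive a human taxon assertion, though the identifier prefix is likely the more reliable signal of the two.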

ukemi commented 3 years ago

we can model correctly to have the reaction occur in human but that protein need not be a human protein.