balhoff closed this 3 years ago
This may not be simply a bug. Take reaction R-HSA-975879 as an example. The biology we're trying to capture is the triggering of a normal human innate immune response as a result of the interaction between E. coli (bacterial) flagellin protein - R-ECO-1181207 - and human gene products to form the TLR5 and TLR10 signaling complexes. When this reaction proceeds normally, it prevents a bacterial infection from proceeding and thus, for us, counts as normal biology even though it involves a gene product expressed by a human pathogen.
Alternative reactions in which the innate immune signaling fails would count as disease pathways for Reactome and are removed for that reason from the body of material processed via REACTO into GO-CAM models.
The reaction itself, in both normal and disease versions, counts for Reactome as a human reaction because it is occurring in a human organism. We annotate the involvement of gene products from other species by annotating those species in the "relatedSpecies" attribute of the reaction instance. I don't know whether this attribute propagates into the BioPAX export from Reactome - if it does, it would at least be a flag to indicate an instance that can't be handled in the standard way.
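As a rough illustration of using "relatedSpecies" as a flag: the sketch below assumes each reaction has already been loaded into a plain Python dict with a "relatedSpecies" list. That record layout and the helper name are hypothetical. Whether the attribute actually survives into the BioPAX export is still the open question above.

```python
def needs_special_handling(reaction: dict) -> bool:
    """Return True when the record lists at least one related (non-host) species.

    Assumes a hypothetical dict layout with an optional "relatedSpecies" list.
    """
    return bool(reaction.get("relatedSpecies"))


# Hypothetical records for illustration only, not actual Reactome export output.
flagellin_reaction = {"stId": "R-HSA-975879", "relatedSpecies": ["Escherichia coli"]}
ordinary_reaction = {"stId": "R-HSA-0000000", "relatedSpecies": []}

flagged = [r["stId"] for r in (flagellin_reaction, ordinary_reaction)
           if needs_special_handling(r)]
```

Reactions flagged this way could be routed to separate handling instead of the standard conversion.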
@deustp01 are there any other examples like this in Reactome? I can find bacterial carbohydrates and lipids, but can't quickly find a non-human gene product. If there are other cases, we should see why those don't fail.
R-HSA-6807581 binds fungal proteins. R-CAL-6807575
Thanks for commenting @deustp01 and @ukemi! I'm not yet that familiar with Ben's code that generates REACTO. Perhaps it should avoid the 'in taxon' some human axiom for these terms. We will have to see whether we can model this correctly, so that the reaction occurs in human but the protein is not asserted to be in human.
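One way the "avoid 'in taxon' some human" idea could be sketched, assuming the three-letter code in a Reactome stable ID (HSA, ECO, CAL in the IDs quoted in this thread) reliably identifies the species of the entity. This is not how the REACTO generator currently works, and the function names are hypothetical.

```python
import re

# Pattern for IDs like R-ECO-1181207, with an optional ".N" version suffix.
STABLE_ID = re.compile(r"^R-([A-Z]{3})-(\d+)(?:\.\d+)?$")


def species_code(stable_id: str) -> str:
    """Extract the three-letter species code from a Reactome stable ID."""
    match = STABLE_ID.match(stable_id)
    if match is None:
        raise ValueError(f"not a recognised stable ID: {stable_id!r}")
    return match.group(1)


def should_assert_human_taxon(stable_id: str) -> bool:
    """Assumption: only entities with the HSA code get 'in taxon' some human."""
    return species_code(stable_id) == "HSA"
```

With this check the reaction R-HSA-975879 would keep its human taxon axiom, while the flagellin entity R-ECO-1181207 would not receive one.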
are there any other examples like this in Reactome?
I went to pathway "Toll-like receptor cascades" R-HSA-168898 in the pathway browser, opened the "molecules" detail tab, and scanned the list of proteins for ones whose names (= gene symbols) are not all caps. That yields two versions of E. coli flagellin fliC, R-ECO-1181207 and R-ECO-167919; one version of Chlamydia trachomatis mip, R-CTR-9628834; and one version of Neisseria major outer membrane protein porB, R-NME-180815. Viral and bacterial RNAs and DNAs trigger innate immune reactions, but we annotate these as chemicals (hence CHEBI IDs), not gene products.
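The manual scan above could be approximated in code. This is only a sketch of the "not all caps" heuristic, assuming the protein names have already been copied out of the molecules tab into a list. The helper name is hypothetical, and the heuristic will also catch any human symbol that happens to contain lowercase letters.

```python
def non_human_candidates(symbols):
    """Return the symbols that are not entirely upper case.

    Digits are ignored by str.isupper(), so "TLR5" counts as all caps.
    """
    return [s for s in symbols if not s.isupper()]


# Example names: the three bacterial symbols come from the comment above,
# the human ones are added for contrast.
names = ["TLR5", "fliC", "MYD88", "mip", "TLR10", "porB"]
candidates = non_human_candidates(names)
```

Each candidate would still need to be checked against its Reactome species before being treated as non-human.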
we can model correctly to have the reaction occur in human but that protein need not be a human protein.
Both labeled Flagellin:
Reactome species says E. coli:
This causes downstream build failures (I'm attempting to incorporate REACTO into GO graphstore).