RTXteam / RTX

Software repo for Team Expander Agent (Oregon State U., Institute for Systems Biology, and Penn State U.)
https://arax.ncats.io/
MIT License

chemicals, diseases returned as process/activity #1803

Closed cbizon closed 1 year ago

cbizon commented 2 years ago

This query looks for BiologicalProcessOrActivity linked to a chemical:

{
    "message": {
        "query_graph": {
            "edges": {
                "e00": {
                    "subject": "n01",
                    "object": "n00"
                }
            },
            "nodes": {
                "n00": {
                    "ids": [
                        "PUBCHEM.COMPOUND:42611257"
                    ]
                },
                "n01": {
                    "categories": [
                        "biolink:BiologicalProcessOrActivity"
                    ]
                }
            }
        }
    }
}

31 n01's come back, and most of them look good. But there are a few that I think are either diseases:

MONDO:0005105 melanoma
MONDO:0018874 acute myeloid leukemia
MONDO:0021042 glioma
MONDO:0000605 hypersensitivity reaction disease
MONDO:0005335 colorectal neoplasm

Or proteins:

UniProtKB:P08581 MET
UniProtKB:Q06124 PTPN11
UniProtKB:P09681 GIP
CHEMBL.COMPOUND:CHEMBL1201565 EPOETIN ALFA
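For anyone wanting to reproduce this kind of check, here is a rough sketch of how the mismatched n01 bindings could be flagged automatically. It assumes the response follows the usual TRAPI layout (`message.results[].node_bindings` and `message.knowledge_graph.nodes`). The function name `unexpected_nodes`, the `ACCEPTED` set (a hand-written, possibly incomplete list of categories under BiologicalProcessOrActivity), and the `sample` response are all made up for illustration; the sample is not real ARAX output, and the categories in it are assumed.

```python
from typing import Any

# Hypothetical, hand-written set of categories treated as acceptable for n01.
# A real check would derive the descendants from the Biolink Model instead.
ACCEPTED = {
    "biolink:BiologicalProcessOrActivity",
    "biolink:BiologicalProcess",
    "biolink:MolecularActivity",
    "biolink:Pathway",
}


def unexpected_nodes(
    response: dict[str, Any], qnode_key: str, accepted: set[str]
) -> list[tuple[str, str, list[str]]]:
    """Return (id, name, categories) for nodes bound to qnode_key that
    carry none of the accepted categories."""
    message = response.get("message", {})
    kg_nodes = message.get("knowledge_graph", {}).get("nodes", {})
    flagged: dict[str, tuple[str, str, list[str]]] = {}
    for result in message.get("results", []):
        for binding in result.get("node_bindings", {}).get(qnode_key, []):
            node_id = binding["id"]
            node = kg_nodes.get(node_id, {})
            categories = node.get("categories") or []
            # Flag the node when none of its categories is in the accepted set.
            if not accepted.intersection(categories):
                flagged[node_id] = (node_id, node.get("name", ""), categories)
    return sorted(flagged.values())


# Hand-made illustration only: the IDs come from this issue (plus one
# placeholder), the categories are assumed.
_chem = [{"id": "PUBCHEM.COMPOUND:42611257"}]
sample = {
    "message": {
        "results": [
            {"node_bindings": {"n00": _chem, "n01": [{"id": "EXAMPLE:process"}]}},
            {"node_bindings": {"n00": _chem, "n01": [{"id": "MONDO:0018874"}]}},
            {"node_bindings": {"n00": _chem, "n01": [{"id": "UniProtKB:P08581"}]}},
        ],
        "knowledge_graph": {
            "nodes": {
                "EXAMPLE:process": {
                    "name": "example process",
                    "categories": ["biolink:BiologicalProcess"],
                },
                "MONDO:0018874": {
                    "name": "acute myeloid leukemia",
                    "categories": ["biolink:Disease"],
                },
                "UniProtKB:P08581": {
                    "name": "MET",
                    "categories": ["biolink:Protein"],
                },
            }
        },
    }
}

for node_id, name, categories in unexpected_nodes(sample, "n01", ACCEPTED):
    print(node_id, name, categories)
```

A node that is both a disease and, say, a Pathway would pass this check, since it only looks for at least one accepted category. That is enough to spot the obvious mismatches but not every miscategorized node.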

amykglen commented 1 year ago

this was due to a problem with our synonymizer, which has been fixed in the latest version (#2003). we expect to deploy it to dev later this week

amykglen commented 1 year ago

this is all fixed on our dev/CI instances, with the exception of 'melanoma'; there is a KEGG node in RTX-KG2 named 'Melanoma' that has a category of Pathway, so it is still returned for this query.

I'm going to close this issue since there's an open RTX-KG2 issue on this incorrect category assignment to KEGG nodes: https://github.com/RTXteam/RTX-KG2/issues/210