RTXteam / RTX-KG2

Build system for the RTX-KG2 biomedical knowledge graph, part of the ARAX reasoning system (https://github.com/RTXTeam/RTX)
MIT License

Reactome proteins have Biolink category `MolecularEntity`; should be `Protein` #192

Open saramsey opened 2 years ago

saramsey commented 2 years ago

For example:

        {
            "id": "REACT:R-HSA-9720659",
            "iri": "https://identifiers.org/reactome:R-HSA-9720659",
            "name": "TP53 F109Sfs*14 [nucleoplasm]",
            "full_name": "TP53 F109Sfs*14 [nucleoplasm]",
            "category": "biolink:MolecularEntity",
            "category_label": "molecular_entity",
            "description": null,
            "synonym": [],
            "publications": [],
            "creation_date": "2021-03-18 14:40:01",
            "update_date": "2021-08-26 15:53:48",
            "deprecated": false,
            "replaced_by": null,
            "knowledge_source": "identifiers_org_registry:reactome",
            "has_biological_sequence": null
        },

This issue was reported by Chris Bizon via https://github.com/RTXteam/RTX/issues/1772

saramsey commented 2 years ago

Also, in passing, I noticed many of these errors in the logfile for reactome_mysql_to_kg_json.py; they look fixable:

The source Guide to Pharmacology is not in the name_prefix_dict

and

The source NCIthesaurus is not in the name_prefix_dict
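
One possible shape for that fix, as a minimal sketch only: it assumes `name_prefix_dict` maps a Reactome external source name to a CURIE prefix, which the thread does not confirm. The dictionary contents, the prefix strings `GTOPDB` and `NCIT`, and the helper `lookup_prefix` are all hypothetical placeholders, not the actual code in reactome_mysql_to_kg_json.py.

```python
import sys

# Hypothetical stand-in for the script's name_prefix_dict; the real keys and
# values in reactome_mysql_to_kg_json.py may differ.
name_prefix_dict = {
    "ChEBI": "CHEBI",
    "UniProt": "UniProtKB",
}

# Assumed additions for the two sources named in the log messages.
# The prefix strings are placeholders and would need checking against
# the prefixes KG2 actually uses.
name_prefix_dict.update({
    "Guide to Pharmacology": "GTOPDB",
    "NCIthesaurus": "NCIT",
})


def lookup_prefix(source_name, prefix_dict=name_prefix_dict):
    """Return the prefix for source_name, or None after logging the miss."""
    prefix = prefix_dict.get(source_name)
    if prefix is None:
        # Same wording as the log messages quoted in this issue.
        print(f"The source {source_name} is not in the name_prefix_dict",
              file=sys.stderr)
    return prefix
```

With the two entries added, `lookup_prefix("NCIthesaurus")` no longer falls through to the logged miss; any source still missing is reported and returns `None`.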

saramsey commented 2 years ago

Should be fixed by 1f5c517; @acevedol can you test in the next build, please?

saramsey commented 2 years ago

After commit 1f5c517, looks like this:

        {
            "id": "REACT:R-HSA-9720659",
            "iri": "https://identifiers.org/reactome:R-HSA-9720659",
            "name": "TP53 F109Sfs*14 [nucleoplasm]",
            "full_name": "TP53 F109Sfs*14 [nucleoplasm]",
            "category": "biolink:Protein",
            "category_label": "protein",
            "description": null,
            "synonym": [],
            "publications": [],
            "creation_date": "2021-03-18 14:40:01",
            "update_date": "2021-08-26 15:53:48",
            "deprecated": false,
            "replaced_by": null,
            "knowledge_source": "identifiers_org_registry:reactome",
            "has_biological_sequence": null
        },
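
For testing this in the next build, a check along the following lines might help. It is only a sketch: it assumes the nodes are available as a list of dicts with the fields shown above, and the function name `find_reactome_nodes_with_category` is made up for illustration. It flags Reactome nodes that still carry `biolink:MolecularEntity` for manual review; not every such node is necessarily a protein.

```python
REACTOME_SOURCE = "identifiers_org_registry:reactome"


def find_reactome_nodes_with_category(nodes, category="biolink:MolecularEntity"):
    """Return ids of Reactome-sourced nodes whose category equals `category`."""
    return [
        node["id"]
        for node in nodes
        if node.get("knowledge_source") == REACTOME_SOURCE
        and node.get("category") == category
    ]


# Trimmed copies of the node from this issue, before and after commit 1f5c517.
node_before = {
    "id": "REACT:R-HSA-9720659",
    "name": "TP53 F109Sfs*14 [nucleoplasm]",
    "category": "biolink:MolecularEntity",
    "knowledge_source": REACTOME_SOURCE,
}
node_after = dict(node_before, category="biolink:Protein")

flagged_before = find_reactome_nodes_with_category([node_before])
flagged_after = find_reactome_nodes_with_category([node_after])
print(flagged_before)  # ['REACT:R-HSA-9720659']
print(flagged_after)   # []
```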

saramsey commented 2 years ago

Pending clarification from Sierra Moxon, I am switching protein-like drugs (from Reactome and DrugBank) to biolink:ChemicalEntity rather than biolink:Protein, for consistency with our previous curation.