RTXteam / RTX-KG2

Build system for the RTX-KG2 biomedical knowledge graph, part of the ARAX reasoning system (https://github.com/RTXTeam/RTX)
MIT License

Some invalid curie ids were found in KG2.5.2? #8

Open chunyuma opened 3 years ago

chunyuma commented 3 years ago

I just found some curies in KG2.5.2 that might be invalid. Most of them are isolated, but some are clustered with other curies in KG2.5.2C and have links to other curies.

Here is one example:

This curie is "HGNC:PMID"

{
  "iri": "https://identifiers.org/hgnc:PMID",
  "category_label": "gene",
  "deprecated": "False",
  "name": "Pubmed ID",
  "description": "COMMENTS: Pubmed ID",
  "provided_by": "umls_source:HGNC",
  "id": "HGNC:PMID",
  "category": "biolink:Gene",
  "update_date": "2019"
}

In KG2.5.2C, it is clustered with "OBI:0001617" which is classified as 'biolink:Gene' but has name 'Pubmed ID'

{
  "iri": "https://identifiers.org/hgnc:PMID",
  "expanded_categories": [
    "biolink:BiologicalEntity",
    "biolink:Gene",
    "biolink:GenomicEntity",
    "biolink:MolecularEntity",
    "biolink:NamedThing",
    "biolink:Protein"
  ],
  "name": "Pubmed ID",
  "description": "COMMENTS: Pubmed ID",
  "equivalent_curies": [
    "HGNC:PMID",
    "OBI:0001617"
  ],
  "id": "HGNC:PMID",
  "category": "biolink:Gene",
  "all_names": [
    "PubMed ID",
    "Pubmed ID"
  ],
  "all_categories": [
    "biolink:Gene",
    "biolink:InformationContentEntity"
  ]
}
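A synonymized node whose all_categories list mixes unrelated categories, as in the record above, can be flagged mechanically. The following is only a minimal sketch: the helper name has_mixed_categories is hypothetical, and having more than one category is merely a hint that a cluster deserves a manual look, not proof of an error.

```python
def has_mixed_categories(node):
    """Return True when a node's all_categories lists more than one distinct category."""
    return len(set(node.get("all_categories", []))) > 1


# Fields abbreviated from the KG2.5.2C record shown above.
node = {
    "id": "HGNC:PMID",
    "equivalent_curies": ["HGNC:PMID", "OBI:0001617"],
    "all_categories": ["biolink:Gene", "biolink:InformationContentEntity"],
}

print(has_mixed_categories(node))
```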


In KG2.5.2, I found many curies like this one:

n.id n.name n.category
"HGNC:DATE_NAME_CHANGED" "Date name changed" "biolink:Gene"
"HGNC:OMIM_ID" "Omim id" "biolink:Gene"
"HGNC:PREV_SYMBOL" "Previous symbol" "biolink:Gene"
"HGNC:ENTREZGENE_ID" "EntrezGene ID" "biolink:Gene"
"HGNC:EZ" "Ez" "biolink:Gene"
"HGNC:LOCUS_GROUP" "Locus group" "biolink:Gene"
"HGNC:MAPPED_UCSC_ID" "Mapped ucsc id" "biolink:Gene"
"HGNC:GENESYMBOL" "Gene Symbol" "biolink:Gene"
n.id n.name n.category
"PR:PSI-MOD-label" "Unique short label curated by PSI-MOD" "biolink:Protein"
"PR:has_gene_template" "has_gene_template" "biolink:Protein"
"PR:lacks_part" "lacks_part" "biolink:Protein"
"PR:PRO-proteoform-std" "Synonyms for proteoforms based on use of UniProtKb accession, subsequence range, and positions and types of modifications or variations" "biolink:Protein"
"PR:PRO-proteoform-ftid" "Synonyms for proteoforms based on use of UniProtKb feature identifier (FTId) and positions and types of modifications or variations" "biolink:Protein"
"PR:PRO-common-name" "Label appended to organism-specific terms in place of scientific name" "biolink:Protein"
n.id n.name n.category
"ICD10:USE_ADDITIONAL" "Use additional" "biolink:Disease"
"ICD10:SIB" "Inverse of SIB" "biolink:Disease"
"ICD10:CODE_FIRST" "Code first" "biolink:Disease"
"ICD10:NOTE" "Note" "biolink:Disease"
"ICD10:ORDER_NO" "Order number" "biolink:Disease"
"ICD10:CODE_ALSO" "Code also" "biolink:Disease"
"NCIT:Alliance" "Growing Teratoma Syndrome" "biolink:Disease"
"NCIT:TARGET" "t(8;21)" "biolink:Disease"
n.id n.name n.category
"NDDF:FL" "Nail film solution (ml)" "biolink:Drug"
"RXNORM:RXN_BOSS_FROM" "RXN Boss From" "biolink:Drug"
"RXNORM:RXN_BOSS_AI" "RXN Boss AI" "biolink:Drug"
"RXNORM:RXN_BOSS_AM" "RXN Boss AM" "biolink:Drug"
"RXNORM:precise_ingredient_of" "Precise ingredient of" "biolink:Drug"
"RXNORM:RXN_QUALITATIVE_DISTINCTION" "Rxn qualitative distinction" "biolink:Drug"
"RXNORM:ingredient_of" "Ingredient of" "biolink:Drug"
"RXNORM:RXN_STRENGTH" "Rxn strength" "biolink:Drug"
"RXNORM:RXN_BN_CARDINALITY" "Rxn bn cardinality" "biolink:Drug"
"RXNORM:contained_in" "Inverse of contains" "biolink:Drug"
n.id n.name n.category
"NCIT:BIRADS" "Extremely dense" "biolink:DiseaseOrPhenotypicFeature"
"NCIT:MPSImP" "Complete; no air/contrast in laryngeal vestibule" "biolink:DiseaseOrPhenotypicFeature"
"NCIT:CPTAC" "Metastatic Diagnosis" "biolink:DiseaseOrPhenotypicFeature"
"NCIT:BBPS" "Segment Score 1" "biolink:DiseaseOrPhenotypicFeature"
"NCIT:ePRO" "Entering Answers" "biolink:DiseaseOrPhenotypicFeature"
"NCIT:SPAAT" "Spaat 7" "biolink:DiseaseOrPhenotypicFeature"
"NCIT:HNH" "Not true at all" "biolink:DiseaseOrPhenotypicFeature"
"NCIT:CDNH" "Engaged to be married" "biolink:DiseaseOrPhenotypicFeature"
"NCIT:MSTS" "M1" "biolink:DiseaseOrPhenotypicFeature"
n.id n.name n.category
"NCBITaxon:DIV" "Div" "biolink:OrganismTaxon"
"NCBITaxon:RANK" "Rank" "biolink:OrganismTaxon"
"NCBITaxon:superkingdom" "superkingdom" "biolink:OrganismTaxon"
"NCBITaxon:misnomer" "misnomer" "biolink:OrganismTaxon"
"NCBITaxon:genbank_common_name" "genbank common name" "biolink:OrganismTaxon"
"NCBITaxon:subfamily" "subfamily" "biolink:OrganismTaxon"
"NCBITaxon:infraorder" "infraorder" "biolink:OrganismTaxon"
"NCBITaxon:in_part" "in-part" "biolink:OrganismTaxon"
"NCBITaxon:has_rank" "has_rank" "biolink:OrganismTaxon"
"NCBITaxon:subclass" "subclass" "biolink:OrganismTaxon"

ecwood commented 3 years ago

This is certainly related to the ontobio issue: owl:DatatypeProperty declarations are being stored as nodes. In umls-hgnc.ttl, the following lines match your example above:

umls-hgnc.ttl-<http://purl.bioontology.org/ontology/HGNC/PMID> a owl:DatatypeProperty ;
umls-hgnc.ttl:  rdfs:label """Pubmed ID""";
umls-hgnc.ttl:  rdfs:comment """Pubmed ID""" .

Here's this from umls-ncbi.ttl:

<http://purl.bioontology.org/ontology/NCBITAXON/RANK> a owl:DatatypeProperty ;
        rdfs:label """RANK""";
        rdfs:comment """NCBI Rank (e.g. RANK[NCBI]species)""" .
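To list such declarations in a Turtle file, a plain text scan may be enough for a first pass. The sketch below is an assumption-laden shortcut: it uses a regular expression rather than a real Turtle parser, the function name find_datatype_properties is hypothetical, and it only recognizes the "<IRI> a owl:DatatypeProperty" form seen in the excerpts above.

```python
import re

# Matches "<IRI> a owl:DatatypeProperty"; this is a text heuristic, not Turtle parsing.
DATATYPE_PROPERTY_RE = re.compile(r"<([^>]+)>\s+a\s+owl:DatatypeProperty\b")


def find_datatype_properties(turtle_text):
    """Return the IRIs declared as owl:DatatypeProperty in the given Turtle text."""
    return DATATYPE_PROPERTY_RE.findall(turtle_text)


# Two declarations taken from the excerpts in this thread.
snippet = '''
<http://purl.bioontology.org/ontology/NCBITAXON/RANK> a owl:DatatypeProperty ;
        rdfs:label """RANK""";
        rdfs:comment """NCBI Rank (e.g. RANK[NCBI]species)""" .

<http://purl.bioontology.org/ontology/NCI/TARGET> a owl:Class ;
        skos:prefLabel """t(8;21)"""@en ;
'''

for iri in find_datatype_properties(snippet):
    print(iri)
```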

However, with some of these, it seems to be an issue with the data itself:

taxslim.owl-    <!-- http://purl.obolibrary.org/obo/ncbitaxon#genbank_common_name -->
taxslim.owl-
taxslim.owl-    <owl:AnnotationProperty rdf:about="http://purl.obolibrary.org/obo/ncbitaxon#genbank_common_name">
taxslim.owl:        <rdfs:label rdf:datatype="http://www.w3.org/2001/XMLSchema#string">genbank common name</rdfs:label>
taxslim.owl-        <rdfs:subPropertyOf rdf:resource="http://www.geneontology.org/formats/oboInOwl#SynonymTypeProperty"/>
taxslim.owl-    </owl:AnnotationProperty>
taxslim.owl:    <!-- http://purl.obolibrary.org/obo/NCBITaxon_superkingdom -->
taxslim.owl-
taxslim.owl:    <owl:Class rdf:about="http://purl.obolibrary.org/obo/NCBITaxon_superkingdom">
taxslim.owl-        <rdfs:subClassOf rdf:resource="http://purl.obolibrary.org/obo/NCBITaxon#_taxonomic_rank"/>
taxslim.owl-        <oboInOwl:hasOBONamespace rdf:datatype="http://www.w3.org/2001/XMLSchema#string">ncbi_taxonomy</oboInOwl:hasOBONamespace>
taxslim.owl:        <oboInOwl:id rdf:datatype="http://www.w3.org/2001/XMLSchema#string">NCBITaxon:superkingdom</oboInOwl:id>
taxslim.owl:        <rdfs:label rdf:datatype="http://www.w3.org/2001/XMLSchema#string">superkingdom</rdfs:label>
taxslim.owl-    </owl:Class>

This is also from taxslim.owl:

    <!-- http://purl.obolibrary.org/obo/NCBITaxon_subfamily -->

    <owl:Class rdf:about="http://purl.obolibrary.org/obo/NCBITaxon_subfamily">
        <rdfs:subClassOf rdf:resource="http://purl.obolibrary.org/obo/NCBITaxon#_taxonomic_rank"/>
        <oboInOwl:hasOBONamespace rdf:datatype="http://www.w3.org/2001/XMLSchema#string">ncbi_taxonomy</oboInOwl:hasOBONamespace>
        <oboInOwl:id rdf:datatype="http://www.w3.org/2001/XMLSchema#string">NCBITaxon:subfamily</oboInOwl:id>
        <rdfs:label rdf:datatype="http://www.w3.org/2001/XMLSchema#string">subfamily</rdfs:label>
    </owl:Class>

This is from umls-nci.ttl:

<http://purl.bioontology.org/ontology/NCI/TARGET> a owl:Class ;
        skos:prefLabel """t(8;21)"""@en ;
        skos:notation """TARGET"""^^xsd:string ;
        skos:altLabel """Hyperdiploid; Status of 4 and 10 Unknown"""@en , """iAMP21"""@en , """inv(16)"""@en ;
        UMLS:has_cui """C3897139"""^^xsd:string ;
        UMLS:has_cui """C3897144"""^^xsd:string ;
        UMLS:has_cui """C4086503"""^^xsd:string ;
        UMLS:has_cui """C4086524"""^^xsd:string ;
        UMLS:has_tui """T049"""^^xsd:string ;
        UMLS:has_sty <http://purl.bioontology.org/ontology/STY/T049> ;

@saramsey Do you draw the same conclusion?

ecwood commented 3 years ago

It certainly looks like owl:DatatypeProperty declarations are being added as nodes (this one is from umls-nci.ttl):

<http://purl.bioontology.org/ontology/NCI/GENE_ENCODES_PRODUCT> a owl:DatatypeProperty ;
        rdfs:label """GENE ENCODES PRODUCT""";
        rdfs:comment """Gene Encodes Product""" .

From Neo4j:

{
  "iri": "https://identifiers.org/ncit:GENE_ENCODES_PRODUCT",
  "category_label": "information_content_entity",
  "deprecated": "False",
  "name": "Gene encodes product",
  "description": "COMMENTS: Gene Encodes Product",
  "provided_by": "UMLS_STY:",
  "id": "NCIT:GENE_ENCODES_PRODUCT",
  "category": "biolink:InformationContentEntity",
  "update_date": "2019"
}

The IRI doesn't resolve:

INVALID resolution request for 'ncit:GENE_ENCODES_PRODUCT', due to 'Resolution request 'ncit:GENE_ENCODES_PRODUCT' is NOT ABOUT A NAMESPACE; For namespace 'ncit', provided local ID 'GENE_ENCODES_PRODUCT' DOES NOT MATCH local IDs definition pattern '^C\d+$''
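The resolver message quotes the local ID pattern for the ncit namespace, so the same check can be reproduced offline. This is a minimal sketch under stated assumptions: only the ncit pattern comes from the message above, the table and function names are hypothetical, and "NCIT:C12345" is a made-up identifier used purely to exercise the pattern.

```python
import re

# Only the ncit entry is taken from the resolver message; other prefixes are not covered.
LOCAL_ID_PATTERNS = {"ncit": re.compile(r"^C\d+$")}


def local_id_matches(curie):
    """Return True/False when a pattern is known for the prefix, or None when it is not."""
    prefix, _, local_id = curie.partition(":")
    pattern = LOCAL_ID_PATTERNS.get(prefix.lower())
    if pattern is None:
        return None
    return pattern.match(local_id) is not None


print(local_id_matches("NCIT:GENE_ENCODES_PRODUCT"))
print(local_id_matches("NCIT:C12345"))  # hypothetical identifier
```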

ecwood commented 3 years ago

> In KG2.5.2C, it is clustered with "OBI:0001617" which is classified as 'biolink:Gene' but has name 'Pubmed ID'

This is because of https://github.com/RTXteam/RTX/blob/adb30783fbd7ae09d86b01c5031aa52bb113b1a1/code/kg2/curies-to-categories.yaml#L14 which classifies everything in HGNC as a gene.
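To illustrate why a prefix-only rule misclassifies HGNC:PMID, here is a minimal sketch. The dictionary is a hypothetical, one-entry stand-in for the prefix section of curies-to-categories.yaml (only the HGNC to gene mapping is taken from this thread), and the function name is invented for the example.

```python
# Hypothetical stand-in for the prefix mapping; only HGNC -> gene is from the thread.
PREFIX_TO_CATEGORY_LABEL = {"HGNC": "gene"}


def category_label_from_prefix(curie):
    """Look up a category label using nothing but the CURIE prefix."""
    prefix = curie.split(":", 1)[0]
    return PREFIX_TO_CATEGORY_LABEL.get(prefix)


# The local ID is never inspected, so a property-like ID gets the same label as a gene.
print(category_label_from_prefix("HGNC:PMID"))
print(category_label_from_prefix("OBI:0001617"))
```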

chunyuma commented 3 years ago

Thanks @ericawood!

Besides the node categories I mentioned above, nodes in the categories below might also have this problem. I found them with the following Cypher query, which rests on my suspicion that a curie id whose local part (the text after the prefix) contains no digit may be invalid:

match (n) where not (split(n.id,":")[1] contains "0" or split(n.id,":")[1] contains "1" or split(n.id,":")[1] contains "2" or split(n.id,":")[1] contains "3" or split(n.id,":")[1] contains "4" or split(n.id,":")[1] contains "5" or split(n.id,":")[1] contains "6" or split(n.id,":")[1] contains "7" or split(n.id,":")[1] contains "8" or split(n.id,":")[1] contains "9" ) return distinct n.category

"biolink:InformationContentEntity"
"biolink:OntologyClass"
"biolink:NamedThing"
"biolink:ChemicalSubstance"
"biolink:Procedure"
"biolink:AnatomicalEntity"
"biolink:BiologicalEntity"
"biolink:Device"
"biolink:PhenotypicFeature"
"biolink:PhysicalEntity"
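The same "no digit in the local ID" heuristic can be applied outside Neo4j, for example to a list of node IDs exported from a build. This is a minimal sketch with a hypothetical function name; like the query, it is only a heuristic, so it can flag valid identifiers that happen to contain no digits and it misses invalid ones that do contain a digit.

```python
def local_id_has_no_digit(curie):
    """Mirror the query's test: True when the segment after the first colon has no 0-9."""
    parts = curie.split(":")
    if len(parts) < 2:
        # No colon means there is no local ID to inspect; do not flag the value.
        return False
    return not any(ch in "0123456789" for ch in parts[1])


ids = ["HGNC:PMID", "OBI:0001617", "UBERON:site_of", "NCIT:TARGET"]
suspects = [curie for curie in ids if local_id_has_no_digit(curie)]
print(suspects)
```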

For example:

n.id n.name n.category
"OMIM:has_phenotype" "Has phenotype" "biolink:PhenotypicFeature"
"OMIM:has_allelic_variant" "Has allelic variant" "biolink:PhenotypicFeature"
"OMIM:has_manifestation" "Has manifestation" "biolink:PhenotypicFeature"
"OMIM:MIMTYPE" "OMIM Entry Type" "biolink:PhenotypicFeature"
"OMIM:MIMTYPEMEANING" "Mimtypemeaning" "biolink:PhenotypicFeature"
"OMIM:GENELOCUS" "Gene Locus" "biolink:PhenotypicFeature"
"OMIM:manifestation_of" "Manifestation of" "biolink:PhenotypicFeature"
"OMIM:MIMTYPEVALUE" "OMIM MimType Value" "biolink:PhenotypicFeature"
"OMIM:MOVED_FROM" "Moved from" "biolink:PhenotypicFeature"
"OMIM:has_inheritance_type" "Has inheritance type" "biolink:PhenotypicFeature"
"OMIM:allelic_variant_of" "Allelic variant of" "biolink:PhenotypicFeature"
"OMIM:GENESYMBOL" "Gene Symbol" "biolink:PhenotypicFeature"
"OMIM:phenotype_of" "Phenotype of" "biolink:PhenotypicFeature"

I'm not sure whether the following curie ids are normal for biolink:AnatomicalEntity, but something like channel_for or site_of looks more like a predicate than an anatomical entity.

n.id n.name n.category
"NCIT:SENTINEL" "Nose Swab" "biolink:AnatomicalEntity"
"UBERON:channel_for" "channel for" "biolink:AnatomicalEntity"
"UBERON:transitively_anteriorly_connected_to" "transitively anteriorly connected to" "biolink:AnatomicalEntity"
"UBERON:conduit_for" "conduit for" "biolink:AnatomicalEntity"
"UBERON:filtered_through" "filtered through" "biolink:AnatomicalEntity"
"UBERON:trunk_part_of" "trunk_part_of" "biolink:AnatomicalEntity"
"CL:LATIN" "latin term" "biolink:AnatomicalEntity"
"UBERON:indirectly_supplies" "indirectly_supplies" "biolink:AnatomicalEntity"
"UBERON:protects" "protects" "biolink:AnatomicalEntity"
"UBERON:transitively_distally_connected_to" "transitively distally connected to" "biolink:AnatomicalEntity"
"UBERON:synapsed_by" "synapsed by" "biolink:AnatomicalEntity"
"UBERON:site_of" "site_of" "biolink:AnatomicalEntity"

saramsey commented 3 years ago

I concur with @ericawood that the node ID HGNC:PMID is probably an owl:DatatypeProperty that got turned into a node.

saramsey commented 3 years ago

Now, in the case of UBERON:site_of, the issue is just that the node has the wrong category; it should be biolink:RelationshipType. I am not sure whether UBERON:site_of occurs somewhere as an owl:DatatypeProperty (I don't think it should be a datatype property); I would need to do some checking to be sure.

saramsey commented 3 years ago

I concur with @ericawood; categorizing everything with the CURIE prefix HGNC as biolink:Gene is problematic; see RTXteam/RTX#1170. I believe that @ericawood is working on a fix in which owl:DatatypeProperty associations can be read and understood. But in the case of HGNC:PMID, the issue is simply that it should not be a node in the first place, because it should probably be handled via the publications slot of the subject node for the owl:DatatypeProperty.

ecwood commented 3 years ago

I think I've found a way (without having to rely on any fix to ontobio) to filter out these owl:DatatypeProperty nodes. You will see in the examples below that they all have "type": "PROPERTY". I will investigate more, but I wanted to post these findings. One potential problem with this approach is that relation nodes (e.g., RO:0000053) would be filtered out as well. Is this a problem?

From OMIM:

{
        "id" : "http://purl.bioontology.org/ontology/OMIM/MIMTYPE",
        "meta" : {
                "comments" : [ "OMIM Entry Type" ]
        },
        "type" : "PROPERTY",
        "lbl" : "OMIM Entry Type"
 }
{
        "id" : "http://purl.bioontology.org/ontology/OMIM/MIMTYPEMEANING",
        "meta" : {
                "comments" : [ "OMIM MimType Meaning" ]
        },
        "type" : "PROPERTY",
        "lbl" : "MIMTYPEMEANING"
}

From HGNC:

{
        "id" : "http://purl.bioontology.org/ontology/HGNC/PMID",
        "meta" : {
                "comments" : [ "Pubmed ID" ]
        },
        "type" : "PROPERTY",
        "lbl" : "Pubmed ID"
}
{
        "id" : "https://identifiers.org/umls:has_sty",
        "meta" : {
                "comments" : [ "Semantic type UMLS property" ]
        },
        "type" : "PROPERTY",
        "lbl" : "Semantic type UMLS property"
}
{
        "id" : "http://purl.bioontology.org/ontology/HGNC/ENSEMBLGENE_ID",
        "meta" : {
                "comments" : [ "Ensembl gene ID" ]
        },
        "type" : "PROPERTY",
        "lbl" : "Ensembl gene ID"
}
{
        "id" : "http://purl.bioontology.org/ontology/HGNC/LOCUS_GROUP",
        "meta" : {
                "comments" : [ "Locus group" ]
        },
        "type" : "PROPERTY",
        "lbl" : "Locus group"
}

From Uberon:

{
       "id" : "http://purl.obolibrary.org/obo/uberon/core#indirectly_supplies",
       "meta" : {
                "definition" : {
                        "val" : "a indirectly_supplies s iff a has a branch and the branch supplies or indirectly supplies s",
                        "xrefs" : [ ]
                },
                "basicPropertyValues" : [ {
                        "pred" : "http://purl.obolibrary.org/obo/IAO_0000116",
                        "val" : "add to RO"
                }, {
                        "pred" : "http://www.geneontology.org/formats/oboInOwl#hasOBONamespace",
                        "val" : "uberon"
                } ]
       },
       "type" : "PROPERTY",
       "lbl" : "indirectly_supplies"
     }
{
        "id" : "http://purl.obolibrary.org/obo/uberon/core#transitively_anteriorly_connected_to",
        "meta" : {
                "definition" : {
                        "val" : ".",
                        "xrefs" : [ "http://purl.obolibrary.org/obo/uberon/docs/Connectivity-Design-Pattern" ]
                },
                "basicPropertyValues" : [ {
                        "pred" : "http://www.geneontology.org/formats/oboInOwl#hasOBONamespace",
                        "val" : "uberon"
                } ]
        },
        "type" : "PROPERTY",
        "lbl" : "transitively anteriorly connected to"
}
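Filtering on that field could look roughly like the following. This is only a sketch over plain dictionaries shaped like the records above; the function name split_property_nodes is hypothetical, the second sample node is made up for contrast, and nothing here reflects how ontobio or the KG2 build code actually walks the graph.

```python
def split_property_nodes(nodes):
    """Partition node dicts into (kept, property_nodes) based on their "type" field."""
    kept, property_nodes = [], []
    for node in nodes:
        if node.get("type") == "PROPERTY":
            property_nodes.append(node)
        else:
            kept.append(node)
    return kept, property_nodes


nodes = [
    # Abbreviated from the HGNC example above.
    {"id": "http://purl.bioontology.org/ontology/HGNC/PMID", "type": "PROPERTY", "lbl": "Pubmed ID"},
    # Hypothetical class node, included only for contrast.
    {"id": "http://example.org/hypothetical_class", "type": "CLASS", "lbl": "hypothetical class"},
]

kept, property_nodes = split_property_nodes(nodes)
print([n["id"] for n in property_nodes])
```

Relation nodes would land in property_nodes too if they carry the same type value, which is the side effect raised in the comment above.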

ecwood commented 3 years ago

Regarding my previous comment, I noticed that this code already exists in multi_ont_to_json_kg.py: https://github.com/RTXteam/RTX/blob/78b8565ed70de882796f25d948bf18524728bf7b/code/kg2/multi_ont_to_json_kg.py#L725-L728

The problem is that only nodes without a category label are handled by that code block. Per this code: https://github.com/RTXteam/RTX/blob/33b50ae7ce9c42b8dbc19fc5be86990a4b38cfbc/code/kg2/curies-to-categories.yaml#L2-L35 nodes from many of the sources listed above (including OMIM, HGNC, and UBERON) never reach that code block. I am thinking of removing the if node_category_label is None: requirement. @saramsey does that seem reasonable?
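The effect of that guard can be sketched in isolation. Everything below is an assumption made for illustration: the two functions are hypothetical simplifications, not the code in multi_ont_to_json_kg.py, and the label assigned to PROPERTY nodes is taken from the value of BIOLINK_CATEGORY_ATTRIBUTE mentioned later in this thread.

```python
# Assumed label for PROPERTY nodes; the real constant lives in the KG2 build code.
PROPERTY_CATEGORY_LABEL = "information content entity"


def label_with_guard(node_type, label_from_prefix):
    """PROPERTY check runs only when no label was assigned from the CURIE prefix."""
    label = label_from_prefix
    if label is None and node_type == "PROPERTY":
        label = PROPERTY_CATEGORY_LABEL
    return label


def label_without_guard(node_type, label_from_prefix):
    """PROPERTY check runs for every node, overriding any prefix-based label."""
    if node_type == "PROPERTY":
        return PROPERTY_CATEGORY_LABEL
    return label_from_prefix


# HGNC:PMID is a PROPERTY node that already received "gene" from its prefix.
print(label_with_guard("PROPERTY", "gene"))
print(label_without_guard("PROPERTY", "gene"))
```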

saramsey commented 3 years ago

> Regarding my previous comment, I noticed that this code already exists in multi_ont_to_json_kg.py:
>
> https://github.com/RTXteam/RTX/blob/78b8565ed70de882796f25d948bf18524728bf7b/code/kg2/multi_ont_to_json_kg.py#L725-L728
>
> The problem is that only nodes without a category label are handled by that code block. Per this code:
>
> https://github.com/RTXteam/RTX/blob/33b50ae7ce9c42b8dbc19fc5be86990a4b38cfbc/code/kg2/curies-to-categories.yaml#L2-L35
>
> nodes from many of the sources listed above (including OMIM, HGNC, and UBERON) never reach that code block. I am thinking of removing the if node_category_label is None: requirement. @saramsey does that seem reasonable?

Seems reasonable. I think this is a good example of where a test build (to sanity check) would be helpful. One build with the change, and one without. Can then compare.

saramsey commented 3 years ago

Outstanding sleuthing, @ericawood !

ecwood commented 3 years ago

I only tested it on biolink-model.owl.ttl, umls-hgnc.ttl, and umls-omim.ttl, but it does appear that the fix introduced an unintended bug due to the following line. Essentially, the source for any of these PROPERTY nodes is now UMLS_STY. https://github.com/RTXteam/RTX/blob/36699fb0285c261ec2adfac6436dabddfaccc9e2/code/kg2/multi_ont_to_json_kg.py#L805-L806

(when viewing that line, please remember that BIOLINK_CATEGORY_ATTRIBUTE is now "information content entity" per 2f48bb6)

Here is what the old HGNC:PMID node looked like:

          {
              "category": "biolink:Gene",
              "category_label": "gene",
              "creation_date": null,
              "deprecated": false,
              "description": "COMMENTS: Pubmed ID",
              "full_name": null,
              "id": "HGNC:PMID",
              "iri": "https://identifiers.org/hgnc:PMID",
              "name": "Pubmed ID",
              "provided_by": "umls_source:HGNC",
              "publications": [],
              "replaced_by": null,
              "synonym": [],
              "update_date": "2019"
          },

Here is what the new HGNC:PMID node looks like. Note that its provided_by field is UMLS_STY: rather than umls_source:HGNC as it was before.

          {
              "category": "biolink:InformationContentEntity",
              "category_label": "information_content_entity",
              "creation_date": null,
              "deprecated": false,
              "description": "COMMENTS: Pubmed ID",
              "full_name": null,
              "id": "HGNC:PMID",
              "iri": "https://identifiers.org/hgnc:PMID",
              "name": "Pubmed ID",
              "provided_by": "UMLS_STY:",
              "publications": [],
              "replaced_by": null,
              "synonym": [],
              "update_date": "2019"
          },
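The two records above can be compared field by field to see exactly what the change touched. The following is a minimal sketch with a hypothetical helper name; the dictionaries are abbreviated to the fields that differ plus the id, with values copied from the old and new nodes.

```python
def changed_fields(old, new):
    """Return {field: (old_value, new_value)} for every field whose value differs."""
    return {
        key: (old.get(key), new.get(key))
        for key in sorted(set(old) | set(new))
        if old.get(key) != new.get(key)
    }


old_node = {
    "id": "HGNC:PMID",
    "category": "biolink:Gene",
    "category_label": "gene",
    "provided_by": "umls_source:HGNC",
}
new_node = {
    "id": "HGNC:PMID",
    "category": "biolink:InformationContentEntity",
    "category_label": "information_content_entity",
    "provided_by": "UMLS_STY:",
}

for field, (before, after) in changed_fields(old_node, new_node).items():
    print(field, before, "->", after)
```

The category change is the intended one; the provided_by change is the unintended side effect.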

In addition, nodes from the biolink-model.owl.ttl files that were previously biolink:OntologyClass's are now biolink:InformationContentEntity's.

@saramsey What are your thoughts on this?

saramsey commented 3 years ago

What happens if you comment out L805-806?

saramsey commented 3 years ago

> In addition, nodes from the biolink-model.owl.ttl files that were previously biolink:OntologyClass's are now biolink:InformationContentEntity's.

Actually I think this is a good thing. I just checked and biolink:OntologyClass is actually a mixin https://github.com/biolink/biolink-model/blob/bd3607404bae3677bc8fa6de16067c8abfab56b6/biolink-model.yaml#L4869

so it is best if we do not use it. I think biolink:InformationContentEntity is a good substitute to use. Nice work!

ecwood commented 3 years ago

> What happens if you comment out L805-806?

This appeared to fix that issue:

          {
              "category": "biolink:InformationContentEntity",
              "category_label": "information_content_entity",
              "creation_date": null,
              "deprecated": false,
              "description": "COMMENTS: Pubmed ID",
              "full_name": null,
              "id": "HGNC:PMID",
              "iri": "https://identifiers.org/hgnc:PMID",
              "name": "Pubmed ID",
              "provided_by": "umls_source:HGNC",
              "publications": [],
              "replaced_by": null,
              "synonym": [],
              "update_date": "2019"
          },

I'll commit the change shortly.

ecwood commented 3 years ago

This looks mostly, but not entirely, fixed in KG2.6.0:

match (n) where n.id in ["HGNC:PMID", "NCBITaxon:subclass", "NCBITaxon:has_rank", "NCBITaxon:in_part", "NCBITaxon:infraorder", "NCBITaxon:subfamily", "NCBITaxon:genbank_common_name", "NCBITaxon:misnomer", "NCBITaxon:superkingdom", "NCBITaxon:RANK", "NCBITaxon:DIV", "NCIT:MSTS", "NCIT:CDNH", "NCIT:HNH", "NCIT:SPAAT", "NCIT:ePRO", "NCIT:BBPS", "NCIT:CPTAC", "NCIT:MPSImP", "NCIT:BIRADS", "RXNORM:contained_in", "RXNORM:RXN_BN_CARDINALITY", "RXNORM:RXN_STRENGTH", "RXNORM:ingredient_of", "RXNORM:RXN_QUALITATIVE_DISTINCTION", "RXNORM:precise_ingredient_of", "RXNORM:RXN_BOSS_AM", "RXNORM:RXN_BOSS_AI", "RXNORM:RXN_BOSS_FROM", "NDDF:FL", "NCIT:TARGET", "NCIT:Alliance", "ICD10:CODE_ALSO", "ICD10:ORDER_NO", "ICD10:NOTE", "ICD10:CODE_FIRST", "ICD10:SIB", "ICD10:USE_ADDITIONAL", "PR:PRO-common-name", "PR:PRO-proteoform-ftid", "PR:PRO-proteoform-std", "PR:lacks_part", "PR:has_gene_template", "PR:PSI-MOD-label", "HGNC:GENESYMBOL", "HGNC:MAPPED_UCSC_ID", "HGNC:LOCUS_GROUP", "HGNC:EZ", "HGNC:ENTREZGENE_ID", "HGNC:PREV_SYMBOL", "HGNC:OMIM_ID", "HGNC:DATE_NAME_CHANGED"] return n.id, n.category_label

n.id n.category_label
"HGNC:PREV_SYMBOL" "information_content_entity"
"HGNC:DATE_NAME_CHANGED" "information_content_entity"
"HGNC:EZ" "information_content_entity"
"HGNC:LOCUS_GROUP" "information_content_entity"
"HGNC:OMIM_ID" "information_content_entity"
"HGNC:MAPPED_UCSC_ID" "information_content_entity"
"HGNC:ENTREZGENE_ID" "information_content_entity"
"HGNC:GENESYMBOL" "information_content_entity"
"HGNC:PMID" "information_content_entity"
"ICD10:USE_ADDITIONAL" "information_content_entity"
"ICD10:SIB" "information_content_entity"
"ICD10:CODE_FIRST" "information_content_entity"
"ICD10:NOTE" "information_content_entity"
"ICD10:ORDER_NO" "information_content_entity"
"ICD10:CODE_ALSO" "information_content_entity"
"NCBITaxon:DIV" "information_content_entity"
"NCBITaxon:RANK" "information_content_entity"
"NCIT:BIRADS" "disease_or_phenotypic_feature"
"NCIT:Alliance" "disease"
"NCIT:MPSImP" "disease_or_phenotypic_feature"
"NCIT:CPTAC" "disease_or_phenotypic_feature"
"NCIT:BBPS" "disease_or_phenotypic_feature"
"NCIT:ePRO" "disease_or_phenotypic_feature"
"NCIT:SPAAT" "disease_or_phenotypic_feature"
"NCIT:CDNH" "disease_or_phenotypic_feature"
"NCIT:HNH" "disease_or_phenotypic_feature"
"NCIT:MSTS" "disease_or_phenotypic_feature"
"NCIT:TARGET" "disease"
"NDDF:FL" "drug"
"RXNORM:RXN_BOSS_FROM" "information_content_entity"
"RXNORM:RXN_BOSS_AI" "information_content_entity"
"RXNORM:RXN_BOSS_AM" "information_content_entity"
"RXNORM:precise_ingredient_of" "information_content_entity"
"RXNORM:RXN_QUALITATIVE_DISTINCTION" "information_content_entity"
"RXNORM:ingredient_of" "information_content_entity"
"RXNORM:RXN_STRENGTH" "information_content_entity"
"RXNORM:RXN_BN_CARDINALITY" "information_content_entity"
"RXNORM:contained_in" "information_content_entity"
"PR:lacks_part" "information_content_entity"
"PR:has_gene_template" "information_content_entity"
"NCBITaxon:subclass" "organism_taxon"
"NCBITaxon:superkingdom" "organism_taxon"
"NCBITaxon:infraorder" "organism_taxon"
"NCBITaxon:in_part" "information_content_entity"
"NCBITaxon:misnomer" "information_content_entity"
"NCBITaxon:genbank_common_name" "information_content_entity"
"NCBITaxon:subfamily" "organism_taxon"
"PR:PRO-proteoform-ftid" "information_content_entity"
"PR:PRO-proteoform-std" "information_content_entity"
"PR:PRO-common-name" "information_content_entity"
"PR:PSI-MOD-label" "information_content_entity"
"NCBITaxon:has_rank" "information_content_entity"
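To pull out the ids that kept their earlier category label, the result above can be filtered in a few lines. This is a minimal sketch: the function name is hypothetical and only five of the rows are reproduced. A differing label marks a row for follow-up rather than proving an error, since some of these ids (for example NCIT:TARGET) are declared as owl:Class in their source files.

```python
def not_reclassified(rows, expected="information_content_entity"):
    """Return the ids whose category_label differs from the expected label."""
    return [curie for curie, label in rows if label != expected]


# A few (n.id, n.category_label) rows copied from the KG2.6.0 result above.
rows = [
    ("HGNC:PMID", "information_content_entity"),
    ("NCIT:TARGET", "disease"),
    ("NDDF:FL", "drug"),
    ("NCBITaxon:superkingdom", "organism_taxon"),
    ("NCBITaxon:RANK", "information_content_entity"),
]

print(not_reclassified(rows))
```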