RTXteam / RTX-KG2

Build system for the RTX-KG2 biomedical knowledge graph, part of the ARAX reasoning system (https://github.com/RTXTeam/RTX)
MIT License

Biolink 2.1 compliance #64

Closed saramsey closed 3 years ago

saramsey commented 3 years ago

OK, Biolink 2.0 (6/1 release date) has various changes that will affect KG2:

see https://github.com/biolink/biolink-model/pull/750

Apparently, the deadline for ARAs and KPs to comply with Biolink 2.0 is July 15th. Let's start by assessing how much work we think it will require to implement the above changes, get them rolled into a new build of KG2, and get a new build of the Node Synonymizer, KG2c, the KG2c-backed PloverDB, and the PloverDB-backed RTX-KG2 API.

saramsey commented 3 years ago

Note: Biolink 2.0 is not quite released yet:

[Screenshot: 2021-06-02, 10:31:59 AM]

saramsey commented 3 years ago

There is apparently a Migration Guide, which may be helpful: https://github.com/biolink/biolink-model/blob/chem_vlado_curation/Migration_2.0_Guide.md

[Screenshot: 2021-06-02, 10:42:43 AM]

kvarforl commented 3 years ago

Looking for instances of categories that will need to be remapped in curies-to-categories.yaml:

Cypher query with links to more info about each:

MATCH (n) WHERE n.id IN ["NCIT:C16612","ORPHANET:410297","SO:0000704","UMLS_STY:T028","UMLS_STY:T086","CHEBI:78295","UMLS_STY:T088","CHEBI:25212"] RETURN n.id, n.iri

kvarforl commented 3 years ago

The ChemicalSubstance reorg is a bit confusing to me, and will probably require some discussion / assistance, but here are the current mappings again from curies-to-categories.yaml:

The distinction between these two instructions is a bit lost on me:

SmallMolecule takes all ID prefixes from ChemicalSubstance
ChemicalEntity takes almost all mappings from ChemicalSubstance

Do we want to go this route?

Chris M recommends keeping the old version of ChemicalSubstance but deprecating it (and adding a warning that you should use ChemicalEntity instead)

If so, should we just set the deprecation property on every ChemicalSubstance node to deprecated? (This seems suboptimal, since the concept itself isn't deprecated, just our categorization of it.)
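
If we go with a remap plus a deprecation warning, the step could look roughly like the sketch below. Everything here is an assumption for illustration: the `CATEGORY_REMAP` table, the `remap_category` helper, and the choice of `biolink:ChemicalEntity` as the replacement are hypothetical and are not taken from the KG2 code or from curies-to-categories.yaml.

```python
import warnings

# Hypothetical remap table; the real mappings live in curies-to-categories.yaml.
CATEGORY_REMAP = {
    "biolink:ChemicalSubstance": "biolink:ChemicalEntity",
}


def remap_category(category: str) -> str:
    """Return the replacement for a deprecated category, warning when one is used."""
    replacement = CATEGORY_REMAP.get(category)
    if replacement is None:
        # Not in the remap table, so the category is passed through unchanged.
        return category
    warnings.warn(
        f"{category} is deprecated; use {replacement} instead",
        DeprecationWarning,
        stacklevel=2,
    )
    return replacement
```

Calling `remap_category("biolink:ChemicalSubstance")` returns the replacement and emits a `DeprecationWarning`; any other category comes back untouched. This deprecates our use of the category, not the node itself, which sidesteps the concern about marking the concept as deprecated.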

kvarforl commented 3 years ago

Something else to check for and possibly change in predicate-remap.yaml: Use canonical predicate directions (do not use “inverse” predicates)

Probably start by updating validation script and go from there
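
A rough sketch of what the validation check and the fix could look like, assuming we have a table of inverse predicates and their canonical counterparts. The `INVERSE_TO_CANONICAL` table and both helper names are hypothetical, and which member of each pair is canonical is assumed here; in practice that would come from the Biolink model, not be hard-coded.

```python
from typing import Dict, Iterable, List, Tuple

Edge = Tuple[str, str, str]  # (subject, predicate, object)

# Hypothetical inverse -> canonical table (assumed directions, for illustration).
INVERSE_TO_CANONICAL: Dict[str, str] = {
    "biolink:treated_by": "biolink:treats",
    "biolink:caused_by": "biolink:causes",
    "biolink:prevented_by": "biolink:prevents",
}


def find_inverse_predicates(predicates: Iterable[str]) -> List[str]:
    """Return the predicates a validation script would flag as non-canonical."""
    return sorted(p for p in predicates if p in INVERSE_TO_CANONICAL)


def canonicalize(edge: Edge) -> Edge:
    """Flip an edge that uses an inverse predicate so it uses the canonical one."""
    subject, predicate, obj = edge
    canonical = INVERSE_TO_CANONICAL.get(predicate)
    if canonical is None:
        # Already canonical (or not in the table); leave the edge untouched.
        return edge
    # Swapping subject and object keeps the meaning of the edge the same.
    return (obj, canonical, subject)
```

The check could run over the predicates in predicate-remap.yaml first, and the flip would only be needed for whatever it flags.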

kvarforl commented 3 years ago

An additional note: it seems like a good idea (per Eric's suggestion) to do these changes on a branch, so we can more easily verify fixes for #62!

ecwood commented 3 years ago

For the chemistry reorganization, I found a PowerPoint that might help explain their goal in reorganizing chemical substance:

https://drive.google.com/file/d/1NbJO_vZyCVRrI5EmQPVWo56HHNMEbtR6/view

ecwood commented 3 years ago

Based on https://www.ebi.ac.uk/training/online/courses/metabolomics-introduction/what-is/small-molecules/, it looks like metabolite should map to small molecule. In addition, according to the Biolink YAML file, a small molecule is:

    description: >-
                A small molecule entity is a molecular entity characterized by availability
                in small-molecule databases of SMILES, InChI, IUPAC, or other
                unambiguous representation of its precise chemical structure; for
                convenience of representation, any valid chemical representation is
                included, even if it is not strictly molecular (e.g., sodium ion).

Within HMDB (a metabolite database), the metabolites have a SMILES and InChI representation.

ecwood commented 3 years ago

Here's the full Biolink 2.0 category tree in the form {category: [subcategory 1, subcategory 2, subcategory 3, etc.]}:

{
    "named thing": [
        "activity",
        "administrative entity",
        "biological entity",
        "chemical entity",
        "clinical entity",
        "device",
        "event",
        "information content entity",
        "organism taxon",
        "phenomenon",
        "physical entity",
        "planetary entity",
        "procedure",
        "treatment"
    ],
    "chemical entity": [
        "chemical exposure",
        "chemical mixture",
        "environmental food contaminant",
        "food additive",
        "molecular entity",
        "nutrient"
    ],
    "clinical entity": [
        "clinical intervention",
        "clinical trial"
    ],
    "organism taxon": [
        "biotic exposure"
    ],
    "biological entity": [
        "biological process or activity",
        "disease or phenotypic feature",
        "epidemiological outcome",
        "genome",
        "genomic background exposure",
        "genotype",
        "haplotype",
        "organismal entity"
    ],
    "administrative entity": [
        "agent"
    ],
    "planetary entity": [
        "environmental feature",
        "environmental process",
        "geographic location"
    ],
    "information content entity": [
        "confidence level",
        "dataset",
        "dataset distribution",
        "dataset summary",
        "dataset version",
        "evidence type",
        "information resource",
        "publication"
    ],
    "physical entity": [
        "material sample"
    ],
    "molecular entity": [
        "nucleic acid entity",
        "polypeptide",
        "small molecule"
    ],
    "chemical exposure": [
        "complex chemical exposure"
    ],
    "nutrient": [
        "macronutrient",
        "micronutrient"
    ],
    "chemical mixture": [
        "complex molecular mixture",
        "food",
        "molecular mixture",
        "processed material"
    ],
    "clinical intervention": [
        "hospitalization"
    ],
    "biological process or activity": [
        "biological process",
        "molecular activity"
    ],
    "organismal entity": [
        "anatomical entity",
        "cell line",
        "individual organism",
        "life stage",
        "population of individual organisms"
    ],
    "disease or phenotypic feature": [
        "disease",
        "disease or phenotypic feature exposure",
        "disease or phenotypic feature outcome",
        "phenotypic feature"
    ],
    "environmental process": [
        "environmental exposure"
    ],
    "geographic location": [
        "geographic exposure",
        "geographic location at time"
    ],
    "publication": [
        "article",
        "book",
        "book chapter",
        "serial"
    ],
    "nucleic acid entity": [
        "coding sequence",
        "exon",
        "gene",
        "gene family",
        "reagent targeted gene",
        "sequence variant",
        "transcript"
    ],
    "polypeptide": [
        "protein"
    ],
    "micronutrient": [
        "vitamin"
    ],
    "molecular mixture": [
        "drug"
    ],
    "hospitalization": [
        "hospitalization outcome"
    ],
    "biological process": [
        "behavior",
        "death",
        "pathological process",
        "pathway",
        "physiological process"
    ],
    "individual organism": [
        "case"
    ],
    "anatomical entity": [
        "cell",
        "cellular component",
        "gross anatomical structure",
        "pathological anatomical structure"
    ],
    "population of individual organisms": [
        "study population"
    ],
    "phenotypic feature": [
        "behavioral feature",
        "clinical finding"
    ],
    "transcript": [
        "RNA product"
    ],
    "sequence variant": [
        "snv"
    ],
    "protein": [
        "protein isoform"
    ],
    "drug": [
        "drug exposure"
    ],
    "behavior": [
        "behavioral exposure",
        "behavioral outcome",
        "socioeconomic exposure",
        "socioeconomic outcome"
    ],
    "death": [
        "mortality outcome"
    ],
    "pathological process": [
        "pathological process exposure",
        "pathological process outcome"
    ],
    "pathological anatomical structure": [
        "pathological anatomical exposure",
        "pathological anatomical outcome"
    ],
    "study population": [
        "cohort"
    ],
    "RNA product": [
        "RNA product isoform",
        "noncoding RNA product"
    ],
    "drug exposure": [
        "drug to gene interaction exposure"
    ],
    "noncoding RNA product": [
        "microRNA",
        "siRNA"
    ]
}
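
With the tree in this form, looking up where a category sits (for example, when deciding what to remap a node to) is a short walk up the parent links. The sketch below uses a trimmed excerpt of the tree above; the helper names `parent_map`, `ancestors`, and `to_curie` are made up for illustration. The naming rule in `to_curie` is inferred from names such as `biolink:SmallMolecule` and `biolink:NoncodingRNAProduct` and may not hold for every class.

```python
from typing import Dict, List

# Trimmed excerpt of the tree above, in the same {category: [subcategories]} form.
TREE: Dict[str, List[str]] = {
    "named thing": ["chemical entity"],
    "chemical entity": ["chemical mixture", "molecular entity"],
    "molecular entity": ["nucleic acid entity", "polypeptide", "small molecule"],
    "chemical mixture": ["molecular mixture"],
    "molecular mixture": ["drug"],
}


def parent_map(tree: Dict[str, List[str]]) -> Dict[str, str]:
    """Invert {parent: [children]} into {child: parent}."""
    return {child: parent for parent, children in tree.items() for child in children}


def ancestors(category: str, tree: Dict[str, List[str]]) -> List[str]:
    """Return the ancestors of a category, from its immediate parent up to the root."""
    parents = parent_map(tree)
    chain = []
    while category in parents:
        category = parents[category]
        chain.append(category)
    return chain


def to_curie(name: str) -> str:
    """Turn a space-separated category name into a biolink: CamelCase CURIE.

    Only the first letter of each word is upper-cased, so "siRNA" becomes "SiRNA".
    """
    return "biolink:" + "".join(word[0].upper() + word[1:] for word in name.split())
```

For instance, `ancestors("small molecule", TREE)` walks up through "molecular entity" and "chemical entity" to "named thing", while "drug" reaches "chemical entity" by way of the mixture branch, which is one way to see how the old ChemicalSubstance content is now split.
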
ecwood commented 3 years ago

Biolink 2.0 Ontology Nodes:

[
    "biolink:Activity",
    "biolink:ActivityAndBehavior",
    "biolink:AdministrativeEntity",
    "biolink:Agent",
    "biolink:AnatomicalEntity",
    "biolink:AnatomicalEntityToAnatomicalEntityAssociation",
    "biolink:AnatomicalEntityToAnatomicalEntityOntogenicAssociation",
    "biolink:AnatomicalEntityToAnatomicalEntityPartOfAssociation",
    "biolink:Annotation",
    "biolink:Article",
    "biolink:Association",
    "biolink:Attribute",
    "biolink:Behavior",
    "biolink:BehaviorToBehavioralFeatureAssociation",
    "biolink:BehavioralExposure",
    "biolink:BehavioralFeature",
    "biolink:BehavioralOutcome",
    "biolink:BiologicalEntity",
    "biolink:BiologicalProcess",
    "biolink:BiologicalProcessOrActivity",
    "biolink:BiologicalSequence",
    "biolink:BiologicalSex",
    "biolink:BioticExposure",
    "biolink:Book",
    "biolink:BookChapter",
    "biolink:Case",
    "biolink:CaseToEntityAssociationMixin",
    "biolink:CaseToPhenotypicFeatureAssociation",
    "biolink:CategoryType",
    "biolink:Cell",
    "biolink:CellLine",
    "biolink:CellLineAsAModelOfDiseaseAssociation",
    "biolink:CellLineToDiseaseOrPhenotypicFeatureAssociation",
    "biolink:CellLineToEntityAssociationMixin",
    "biolink:CellularComponent",
    "biolink:ChemicalEntity",
    "biolink:ChemicalEntityToEntityAssociationMixin",
    "biolink:ChemicalExposure",
    "biolink:ChemicalFormulaValue",
    "biolink:ChemicalMixture",
    "biolink:ChemicalOrDrugOrTreatment",
    "biolink:ChemicalSubstance",
    "biolink:ChemicalToChemicalAssociation",
    "biolink:ChemicalToChemicalDerivationAssociation",
    "biolink:ChemicalToDiseaseOrPhenotypicFeatureAssociation",
    "biolink:ChemicalToEntityAssociationMixin",
    "biolink:ChemicalToGeneAssociation",
    "biolink:ChemicalToPathwayAssociation",
    "biolink:ClinicalAttribute",
    "biolink:ClinicalCourse",
    "biolink:ClinicalEntity",
    "biolink:ClinicalFinding",
    "biolink:ClinicalIntervention",
    "biolink:ClinicalMeasurement",
    "biolink:ClinicalModifier",
    "biolink:ClinicalTrial",
    "biolink:CodingSequence",
    "biolink:Cohort",
    "biolink:ComplexChemicalExposure",
    "biolink:ComplexMolecularMixture",
    "biolink:ConfidenceLevel",
    "biolink:ContributorAssociation",
    "biolink:Dataset",
    "biolink:DatasetDistribution",
    "biolink:DatasetSummary",
    "biolink:DatasetVersion",
    "biolink:Death",
    "biolink:Device",
    "biolink:Disease",
    "biolink:DiseaseOrPhenotypicFeature",
    "biolink:DiseaseOrPhenotypicFeatureExposure",
    "biolink:DiseaseOrPhenotypicFeatureOutcome",
    "biolink:DiseaseOrPhenotypicFeatureToEntityAssociationMixin",
    "biolink:DiseaseOrPhenotypicFeatureToLocationAssociation",
    "biolink:DiseaseToEntityAssociationMixin",
    "biolink:DiseaseToExposureEventAssociation",
    "biolink:DiseaseToPhenotypicFeatureAssociation",
    "biolink:Drug",
    "biolink:DrugExposure",
    "biolink:DrugToEntityAssociationMixin",
    "biolink:DrugToGeneAssociation",
    "biolink:DrugToGeneInteractionExposure",
    "biolink:Entity",
    "biolink:EntityToDiseaseAssociationMixin",
    "biolink:EntityToDiseaseOrPhenotypicFeatureAssociationMixin",
    "biolink:EntityToExposureEventAssociationMixin",
    "biolink:EntityToFeatureOrDiseaseQualifiersMixin",
    "biolink:EntityToOutcomeAssociationMixin",
    "biolink:EntityToPhenotypicFeatureAssociationMixin",
    "biolink:EnvironmentalExposure",
    "biolink:EnvironmentalFeature",
    "biolink:EnvironmentalFoodContaminant",
    "biolink:EnvironmentalProcess",
    "biolink:EpidemiologicalOutcome",
    "biolink:Event",
    "biolink:EvidenceType",
    "biolink:Exon",
    "biolink:ExonToTranscriptRelationship",
    "biolink:ExposureEvent",
    "biolink:ExposureEventToEntityAssociationMixin",
    "biolink:ExposureEventToOutcomeAssociation",
    "biolink:ExposureEventToPhenotypicFeatureAssociation",
    "biolink:Food",
    "biolink:FoodAdditive",
    "biolink:FrequencyQualifierMixin",
    "biolink:FrequencyQuantifier",
    "biolink:FrequencyValue",
    "biolink:FunctionalAssociation",
    "biolink:Gene",
    "biolink:GeneAsAModelOfDiseaseAssociation",
    "biolink:GeneExpressionMixin",
    "biolink:GeneFamily",
    "biolink:GeneGroupingMixin",
    "biolink:GeneHasVariantThatContributesToDiseaseAssociation",
    "biolink:GeneOntologyClass",
    "biolink:GeneOrGeneProduct",
    "biolink:GeneProductIsoformMixin",
    "biolink:GeneProductMixin",
    "biolink:GeneRegulatoryRelationship",
    "biolink:GeneToDiseaseAssociation",
    "biolink:GeneToEntityAssociationMixin",
    "biolink:GeneToExpressionSiteAssociation",
    "biolink:GeneToGeneAssociation",
    "biolink:GeneToGeneCoexpressionAssociation",
    "biolink:GeneToGeneHomologyAssociation",
    "biolink:GeneToGeneProductRelationship",
    "biolink:GeneToGoTermAssociation",
    "biolink:GeneToPhenotypicFeatureAssociation",
    "biolink:Genome",
    "biolink:GenomicBackgroundExposure",
    "biolink:GenomicEntity",
    "biolink:GenomicSequenceLocalization",
    "biolink:Genotype",
    "biolink:GenotypeAsAModelOfDiseaseAssociation",
    "biolink:GenotypeToDiseaseAssociation",
    "biolink:GenotypeToEntityAssociationMixin",
    "biolink:GenotypeToGeneAssociation",
    "biolink:GenotypeToGenotypePartAssociation",
    "biolink:GenotypeToPhenotypicFeatureAssociation",
    "biolink:GenotypeToVariantAssociation",
    "biolink:GenotypicSex",
    "biolink:GeographicExposure",
    "biolink:GeographicLocation",
    "biolink:GeographicLocationAtTime",
    "biolink:GrossAnatomicalStructure",
    "biolink:Haplotype",
    "biolink:Hospitalization",
    "biolink:HospitalizationOutcome",
    "biolink:IndividualOrganism",
    "biolink:InformationContentEntity",
    "biolink:InformationResource",
    "biolink:Inheritance",
    "biolink:IriType",
    "biolink:LabelType",
    "biolink:LifeStage",
    "biolink:MacromolecularComplexMixin",
    "biolink:MacromolecularMachineMixin",
    "biolink:MacromolecularMachineToBiologicalProcessAssociation",
    "biolink:MacromolecularMachineToCellularComponentAssociation",
    "biolink:MacromolecularMachineToEntityAssociationMixin",
    "biolink:MacromolecularMachineToMolecularActivityAssociation",
    "biolink:Macronutrient",
    "biolink:MaterialSample",
    "biolink:MaterialSampleDerivationAssociation",
    "biolink:MaterialSampleToDiseaseOrPhenotypicFeatureAssociation",
    "biolink:MaterialSampleToEntityAssociationMixin",
    "biolink:MicroRNA",
    "biolink:Micronutrient",
    "biolink:ModelToDiseaseAssociationMixin",
    "biolink:MolecularActivity",
    "biolink:MolecularEntity",
    "biolink:MolecularMixture",
    "biolink:MortalityOutcome",
    "biolink:NamedThing",
    "biolink:NamedThingToInformationContentEntityAssociation",
    "biolink:NarrativeText",
    "biolink:NoncodingRNAProduct",
    "biolink:NucleicAcidEntity",
    "biolink:Nutrient",
    "biolink:Occurrent",
    "biolink:Onset",
    "biolink:OntologyClass",
    "biolink:OrganismAttribute",
    "biolink:OrganismTaxon",
    "biolink:OrganismTaxonToEntityAssociation",
    "biolink:OrganismTaxonToEnvironmentAssociation",
    "biolink:OrganismTaxonToOrganismTaxonAssociation",
    "biolink:OrganismTaxonToOrganismTaxonInteraction",
    "biolink:OrganismTaxonToOrganismTaxonSpecialization",
    "biolink:OrganismToOrganismAssociation",
    "biolink:OrganismalEntity",
    "biolink:OrganismalEntityAsAModelOfDiseaseAssociation",
    "biolink:Outcome",
    "biolink:PairwiseGeneToGeneInteraction",
    "biolink:PairwiseMolecularInteraction",
    "biolink:PathognomonicityQuantifier",
    "biolink:PathologicalAnatomicalExposure",
    "biolink:PathologicalAnatomicalOutcome",
    "biolink:PathologicalAnatomicalStructure",
    "biolink:PathologicalEntityMixin",
    "biolink:PathologicalProcess",
    "biolink:PathologicalProcessExposure",
    "biolink:PathologicalProcessOutcome",
    "biolink:Pathway",
    "biolink:PercentageFrequencyValue",
    "biolink:Phenomenon",
    "biolink:PhenotypicFeature",
    "biolink:PhenotypicQuality",
    "biolink:PhenotypicSex",
    "biolink:PhysicalEntity",
    "biolink:PhysicalEssence",
    "biolink:PhysicalEssenceOrOccurrent",
    "biolink:PhysiologicalProcess",
    "biolink:PlanetaryEntity",
    "biolink:Polypeptide",
    "biolink:PopulationOfIndividualOrganisms",
    "biolink:PopulationToPopulationAssociation",
    "biolink:PredicateType",
    "biolink:Procedure",
    "biolink:ProcessedMaterial",
    "biolink:Protein",
    "biolink:ProteinIsoform",
    "biolink:Publication",
    "biolink:QuantityValue",
    "biolink:Quotient",
    "biolink:RNAProduct",
    "biolink:RNAProductIsoform",
    "biolink:ReactionToCatalystAssociation",
    "biolink:ReactionToParticipantAssociation",
    "biolink:ReagentTargetedGene",
    "biolink:RelationshipQuantifier",
    "biolink:RelationshipType",
    "biolink:SensitivityQuantifier",
    "biolink:SequenceAssociation",
    "biolink:SequenceFeatureRelationship",
    "biolink:SequenceVariant",
    "biolink:SequenceVariantModulatesTreatmentAssociation",
    "biolink:Serial",
    "biolink:SeverityValue",
    "biolink:SiRNA",
    "biolink:SmallMolecule",
    "biolink:Snv",
    "biolink:SocioeconomicAttribute",
    "biolink:SocioeconomicExposure",
    "biolink:SocioeconomicOutcome",
    "biolink:SpecificityQuantifier",
    "biolink:StudyPopulation",
    "biolink:SubjectOfInvestigation",
    "biolink:SymbolType",
    "biolink:TaxonToTaxonAssociation",
    "biolink:TaxonomicRank",
    "biolink:ThingWithTaxon",
    "biolink:TimeType",
    "biolink:Transcript",
    "biolink:TranscriptToGeneRelationship",
    "biolink:Treatment",
    "biolink:UnclassifiedOntologyClass",
    "biolink:Unit",
    "biolink:VariantAsAModelOfDiseaseAssociation",
    "biolink:VariantToDiseaseAssociation",
    "biolink:VariantToEntityAssociationMixin",
    "biolink:VariantToGeneAssociation",
    "biolink:VariantToGeneExpressionAssociation",
    "biolink:VariantToPhenotypicFeatureAssociation",
    "biolink:VariantToPopulationAssociation",
    "biolink:Vitamin",
    "biolink:Zygosity",
    "biolink:abundance_affected_by",
    "biolink:abundance_decreased_by",
    "biolink:abundance_increased_by",
    "biolink:active_in",
    "biolink:actively_involved_in",
    "biolink:actively_involves",
    "biolink:activity_affected_by",
    "biolink:activity_affects",
    "biolink:activity_decreased_by",
    "biolink:activity_increased_by",
    "biolink:acts_upstream_of",
    "biolink:acts_upstream_of_negative_effect",
    "biolink:acts_upstream_of_or_within",
    "biolink:acts_upstream_of_or_within_negative_effect",
    "biolink:acts_upstream_of_or_within_positive_effect",
    "biolink:acts_upstream_of_positive_effect",
    "biolink:address",
    "biolink:adverse_event_caused_by",
    "biolink:affected_by",
    "biolink:affects",
    "biolink:affects_abundance_of",
    "biolink:affects_activity_of",
    "biolink:affects_degradation_of",
    "biolink:affects_expression_in",
    "biolink:affects_expression_of",
    "biolink:affects_folding_of",
    "biolink:affects_localization_of",
    "biolink:affects_metabolic_processing_of",
    "biolink:affects_molecular_modification_of",
    "biolink:affects_mutation_rate_of",
    "biolink:affects_response_to",
    "biolink:affects_risk_for",
    "biolink:affects_secretion_of",
    "biolink:affects_splicing_of",
    "biolink:affects_stability_of",
    "biolink:affects_synthesis_of",
    "biolink:affects_transport_of",
    "biolink:affects_uptake_of",
    "biolink:affiliation",
    "biolink:aggregate_statistic",
    "biolink:aggregator_knowledge_source",
    "biolink:ameliorates",
    "biolink:approved_for_treatment_by",
    "biolink:approved_to_treat",
    "biolink:associated_environmental_context",
    "biolink:association_slot",
    "biolink:association_type",
    "biolink:author",
    "biolink:authors",
    "biolink:base_coordinate",
    "biolink:biological_role_mixin",
    "biolink:biomarker_for",
    "biolink:broad_match",
    "biolink:capable_of",
    "biolink:catalyst_qualifier",
    "biolink:catalyzes",
    "biolink:category",
    "biolink:caused_by",
    "biolink:causes",
    "biolink:causes_adverse_event",
    "biolink:chapter",
    "biolink:chemical_role_mixin",
    "biolink:chemically_interacts_with",
    "biolink:chemically_similar_to",
    "biolink:chi_squared_statistic",
    "biolink:clinical_modifier_qualifier",
    "biolink:close_match",
    "biolink:coexists_with",
    "biolink:coexpressed_with",
    "biolink:colocalizes_with",
    "biolink:completed_by",
    "biolink:condition_associated_with_gene",
    "biolink:consumed_by",
    "biolink:consumes",
    "biolink:contraindicated_for",
    "biolink:contributes_to",
    "biolink:contribution_from",
    "biolink:contributor",
    "biolink:correlated_with",
    "biolink:created_with",
    "biolink:creation_date",
    "biolink:dataset_download_url",
    "biolink:decreased_amount_in",
    "biolink:decreases_abundance_of",
    "biolink:decreases_activity_of",
    "biolink:decreases_degradation_of",
    "biolink:decreases_expression_of",
    "biolink:decreases_folding_of",
    "biolink:decreases_localization_of",
    "biolink:decreases_metabolic_processing_of",
    "biolink:decreases_molecular_interaction",
    "biolink:decreases_molecular_modification_of",
    "biolink:decreases_mutation_rate_of",
    "biolink:decreases_response_to",
    "biolink:decreases_secretion_of",
    "biolink:decreases_splicing_of",
    "biolink:decreases_stability_of",
    "biolink:decreases_synthesis_of",
    "biolink:decreases_transport_of",
    "biolink:decreases_uptake_of",
    "biolink:degradation_affected_by",
    "biolink:degradation_decreased_by",
    "biolink:degradation_increased_by",
    "biolink:derives_from",
    "biolink:derives_into",
    "biolink:description",
    "biolink:develops_from",
    "biolink:develops_into",
    "biolink:directly_interacts_with",
    "biolink:disease_has_basis_in",
    "biolink:disrupted_by",
    "biolink:disrupts",
    "biolink:distribution_download_url",
    "biolink:download_url",
    "biolink:editor",
    "biolink:enabled_by",
    "biolink:enables",
    "biolink:end_coordinate",
    "biolink:end_interbase_coordinate",
    "biolink:entity_negatively_regulated_by_entity",
    "biolink:entity_negatively_regulates_entity",
    "biolink:entity_positively_regulated_by_entity",
    "biolink:entity_positively_regulates_entity",
    "biolink:entity_regulated_by_entity",
    "biolink:entity_regulates_entity",
    "biolink:exacerbates",
    "biolink:exact_match",
    "biolink:expressed_in",
    "biolink:expresses",
    "biolink:expression_affected_by",
    "biolink:expression_decreased_by",
    "biolink:expression_increased_by",
    "biolink:expression_site",
    "biolink:filler",
    "biolink:folding_affected_by",
    "biolink:folding_decreased_by",
    "biolink:folding_increased_by",
    "biolink:food_component_of",
    "biolink:format",
    "biolink:frequency_qualifier",
    "biolink:full_name",
    "biolink:gene_associated_with_condition",
    "biolink:gene_product_of",
    "biolink:genetic_association",
    "biolink:genetically_interacts_with",
    "biolink:genome_build",
    "biolink:has_active_ingredient",
    "biolink:has_attribute",
    "biolink:has_attribute_type",
    "biolink:has_biological_sequence",
    "biolink:has_biomarker",
    "biolink:has_chemical_formula",
    "biolink:has_completed",
    "biolink:has_confidence_level",
    "biolink:has_constituent",
    "biolink:has_contraindication",
    "biolink:has_count",
    "biolink:has_dataset",
    "biolink:has_decreased_amount",
    "biolink:has_device",
    "biolink:has_distribution",
    "biolink:has_drug",
    "biolink:has_evidence",
    "biolink:has_excipient",
    "biolink:has_food_component",
    "biolink:has_frameshift_variant",
    "biolink:has_gene",
    "biolink:has_gene_or_gene_product",
    "biolink:has_gene_product",
    "biolink:has_increased_amount",
    "biolink:has_input",
    "biolink:has_manifestation",
    "biolink:has_metabolite",
    "biolink:has_missense_variant",
    "biolink:has_molecular_consequence",
    "biolink:has_nearby_variant",
    "biolink:has_negative_upstream_actor",
    "biolink:has_negative_upstream_or_within_actor",
    "biolink:has_non_coding_variant",
    "biolink:has_nonsense_variant",
    "biolink:has_not_completed",
    "biolink:has_numeric_value",
    "biolink:has_nutrient",
    "biolink:has_output",
    "biolink:has_part",
    "biolink:has_participant",
    "biolink:has_percentage",
    "biolink:has_phenotype",
    "biolink:has_population_context",
    "biolink:has_positive_upstream_actor",
    "biolink:has_positive_upstream_or_within_actor",
    "biolink:has_procedure",
    "biolink:has_qualitative_value",
    "biolink:has_quantitative_value",
    "biolink:has_quotient",
    "biolink:has_receptor",
    "biolink:has_route",
    "biolink:has_sequence_location",
    "biolink:has_sequence_variant",
    "biolink:has_splice_site_variant",
    "biolink:has_stressor",
    "biolink:has_substrate",
    "biolink:has_synonymous_variant",
    "biolink:has_temporal_context",
    "biolink:has_topic",
    "biolink:has_total",
    "biolink:has_unit",
    "biolink:has_upstream_actor",
    "biolink:has_upstream_or_within_actor",
    "biolink:has_variant_part",
    "biolink:has_zygosity",
    "biolink:homologous_to",
    "biolink:id",
    "biolink:in_cell_population_with",
    "biolink:in_complex_with",
    "biolink:in_linkage_disequilibrium_with",
    "biolink:in_pathway_with",
    "biolink:in_taxon",
    "biolink:increased_amount_of",
    "biolink:increases_abundance_of",
    "biolink:increases_activity_of",
    "biolink:increases_degradation_of",
    "biolink:increases_expression_of",
    "biolink:increases_folding_of",
    "biolink:increases_localization_of",
    "biolink:increases_metabolic_processing_of",
    "biolink:increases_molecular_interaction",
    "biolink:increases_molecular_modification_of",
    "biolink:increases_mutation_rate_of",
    "biolink:increases_response_to",
    "biolink:increases_secretion_of",
    "biolink:increases_splicing_of",
    "biolink:increases_stability_of",
    "biolink:increases_synthesis_of",
    "biolink:increases_transport_of",
    "biolink:increases_uptake_of",
    "biolink:ingest_date",
    "biolink:interacting_molecules_category",
    "biolink:interacts_with",
    "biolink:interbase_coordinate",
    "biolink:iri",
    "biolink:is_active_ingredient_of",
    "biolink:is_catalyst_of",
    "biolink:is_excipient_of",
    "biolink:is_frameshift_variant_of",
    "biolink:is_input_of",
    "biolink:is_metabolite",
    "biolink:is_metabolite_of",
    "biolink:is_missense_variant_of",
    "biolink:is_molecular_consequence_of",
    "biolink:is_nearby_variant_of",
    "biolink:is_non_coding_variant_of",
    "biolink:is_nonsense_variant_of",
    "biolink:is_output_of",
    "biolink:is_sequence_variant_of",
    "biolink:is_splice_site_variant_of",
    "biolink:is_substrate_of",
    "biolink:is_synonymous_variant_of",
    "biolink:iso_abbreviation",
    "biolink:issue",
    "biolink:keywords",
    "biolink:knowledge_source",
    "biolink:lacks_part",
    "biolink:latitude",
    "biolink:license",
    "biolink:localization_affected_by",
    "biolink:localization_decreased_by",
    "biolink:localization_increased_by",
    "biolink:located_in",
    "biolink:location_of",
    "biolink:logical_interpretation",
    "biolink:longitude",
    "biolink:manifestation_of",
    "biolink:mentions",
    "biolink:mesh_terms",
    "biolink:metabolic_processing_affected_by",
    "biolink:metabolic_processing_decreased_by",
    "biolink:metabolic_processing_increased_by",
    "biolink:missing_from",
    "biolink:model_of",
    "biolink:models",
    "biolink:molecular_interaction_decreased_by",
    "biolink:molecular_interaction_increased_by",
    "biolink:molecular_modification_affected_by",
    "biolink:molecular_modification_decreased_by",
    "biolink:molecular_modification_increased_by",
    "biolink:molecularly_interacts_with",
    "biolink:mutation_rate_affected_by",
    "biolink:mutation_rate_decreased_by",
    "biolink:mutation_rate_increased_by",
    "biolink:name",
    "biolink:narrow_match",
    "biolink:negated",
    "biolink:negatively_correlated_with",
    "biolink:negatively_regulated_by",
    "biolink:negatively_regulates",
    "biolink:node_property",
    "biolink:not_completed_by",
    "biolink:nutrient_of",
    "biolink:object",
    "biolink:occurs_in",
    "biolink:onset_qualifier",
    "biolink:opposite_of",
    "biolink:original_knowledge_source",
    "biolink:orthologous_to",
    "biolink:overlaps",
    "biolink:p_value",
    "biolink:pages",
    "biolink:paralogous_to",
    "biolink:part_of",
    "biolink:participates_in",
    "biolink:phase",
    "biolink:phenotype_of",
    "biolink:phenotypic_state",
    "biolink:physically_interacts_with",
    "biolink:positively_correlated_with",
    "biolink:positively_regulated_by",
    "biolink:positively_regulates",
    "biolink:preceded_by",
    "biolink:precedes",
    "biolink:predicate",
    "biolink:predisposes",
    "biolink:prevented_by",
    "biolink:prevents",
    "biolink:primary_knowledge_source",
    "biolink:process_negatively_regulated_by_process",
    "biolink:process_negatively_regulates_process",
    "biolink:process_positively_regulated_by_process",
    "biolink:process_positively_regulates_process",
    "biolink:process_regulated_by_process",
    "biolink:process_regulates_process",
    "biolink:produced_by",
    "biolink:produces",
    "biolink:provided_by",
    "biolink:provider",
    "biolink:publications",
    "biolink:published_in",
    "biolink:publisher",
    "biolink:qualifiers",
    "biolink:quantifier_qualifier",
    "biolink:reaction_balanced",
    "biolink:reaction_direction",
    "biolink:reaction_side",
    "biolink:regulated_by",
    "biolink:regulates",
    "biolink:related_condition",
    "biolink:related_to",
    "biolink:relation",
    "biolink:response_affected_by",
    "biolink:response_decreased_by",
    "biolink:response_increased_by",
    "biolink:retrieved_on",
    "biolink:rights",
    "biolink:risk_affected_by",
    "biolink:same_as",
    "biolink:secretion_affected_by",
    "biolink:secretion_decreased_by",
    "biolink:secretion_increased_by",
    "biolink:sequence_localization_attribute",
    "biolink:sequence_location_of",
    "biolink:sequence_variant_qualifier",
    "biolink:severity_qualifier",
    "biolink:sex_qualifier",
    "biolink:similar_to",
    "biolink:source",
    "biolink:source_logo",
    "biolink:source_web_page",
    "biolink:splicing_affected_by",
    "biolink:splicing_decreased_by",
    "biolink:splicing_increased_by",
    "biolink:stability_affected_by",
    "biolink:stability_decreased_by",
    "biolink:stability_increased_by",
    "biolink:stage_qualifier",
    "biolink:start_coordinate",
    "biolink:start_interbase_coordinate",
    "biolink:stoichiometry",
    "biolink:strand",
    "biolink:subclass_of",
    "biolink:subject",
    "biolink:summary",
    "biolink:superclass_of",
    "biolink:supporting_data_source",
    "biolink:symbol",
    "biolink:synonym",
    "biolink:synthesis_decreased_by",
    "biolink:synthesis_increased_by",
    "biolink:systematic_synonym",
    "biolink:sythesis_affected_by",
    "biolink:temporally_related_to",
    "biolink:timepoint",
    "biolink:transcribed_from",
    "biolink:transcribed_to",
    "biolink:translates_to",
    "biolink:translation_of",
    "biolink:transport_affected_by",
    "biolink:transport_decreased_by",
    "biolink:transport_increased_by",
    "biolink:treated_by",
    "biolink:treats",
    "biolink:type",
    "biolink:update_date",
    "biolink:uptake_affected_by",
    "biolink:uptake_decreased_by",
    "biolink:uptake_increased_by",
    "biolink:variant_part_of",
    "biolink:version",
    "biolink:version_of",
    "biolink:volume",
    "biolink:xenologous_to",
    "biolink:xref",
    "https://w3id.org/linkml/Boolean",
    "https://w3id.org/linkml/ClassDefinition",
    "https://w3id.org/linkml/Date",
    "https://w3id.org/linkml/Datetime",
    "https://w3id.org/linkml/Decimal",
    "https://w3id.org/linkml/Double",
    "https://w3id.org/linkml/Float",
    "https://w3id.org/linkml/Integer",
    "https://w3id.org/linkml/Ncname",
    "https://w3id.org/linkml/Nodeidentifier",
    "https://w3id.org/linkml/Objectidentifier",
    "https://w3id.org/linkml/SlotDefinition",
    "https://w3id.org/linkml/String",
    "https://w3id.org/linkml/SubsetDefinition",
    "https://w3id.org/linkml/Time",
    "https://w3id.org/linkml/TypeDefinition",
    "https://w3id.org/linkml/Uri",
    "https://w3id.org/linkml/Uriorcurie",
    "https://w3id.org/linkml/agent_id",
    "https://w3id.org/linkml/agent_name",
    "https://w3id.org/linkml/anatomical_entity_to_anatomical_entity_association_object",
    "https://w3id.org/linkml/anatomical_entity_to_anatomical_entity_association_subject",
    "https://w3id.org/linkml/anatomical_entity_to_anatomical_entity_ontogenic_association_object",
    "https://w3id.org/linkml/anatomical_entity_to_anatomical_entity_ontogenic_association_predicate",
    "https://w3id.org/linkml/anatomical_entity_to_anatomical_entity_ontogenic_association_subject",
    "https://w3id.org/linkml/anatomical_entity_to_anatomical_entity_part_of_association_object",
    "https://w3id.org/linkml/anatomical_entity_to_anatomical_entity_part_of_association_predicate",
    "https://w3id.org/linkml/anatomical_entity_to_anatomical_entity_part_of_association_subject",
    "https://w3id.org/linkml/article_iso_abbreviation",
    "https://w3id.org/linkml/article_published_in",
    "https://w3id.org/linkml/association_category",
    "https://w3id.org/linkml/association_type",
    "https://w3id.org/linkml/attribute_name",
    "https://w3id.org/linkml/behavior_to_behavioral_feature_association_object",
    "https://w3id.org/linkml/behavior_to_behavioral_feature_association_subject",
    "https://w3id.org/linkml/book_chapter_published_in",
    "https://w3id.org/linkml/book_id",
    "https://w3id.org/linkml/book_type",
    "https://w3id.org/linkml/case_to_entity_association_mixin_subject",
    "https://w3id.org/linkml/cell_line_as_a_model_of_disease_association_subject",
    "https://w3id.org/linkml/cell_line_to_disease_or_phenotypic_feature_association_subject",
    "https://w3id.org/linkml/cell_line_to_entity_association_mixin_subject",
    "https://w3id.org/linkml/chemical_entity_to_entity_association_mixin_subject",
    "https://w3id.org/linkml/chemical_to_chemical_association_object",
    "https://w3id.org/linkml/chemical_to_chemical_derivation_association_catalyst_qualifier",
    "https://w3id.org/linkml/chemical_to_chemical_derivation_association_object",
    "https://w3id.org/linkml/chemical_to_chemical_derivation_association_predicate",
    "https://w3id.org/linkml/chemical_to_chemical_derivation_association_subject",
    "https://w3id.org/linkml/chemical_to_disease_or_phenotypic_feature_association_object",
    "https://w3id.org/linkml/chemical_to_entity_association_mixin_subject",
    "https://w3id.org/linkml/chemical_to_gene_association_object",
    "https://w3id.org/linkml/chemical_to_pathway_association_object",
    "https://w3id.org/linkml/clinical_finding_has_attribute",
    "https://w3id.org/linkml/clinical_measurement_has_attribute_type",
    "https://w3id.org/linkml/contributor_association_object",
    "https://w3id.org/linkml/contributor_association_predicate",
    "https://w3id.org/linkml/contributor_association_qualifiers",
    "https://w3id.org/linkml/contributor_association_subject",
    "https://w3id.org/linkml/disease_or_phenotypic_feature_to_entity_association_mixin_subject",
    "https://w3id.org/linkml/disease_or_phenotypic_feature_to_location_association_object",
    "https://w3id.org/linkml/disease_to_entity_association_mixin_subject",
    "https://w3id.org/linkml/drug_to_entity_association_mixin_subject",
    "https://w3id.org/linkml/drug_to_gene_association_object",
    "https://w3id.org/linkml/entity_to_disease_association_mixin_object",
    "https://w3id.org/linkml/entity_to_disease_or_phenotypic_feature_association_mixin_object",
    "https://w3id.org/linkml/entity_to_exposure_event_association_mixin_object",
    "https://w3id.org/linkml/entity_to_outcome_association_mixin_object",
    "https://w3id.org/linkml/entity_to_phenotypic_feature_association_mixin_description",
    "https://w3id.org/linkml/entity_to_phenotypic_feature_association_mixin_object",
    "https://w3id.org/linkml/exon_to_transcript_relationship_object",
    "https://w3id.org/linkml/exon_to_transcript_relationship_subject",
    "https://w3id.org/linkml/exposure_event_to_entity_association_mixin_subject",
    "https://w3id.org/linkml/exposure_event_to_phenotypic_feature_association_subject",
    "https://w3id.org/linkml/functional_association_object",
    "https://w3id.org/linkml/functional_association_subject",
    "https://w3id.org/linkml/gene_as_a_model_of_disease_association_subject",
    "https://w3id.org/linkml/gene_expression_mixin_quantifier_qualifier",
    "https://w3id.org/linkml/gene_has_variant_that_contributes_to_disease_association_subject",
    "https://w3id.org/linkml/gene_regulatory_relationship_object",
    "https://w3id.org/linkml/gene_regulatory_relationship_predicate",
    "https://w3id.org/linkml/gene_regulatory_relationship_subject",
    "https://w3id.org/linkml/gene_to_disease_association_subject",
    "https://w3id.org/linkml/gene_to_entity_association_mixin_subject",
    "https://w3id.org/linkml/gene_to_expression_site_association_object",
    "https://w3id.org/linkml/gene_to_expression_site_association_predicate",
    "https://w3id.org/linkml/gene_to_expression_site_association_quantifier_qualifier",
    "https://w3id.org/linkml/gene_to_expression_site_association_stage_qualifier",
    "https://w3id.org/linkml/gene_to_expression_site_association_subject",
    "https://w3id.org/linkml/gene_to_gene_association_object",
    "https://w3id.org/linkml/gene_to_gene_association_subject",
    "https://w3id.org/linkml/gene_to_gene_coexpression_association_predicate",
    "https://w3id.org/linkml/gene_to_gene_homology_association_predicate",
    "https://w3id.org/linkml/gene_to_gene_product_relationship_object",
    "https://w3id.org/linkml/gene_to_gene_product_relationship_predicate",
    "https://w3id.org/linkml/gene_to_gene_product_relationship_subject",
    "https://w3id.org/linkml/gene_to_go_term_association_object",
    "https://w3id.org/linkml/gene_to_go_term_association_subject",
    "https://w3id.org/linkml/gene_to_phenotypic_feature_association_subject",
    "https://w3id.org/linkml/genomic_sequence_localization_object",
    "https://w3id.org/linkml/genomic_sequence_localization_predicate",
    "https://w3id.org/linkml/genomic_sequence_localization_subject",
    "https://w3id.org/linkml/genotype_as_a_model_of_disease_association_subject",
    "https://w3id.org/linkml/genotype_to_disease_association_object",
    "https://w3id.org/linkml/genotype_to_disease_association_predicate",
    "https://w3id.org/linkml/genotype_to_disease_association_subject",
    "https://w3id.org/linkml/genotype_to_entity_association_mixin_subject",
    "https://w3id.org/linkml/genotype_to_gene_association_object",
    "https://w3id.org/linkml/genotype_to_gene_association_predicate",
    "https://w3id.org/linkml/genotype_to_gene_association_subject",
    "https://w3id.org/linkml/genotype_to_genotype_part_association_object",
    "https://w3id.org/linkml/genotype_to_genotype_part_association_predicate",
    "https://w3id.org/linkml/genotype_to_genotype_part_association_subject",
    "https://w3id.org/linkml/genotype_to_phenotypic_feature_association_predicate",
    "https://w3id.org/linkml/genotype_to_phenotypic_feature_association_subject",
    "https://w3id.org/linkml/genotype_to_variant_association_object",
    "https://w3id.org/linkml/genotype_to_variant_association_predicate",
    "https://w3id.org/linkml/genotype_to_variant_association_subject",
    "https://w3id.org/linkml/has_taxonomic_rank",
    "https://w3id.org/linkml/macromolecular_machine_mixin_name",
    "https://w3id.org/linkml/macromolecular_machine_to_biological_process_association_object",
    "https://w3id.org/linkml/macromolecular_machine_to_cellular_component_association_object",
    "https://w3id.org/linkml/macromolecular_machine_to_entity_association_mixin_subject",
    "https://w3id.org/linkml/macromolecular_machine_to_molecular_activity_association_object",
    "https://w3id.org/linkml/material_sample_derivation_association_object",
    "https://w3id.org/linkml/material_sample_derivation_association_predicate",
    "https://w3id.org/linkml/material_sample_derivation_association_subject",
    "https://w3id.org/linkml/material_sample_to_entity_association_mixin_subject",
    "https://w3id.org/linkml/mixin",
    "https://w3id.org/linkml/model_to_disease_association_mixin_predicate",
    "https://w3id.org/linkml/model_to_disease_association_mixin_subject",
    "https://w3id.org/linkml/molecular_activity_enabled_by",
    "https://w3id.org/linkml/molecular_activity_has_input",
    "https://w3id.org/linkml/molecular_activity_has_output",
    "https://w3id.org/linkml/named_thing_category",
    "https://w3id.org/linkml/named_thing_to_information_content_entity_association_object",
    "https://w3id.org/linkml/named_thing_to_information_content_entity_association_predicate",
    "https://w3id.org/linkml/named_thing_to_information_content_entity_association_subject",
    "https://w3id.org/linkml/organism_taxon_has_taxonomic_rank",
    "https://w3id.org/linkml/organism_taxon_subclass_of",
    "https://w3id.org/linkml/organism_taxon_to_entity_association_subject",
    "https://w3id.org/linkml/organism_taxon_to_environment_association_object",
    "https://w3id.org/linkml/organism_taxon_to_environment_association_predicate",
    "https://w3id.org/linkml/organism_taxon_to_environment_association_subject",
    "https://w3id.org/linkml/organism_taxon_to_organism_taxon_association_object",
    "https://w3id.org/linkml/organism_taxon_to_organism_taxon_association_subject",
    "https://w3id.org/linkml/organism_taxon_to_organism_taxon_interaction_associated_environmental_context",
    "https://w3id.org/linkml/organism_taxon_to_organism_taxon_interaction_object",
    "https://w3id.org/linkml/organism_taxon_to_organism_taxon_interaction_predicate",
    "https://w3id.org/linkml/organism_taxon_to_organism_taxon_interaction_subject",
    "https://w3id.org/linkml/organism_taxon_to_organism_taxon_specialization_object",
    "https://w3id.org/linkml/organism_taxon_to_organism_taxon_specialization_predicate",
    "https://w3id.org/linkml/organism_taxon_to_organism_taxon_specialization_subject",
    "https://w3id.org/linkml/organism_to_organism_association_object",
    "https://w3id.org/linkml/organism_to_organism_association_relation",
    "https://w3id.org/linkml/organism_to_organism_association_subject",
    "https://w3id.org/linkml/organismal_entity_as_a_model_of_disease_association_subject",
    "https://w3id.org/linkml/organismal_entity_has_attribute",
    "https://w3id.org/linkml/pairwise_gene_to_gene_interaction_predicate",
    "https://w3id.org/linkml/pairwise_gene_to_gene_interaction_relation",
    "https://w3id.org/linkml/pairwise_molecular_interaction_id",
    "https://w3id.org/linkml/pairwise_molecular_interaction_object",
    "https://w3id.org/linkml/pairwise_molecular_interaction_predicate",
    "https://w3id.org/linkml/pairwise_molecular_interaction_relation",
    "https://w3id.org/linkml/pairwise_molecular_interaction_subject",
    "https://w3id.org/linkml/population_to_population_association_object",
    "https://w3id.org/linkml/population_to_population_association_predicate",
    "https://w3id.org/linkml/population_to_population_association_subject",
    "https://w3id.org/linkml/publication_id",
    "https://w3id.org/linkml/publication_name",
    "https://w3id.org/linkml/publication_pages",
    "https://w3id.org/linkml/publication_type",
    "https://w3id.org/linkml/reaction_to_catalyst_association_object",
    "https://w3id.org/linkml/reaction_to_participant_association_subject",
    "https://w3id.org/linkml/sequence_feature_relationship_object",
    "https://w3id.org/linkml/sequence_feature_relationship_subject",
    "https://w3id.org/linkml/sequence_variant_has_biological_sequence",
    "https://w3id.org/linkml/sequence_variant_has_gene",
    "https://w3id.org/linkml/sequence_variant_id",
    "https://w3id.org/linkml/sequence_variant_modulates_treatment_association_object",
    "https://w3id.org/linkml/sequence_variant_modulates_treatment_association_subject",
    "https://w3id.org/linkml/serial_id",
    "https://w3id.org/linkml/serial_type",
    "https://w3id.org/linkml/small_molecule_id",
    "https://w3id.org/linkml/socioeconomic_exposure_has_attribute",
    "https://w3id.org/linkml/taxon_to_taxon_association_object",
    "https://w3id.org/linkml/taxon_to_taxon_association_relation",
    "https://w3id.org/linkml/taxon_to_taxon_association_subject",
    "https://w3id.org/linkml/topValue",
    "https://w3id.org/linkml/transcript_to_gene_relationship_object",
    "https://w3id.org/linkml/transcript_to_gene_relationship_subject",
    "https://w3id.org/linkml/variant_as_a_model_of_disease_association_subject",
    "https://w3id.org/linkml/variant_to_disease_association_object",
    "https://w3id.org/linkml/variant_to_disease_association_predicate",
    "https://w3id.org/linkml/variant_to_disease_association_subject",
    "https://w3id.org/linkml/variant_to_entity_association_mixin_subject",
    "https://w3id.org/linkml/variant_to_gene_association_object",
    "https://w3id.org/linkml/variant_to_gene_association_predicate",
    "https://w3id.org/linkml/variant_to_gene_expression_association_predicate",
    "https://w3id.org/linkml/variant_to_phenotypic_feature_association_subject",
    "https://w3id.org/linkml/variant_to_population_association_has_count",
    "https://w3id.org/linkml/variant_to_population_association_has_quotient",
    "https://w3id.org/linkml/variant_to_population_association_has_total",
    "https://w3id.org/linkml/variant_to_population_association_object",
    "https://w3id.org/linkml/variant_to_population_association_subject"
]

This is important because, currently, the kg2_util validator checks if a category is in biolink_categories_ontology_depths OR in biolink_ont.nodes(). Thus, mixins like biolink:OntologyClass are getting past the validator.
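To make the problem concrete, here is a minimal sketch of the two checks. The dictionary and set below are toy stand-ins for `biolink_categories_ontology_depths` and `biolink_ont.nodes()`, and both function names are hypothetical; this is not the actual kg2_util code.

```python
# Toy stand-in for biolink_categories_ontology_depths (category -> depth).
biolink_categories_ontology_depths = {
    "biolink:NamedThing": 0,
    "biolink:BiologicalEntity": 1,
    "biolink:Gene": 2,
}

# Toy stand-in for biolink_ont.nodes(): every node in the model,
# which also includes mixins and slots, not only categories.
all_model_nodes = set(biolink_categories_ontology_depths) | {
    "biolink:OntologyClass",  # a mixin, not a category
    "biolink:related_to",     # a slot, not a category
}


def is_valid_category_permissive(category: str) -> bool:
    """Hypothetical mirror of the current check: depth map OR any model node."""
    return (category in biolink_categories_ontology_depths
            or category in all_model_nodes)


def is_valid_category_strict(category: str) -> bool:
    """Hypothetical stricter check: only entries in the category depth map."""
    return category in biolink_categories_ontology_depths


for candidate in ("biolink:Gene", "biolink:OntologyClass"):
    print(candidate,
          is_valid_category_permissive(candidate),
          is_valid_category_strict(candidate))
```

With the permissive check, the mixin is accepted; with the strict check, it is rejected while real categories still pass.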

ecwood commented 3 years ago

Linking #70 here, which relates to Biolink 2.0 and will increase the complexity of this upgrade.

ecwood commented 3 years ago

Here's the category tree for Biolink 2.1.0:

{
    "named thing": [
        "activity",
        "administrative entity",
        "biological entity",
        "chemical entity",
        "clinical entity",
        "device",
        "event",
        "information content entity",
        "organism taxon",
        "phenomenon",
        "physical entity",
        "planetary entity",
        "procedure",
        "treatment"
    ],
    "chemical entity": [
        "chemical mixture",
        "environmental food contaminant",
        "food additive",
        "molecular entity",
        "nutrient"
    ],
    "clinical entity": [
        "clinical intervention",
        "clinical trial"
    ],
    "biological entity": [
        "biological process or activity",
        "disease or phenotypic feature",
        "gene",
        "gene family",
        "genome",
        "genotype",
        "haplotype",
        "organismal entity",
        "polypeptide",
        "reagent targeted gene",
        "sequence variant"
    ],
    "administrative entity": [
        "agent"
    ],
    "planetary entity": [
        "environmental feature",
        "environmental process",
        "geographic location"
    ],
    "information content entity": [
        "confidence level",
        "dataset",
        "dataset distribution",
        "dataset summary",
        "dataset version",
        "evidence type",
        "information resource",
        "publication"
    ],
    "physical entity": [
        "material sample"
    ],
    "molecular entity": [
        "nucleic acid entity",
        "small molecule"
    ],
    "nutrient": [
        "macronutrient",
        "micronutrient"
    ],
    "chemical mixture": [
        "complex molecular mixture",
        "food",
        "molecular mixture",
        "processed material"
    ],
    "clinical intervention": [
        "hospitalization"
    ],
    "biological process or activity": [
        "biological process",
        "molecular activity"
    ],
    "organismal entity": [
        "anatomical entity",
        "cell line",
        "individual organism",
        "life stage",
        "population of individual organisms"
    ],
    "polypeptide": [
        "protein"
    ],
    "disease or phenotypic feature": [
        "disease",
        "phenotypic feature"
    ],
    "sequence variant": [
        "snv"
    ],
    "geographic location": [
        "geographic location at time"
    ],
    "publication": [
        "article",
        "book",
        "book chapter",
        "serial"
    ],
    "nucleic acid entity": [
        "coding sequence",
        "exon",
        "transcript"
    ],
    "micronutrient": [
        "vitamin"
    ],
    "molecular mixture": [
        "drug"
    ],
    "biological process": [
        "behavior",
        "death",
        "pathological process",
        "pathway",
        "physiological process"
    ],
    "individual organism": [
        "case"
    ],
    "anatomical entity": [
        "cell",
        "cellular component",
        "gross anatomical structure",
        "pathological anatomical structure"
    ],
    "population of individual organisms": [
        "study population"
    ],
    "protein": [
        "protein isoform"
    ],
    "phenotypic feature": [
        "behavioral feature",
        "clinical finding"
    ],
    "transcript": [
        "RNA product"
    ],
    "study population": [
        "cohort"
    ],
    "RNA product": [
        "RNA product isoform",
        "noncoding RNA product"
    ],
    "noncoding RNA product": [
        "microRNA",
        "siRNA"
    ]
}

This shouldn't require more work on our part (no categories were added or removed), but may drastically change the category assignments.
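One way to see how a reorganized tree can shift things is to recompute each category's depth from the parent-to-children map. The sketch below uses a small excerpt of the tree above; `compute_depths` is a hypothetical helper, and it assumes that depth is what matters to KG2, as the name `biolink_categories_ontology_depths` suggests.

```python
from collections import deque

# Excerpt of the Biolink 2.1.0 category tree shown above (parent -> children).
category_tree = {
    "named thing": ["biological entity", "chemical entity"],
    "biological entity": ["gene", "polypeptide"],
    "polypeptide": ["protein"],
    "chemical entity": ["molecular entity"],
    "molecular entity": ["small molecule"],
}


def compute_depths(tree, root="named thing"):
    """Breadth-first walk from the root, recording each category's depth."""
    depths = {root: 0}
    queue = deque([root])
    while queue:
        parent = queue.popleft()
        for child in tree.get(parent, []):
            if child not in depths:  # keep the first (shallowest) depth seen
                depths[child] = depths[parent] + 1
                queue.append(child)
    return depths


depths = compute_depths(category_tree)
for name, depth in sorted(depths.items(), key=lambda item: item[1]):
    print(depth, name)
```

Running the same function over the old and new trees and comparing the two results would show which categories moved.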

saramsey commented 3 years ago

How about adding these lines to curies-to-categories.yaml?

    FMA:63887: biological entity
    FMA:67257: protein

ecwood commented 3 years ago

How about adding these lines to curies-to-categories.yaml?

    FMA:63887: biological entity
    FMA:67257: protein

These were added in commit a6b38aa. Also, for documentation purposes, the missing mappings were discovered through https://github.com/RTXteam/RTX/issues/1557#issuecomment-875968946

ecwood commented 3 years ago

Note: all nodes with these Biolink categories are being re-categorized to biolink:NamedThing because they are no longer supported Biolink categories:

These previously escaped the validator because of a bug: the validator essentially checked whether a category appeared ANYWHERE in the Biolink model, including as a mixin or attribute.
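A rough sketch of the fallback described in this comment, assuming nodes are dictionaries with `id` and `category` keys. The `recategorize` helper and the example node IDs are hypothetical, and biolink:OntologyClass appears only as an illustration of an unsupported value, not as a claim about which categories were affected.

```python
FALLBACK_CATEGORY = "biolink:NamedThing"


def recategorize(nodes, supported_categories):
    """Return copies of the nodes, replacing unsupported categories
    with the fallback category."""
    updated = []
    for node in nodes:
        if node["category"] not in supported_categories:
            # Copy rather than mutate, so the input list stays unchanged.
            node = {**node, "category": FALLBACK_CATEGORY}
        updated.append(node)
    return updated


# Hypothetical supported set and example nodes, for illustration only.
supported = {"biolink:NamedThing", "biolink:Gene", "biolink:Protein"}
example_nodes = [
    {"id": "EXAMPLE:1", "category": "biolink:Gene"},
    {"id": "EXAMPLE:2", "category": "biolink:OntologyClass"},
]

result = recategorize(example_nodes, supported)
for node in result:
    print(node["id"], node["category"])
```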

ecwood commented 3 years ago

Now that 48b949a has been brought into the main branch, I'm going to close this issue out.