RTXteam / RTX-KG2

Build system for the RTX-KG2 biomedical knowledge graph, part of the ARAX reasoning system (https://github.com/RTXTeam/RTX)
MIT License

multi-item resource_id #318

Open edeutsch opened 12 months ago

edeutsch commented 12 months ago

The following ARAX result: https://arax.ncats.io/devED/api/arax/v1.4/response/dc05d73c-e55f-421c-b6d6-2aa417397537 is producing the following edge from KG2:

...
          "object": "MESH:D004852",
          "predicate": "biolink:subclass_of",
          "qualifiers": null,
          "sources": [
            {
              "resource_id": "infores:chebi; infores:genepio; infores:go-plus",
              "resource_role": "primary_knowledge_source",
              "source_record_urls": null,
              "upstream_resource_ids": null
            },
            {
              "resource_id": "infores:rtx-kg2",
              "resource_role": "aggregator_knowledge_source",
              "source_record_urls": null,
              "upstream_resource_ids": [
                "infores:chebi; infores:genepio; infores:go-plus"
              ]
            },
            {
              "resource_id": "infores:arax",
              "resource_role": "aggregator_knowledge_source",
              "source_record_urls": null,
              "upstream_resource_ids": [
                "infores:rtx-kg2"
              ]
            }
          ],
          "subject": "PUBCHEM.COMPOUND:446987"
        },

The validator is unhappy because resource_id is not a CURIE. Making it a string that is 3 CURIEs separated by "; " is creative, but not what the designers intended and not what the validator accepts.
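A minimal sketch of the kind of check that trips here, assuming a simplified CURIE shape (prefix, colon, local part with no whitespace or semicolon). The pattern and the helper name `find_bad_resource_ids` are hypothetical illustrations, not the validator's actual rule or API:

```python
import re

# Simplified CURIE shape, assumed for illustration only:
# a prefix, a colon, then a local part with no whitespace or ';'.
CURIE_PATTERN = re.compile(r"^[A-Za-z_][\w.\-]*:[^\s;]+$")


def find_bad_resource_ids(edge):
    """Return every resource_id / upstream_resource_ids entry that is not a single CURIE."""
    bad = []
    for source in edge.get("sources") or []:
        # upstream_resource_ids may be null in the response, so fall back to an empty list.
        candidates = [source.get("resource_id")] + (source.get("upstream_resource_ids") or [])
        bad.extend(
            c for c in candidates
            if not (isinstance(c, str) and CURIE_PATTERN.match(c))
        )
    return bad
```

Run against the edge above, this would flag the joined string twice: once as the primary source's resource_id and once in the upstream_resource_ids of infores:rtx-kg2.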

What's the story here? And what's the solution?

ecwood commented 12 months ago

Here's this from KG2.8.3c (kg2canonicalized.rtx.ai): [image]

The edges in question are:

CHEBI:28915---rdfs:subClassOf---None---None---None---CHEBI:32955---OBO:chebi.owl
CHEBI:28915---rdfs:subClassOf---None---None---None---CHEBI:32955---OBO:go/extensions/go-plus.owl
CHEBI:28915---rdfs:subClassOf---None---None---None---CHEBI:32955---OBO:genepio.owl

ecwood commented 12 months ago

In KG2.8.0c (kg2canonicalized2.rtx.ai), this was an issue: [image]

{
  "predicate": "biolink:subclass_of",
  "primary_knowledge_source": "infores:chebi; infores:genepio; infores:go-plus",
  "publications_info": "{}",
  "kg2_ids": [
    "CHEBI:28915---rdfs:subClassOf---None---None---None---CHEBI:32955---OBO:chebi.owl"
  ],
  "subject": "PUBCHEM.COMPOUND:446987",
  "id": "9271251",
  "object": "MESH:D004852"
}

It seems like this will be fixed when KG2.8.3 is rolled out to production.
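For anyone who has to consume the KG2.8.0c-style value in the meantime, here is a rough sketch of splitting the joined string back into individual CURIEs. This is not how KG2.8.3 resolves it (the thread does not show that); the helper name `split_knowledge_sources` and the "; " separator default are assumptions based only on the example above:

```python
def split_knowledge_sources(joined, separator="; "):
    """Split a joined source string into individual CURIEs.

    Blank pieces and duplicates are dropped; first-seen order is preserved.
    """
    sources = []
    for part in joined.split(separator):
        part = part.strip()
        if part and part not in sources:
            sources.append(part)
    return sources
```

A value that is already a single CURIE comes back as a one-element list, so the same call can be applied to every edge regardless of whether it was affected.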

ecwood commented 12 months ago

I'm marking this for verification, so that we can remember to check this query when KG2.8.3 is rolled out shortly.