RTXteam / RTX-KG2

Build system for the RTX-KG2 biomedical knowledge graph, part of the ARAX reasoning system (https://github.com/RTXTeam/RTX)
MIT License

Update to Biolink v3.5.0 #302

Open ecwood opened 1 year ago

ecwood commented 1 year ago

Based on https://github.com/RTXteam/RTX-KG2/issues/301#issuecomment-1610336171, we need to update to Biolink v3.5.0. Currently, we are on v3.1.2. Since we are so out of date, I am creating this issue to document some past changes that might be sticky for us.

For example, in Biolink v3.2.7, a lot changed with KEGG IRIs.

Unfortunately, Biolink v3.5.0 has not been released yet, so we can't start working with it.

ecwood commented 1 year ago

From @saramsey:

Oh yes, so on a related note, but motivated by a different issue, we will need to start developing a file of Biolink predicates that have been deprecated since the Biolink version that xDTD is currently based on (will need to ask Chunyu), and to curate which Biolink predicates (and qualifiers, if necessary) those map to. That will require some creativity.
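One possible shape for such a file, sketched in Python for illustration only. The table name, the `remap_edge` helper, the edge dictionary layout, and the single entry are all hypothetical placeholders, not curated mappings and not existing RTX-KG2 code:

```python
# Hypothetical curated table: deprecated predicate -> replacement predicate
# plus any qualifiers. The entry below is a placeholder, not a real curation.
DEPRECATED_PREDICATE_MAP = {
    "biolink:example_deprecated_predicate": {
        "predicate": "biolink:related_to",
        "qualifiers": {"example_qualifier": "example_value"},
    },
}


def remap_edge(edge, table=DEPRECATED_PREDICATE_MAP):
    """Return a copy of `edge` with a deprecated predicate replaced, if curated."""
    entry = table.get(edge["predicate"])
    if entry is None:
        # Predicate is not in the deprecated table; return an unchanged copy.
        return dict(edge)
    remapped = dict(edge)
    remapped["predicate"] = entry["predicate"]
    # Qualifiers are copied onto the edge as additional properties.
    remapped.update(entry.get("qualifiers", {}))
    return remapped
```

Keeping the qualifiers next to the replacement predicate in one entry would let a single lookup handle both cases (predicate-only and predicate-plus-qualifier remaps).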

ecwood commented 1 year ago

v3.2.0: I don't see anything we have to change.

v3.2.1: There's a change with KGX, but I don't think this impacts us.

v3.2.2:

v3.2.3: This doesn't impact us.

v3.2.4: This shouldn't matter to us (slot_usage information about retrieval source). Should we be using retrieval source as the category for our upstream sources, rather than information content entity? What is the difference between this and information resource?

v3.2.5: This doesn't impact us.

v3.2.6: This doesn't impact us.

v3.2.7:

v3.2.8:

v3.3.0:

v3.3.1:

v3.3.2: This does not impact us.

v3.3.3: This shouldn't impact us. It looks like internal checking of the biolink model, but if they found any bad URLs, we might have to change ours.

v3.3.4:

v3.4.0:

v3.4.1: This release doesn't impact us. In fact, it looks like within the model, it went from 3.4.0 to 3.4.2 (https://github.com/biolink/biolink-model/pull/1327/files#diff-e84866df1772b9b92474ba97eeacbf344265d3285db33b3459f4e794d1de24c5L28-R28)

v3.4.2:

v3.4.3:

saramsey commented 1 year ago

@ecwood Thank you! Any thoughts about the 3.3.X and 3.4.X releases?

ecwood commented 1 year ago

@saramsey Yes! I updated the comment with information about the other releases.

saramsey commented 1 year ago

Thanks so much. I used the master branch of biolink-model.yaml to guide the changes that I made to predicate-remap.yaml for #305. Hopefully that helps move us toward 3.5.0 compliance.

ecwood commented 1 year ago

I expected that we'd need to change the UniProt URL. However, run-validation-tests.sh on the master branch is not failing.

@saramsey Do you know why it (specifically validate_curies_to_urls_map_yaml.py) wouldn't fail despite these lines being different?

https://github.com/RTXteam/RTX-KG2/blob/9aa895416f9ff7abbae5fb699869244ea908a2c0/curies-to-urls-map.yaml#L716-L717

https://github.com/biolink/biolink-model/blob/5b9c2834e6ae548f65f1819fb09e390e7aa3f307/biolink-model.yaml#L143:

  UniProtKB: 'http://purl.uniprot.org/uniprot/'

There's a similar situation with KEGG:

https://github.com/RTXteam/RTX-KG2/blob/9aa895416f9ff7abbae5fb699869244ea908a2c0/curies-to-urls-map.yaml#L246-L257

https://github.com/biolink/biolink-model/blob/5b9c2834e6ae548f65f1819fb09e390e7aa3f307/biolink-model.yaml#L87-L91:

  KEGG.BRITE: 'https://bioregistry.io/kegg.brite:'
  KEGG: 'http://www.kegg.jp/entry/'
  KEGG.GENES: 'https://bioregistry.io/kegg.genes:bsu:'
  KEGG.PATHWAY: 'https://bioregistry.io/kegg.pathway:'
  KEGG.RCLASS: 'https://www.genome.jp/dbget-bin/www_bget?rc:'

acevedol commented 1 year ago

Strange that the validation tests don't mind the differences in the URLs. I don't know why that would be. Just to remind myself because I've been out so much, in addition to the changes above to catch up to the current biolink model, we are also addressing issue #281, adding a domain_range_exclusion boolean type property to edges.

@ecwood, do you have a preference on where I get started on the changes?

ecwood commented 1 year ago

@acevedol Some of the changes have already been made (#306), and all of that has been done in kg2.8.4-prep, so let's stick with that.

> we are also addressing issue https://github.com/RTXteam/RTX-KG2/issues/281, adding a domain_range_exclusion boolean type property to edges

Yes, but unfortunately, we don't have the schema for how we will get this data yet, so it's not much use to start the code for it yet.

ecwood commented 1 year ago

While testing out bc079cf (to ensure the validation scripts are using the correct version of biolink), we got this error:

Reading ontology JSON file: /home/ubuntu/kg2-build/biolink-model.owl.json; size: 2213.82 KiB
Traceback (most recent call last):
  File "/home/ubuntu/kg2-code/validate_kg2_util_curies_urls_categories.py", line 62, in <module>
    assert category_curie in biolink_categories_ontology_depths, category_curie
AssertionError: biolink:InformationResource

It looks like information resource was removed in Biolink v3.3.4. See https://github.com/biolink/biolink-model/commit/c24d4433b82ab83372824f31e5fd543d670fa237 and https://github.com/biolink/biolink-model/commit/e2207e18ee8e5a0d7fa096977858f2d22cdb3577.
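Because the assert stops at the first unknown category, other removed categories may still be hiding behind this one. A minimal sketch of a check that reports all of them at once is below. The function name and sample data are hypothetical, and this is not the actual logic of validate_kg2_util_curies_urls_categories.py:

```python
def find_unknown_categories(category_curies, known_categories):
    """Return, sorted, the category CURIEs that are absent from known_categories."""
    return sorted(set(category_curies) - set(known_categories))


# Illustrative data only: a depth map like the one named in the traceback,
# and the categories our own code refers to.
known = {"biolink:NamedThing": 0, "biolink:InformationContentEntity": 1}
used = ["biolink:InformationResource", "biolink:NamedThing"]

missing = find_unknown_categories(used, known)
if missing:
    print("Categories not found in Biolink:", ", ".join(missing))
```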

ecwood commented 1 year ago

With 8a63bbc, the validator is not failing anymore. That being said, there are still hundreds (possibly thousands) of errors like this:

2023-07-05 20:34:47,917 ERROR (OWLAnnotationPropertyTransformer:98) Attempt to transform an axiom to correct misuse of properties failed. Property replacement: {rdfs:label=rdfs:label, <https://w3id.org/biolink/vocab/opposite_of>=<https://w3id.org/biolink/vocab/opposite_of>}, axiom: AnnotationAssertion(<https://w3id.org/biolink/vocab/opposite_of> <https://w3id.org/biolink/vocab/enabled_by> "prevented by"), error: class uk.ac.manchester.cs.owl.owlapi.OWLLiteralImplNoCompression cannot be cast to class org.semanticweb.owlapi.model.IRI (uk.ac.manchester.cs.owl.owlapi.OWLLiteralImplNoCompression and org.semanticweb.owlapi.model.IRI are in unnamed module of loader 'app')
2023-07-05 20:34:47,920 ERROR (OWLAnnotationPropertyTransformer:98) Attempt to transform an axiom to correct misuse of properties failed. Property replacement: {rdfs:label=rdfs:label, <https://w3id.org/biolink/vocab/opposite_of>=<https://w3id.org/biolink/vocab/opposite_of>}, axiom: AnnotationAssertion(<https://w3id.org/biolink/vocab/opposite_of> <https://w3id.org/biolink/vocab/prevents> "predisposes"), error: class uk.ac.manchester.cs.owl.owlapi.OWLLiteralImplNoCompression cannot be cast to class org.semanticweb.owlapi.model.IRI (uk.ac.manchester.cs.owl.owlapi.OWLLiteralImplNoCompression and org.semanticweb.owlapi.model.IRI are in unnamed module of loader 'app')
2023-07-05 20:34:47,922 ERROR (OWLAnnotationPropertyTransformer:98) Attempt to transform an axiom to correct misuse of properties failed. Property replacement: {rdfs:label=rdfs:label, <https://w3id.org/biolink/vocab/opposite_of>=<https://w3id.org/biolink/vocab/opposite_of>}, axiom: AnnotationAssertion(<https://w3id.org/biolink/vocab/opposite_of> <https://w3id.org/biolink/vocab/decreases_response_to> "increases response to"), error: class uk.ac.manchester.cs.owl.owlapi.OWLLiteralImplNoCompression cannot be cast to class org.semanticweb.owlapi.model.IRI (uk.ac.manchester.cs.owl.owlapi.OWLLiteralImplNoCompression and org.semanticweb.owlapi.model.IRI are in unnamed module of loader 'app')
2023-07-05 20:34:47,923 ERROR (OWLAnnotationPropertyTransformer:98) Attempt to transform an axiom to correct misuse of properties failed. Property replacement: {rdfs:label=rdfs:label, <https://w3id.org/biolink/vocab/opposite_of>=<https://w3id.org/biolink/vocab/opposite_of>}, axiom: AnnotationAssertion(<https://w3id.org/biolink/vocab/opposite_of> <https://w3id.org/biolink/vocab/broad_match> "narrow match"), error: class uk.ac.manchester.cs.owl.owlapi.OWLLiteralImplNoCompression cannot be cast to class org.semanticweb.owlapi.model.IRI (uk.ac.manchester.cs.owl.owlapi.OWLLiteralImplNoCompression and org.semanticweb.owlapi.model.IRI are in unnamed module of loader 'app')
2023-07-05 20:34:47,923 ERROR (OWLAnnotationPropertyTransformer:98) Attempt to transform an axiom to correct misuse of properties failed. Property replacement: {rdfs:label=rdfs:label, <https://w3id.org/biolink/vocab/opposite_of>=<https://w3id.org/biolink/vocab/opposite_of>}, axiom: AnnotationAssertion(<https://w3id.org/biolink/vocab/opposite_of> <https://w3id.org/biolink/vocab/enables> "prevents"), error: class uk.ac.manchester.cs.owl.owlapi.OWLLiteralImplNoCompression cannot be cast to class org.semanticweb.owlapi.model.IRI (uk.ac.manchester.cs.owl.owlapi.OWLLiteralImplNoCompression and org.semanticweb.owlapi.model.IRI are in unnamed module of loader 'app')
2023-07-05 20:34:47,924 ERROR (OWLAnnotationPropertyTransformer:98) Attempt to transform an axiom to correct misuse of properties failed. Property replacement: {rdfs:label=rdfs:label, <https://w3id.org/biolink/vocab/opposite_of>=<https://w3id.org/biolink/vocab/opposite_of>}, axiom: AnnotationAssertion(<https://w3id.org/biolink/vocab/opposite_of> <https://w3id.org/biolink/vocab/has_increased_amount> "has decreased amount"), error: class uk.ac.manchester.cs.owl.owlapi.OWLLiteralImplNoCompression cannot be cast to class org.semanticweb.owlapi.model.IRI (uk.ac.manchester.cs.owl.owlapi.OWLLiteralImplNoCompression and org.semanticweb.owlapi.model.IRI are in unnamed module of loader 'app')
2023-07-05 20:34:47,924 ERROR (OWLAnnotationPropertyTransformer:98) Attempt to transform an axiom to correct misuse of properties failed. Property replacement: {rdfs:label=rdfs:label, <https://w3id.org/biolink/vocab/opposite_of>=<https://w3id.org/biolink/vocab/opposite_of>}, axiom: AnnotationAssertion(<https://w3id.org/biolink/vocab/opposite_of> <https://w3id.org/biolink/vocab/has_output> "has input"), error: class uk.ac.manchester.cs.owl.owlapi.OWLLiteralImplNoCompression cannot be cast to class org.semanticweb.owlapi.model.IRI (uk.ac.manchester.cs.owl.owlapi.OWLLiteralImplNoCompression and org.semanticweb.owlapi.model.IRI are in unnamed module of loader 'app')
2023-07-05 20:34:47,924 ERROR (OWLAnnotationPropertyTransformer:98) Attempt to transform an axiom to correct misuse of properties failed. Property replacement: {rdfs:label=rdfs:label, <https://w3id.org/biolink/vocab/opposite_of>=<https://w3id.org/biolink/vocab/opposite_of>}, axiom: AnnotationAssertion(<https://w3id.org/biolink/vocab/opposite_of> <https://w3id.org/biolink/vocab/predisposes> "prevents"), error: class uk.ac.manchester.cs.owl.owlapi.OWLLiteralImplNoCompression cannot be cast to class org.semanticweb.owlapi.model.IRI (uk.ac.manchester.cs.owl.owlapi.OWLLiteralImplNoCompression and org.semanticweb.owlapi.model.IRI are in unnamed module of loader 'app')
2023-07-05 20:34:47,925 ERROR (OWLAnnotationPropertyTransformer:98) Attempt to transform an axiom to correct misuse of properties failed. Property replacement: {rdfs:label=rdfs:label, <https://w3id.org/biolink/vocab/opposite_of>=<https://w3id.org/biolink/vocab/opposite_of>}, axiom: AnnotationAssertion(<https://w3id.org/biolink/vocab/opposite_of> <https://w3id.org/biolink/vocab/increases_response_to> "decreases response to"), error: class uk.ac.manchester.cs.owl.owlapi.OWLLiteralImplNoCompression cannot be cast to class org.semanticweb.owlapi.model.IRI (uk.ac.manchester.cs.owl.owlapi.OWLLiteralImplNoCompression and org.semanticweb.owlapi.model.IRI are in unnamed module of loader 'app')
2023-07-05 20:34:47,925 ERROR (OWLAnnotationPropertyTransformer:98) Attempt to transform an axiom to correct misuse of properties failed. Property replacement: {rdfs:label=rdfs:label, <https://w3id.org/biolink/vocab/opposite_of>=<https://w3id.org/biolink/vocab/opposite_of>}, axiom: AnnotationAssertion(<https://w3id.org/biolink/vocab/opposite_of> <https://w3id.org/biolink/vocab/contraindicated_for> "treats"), error: class uk.ac.manchester.cs.owl.owlapi.OWLLiteralImplNoCompression cannot be cast to class org.semanticweb.owlapi.model.IRI (uk.ac.manchester.cs.owl.owlapi.OWLLiteralImplNoCompression and org.semanticweb.owlapi.model.IRI are in unnamed module of loader 'app')
2023-07-05 20:34:47,926 ERROR (OWLAnnotationPropertyTransformer:98) Attempt to transform an axiom to correct misuse of properties failed. Property replacement: {rdfs:label=rdfs:label, <https://w3id.org/biolink/vocab/opposite_of>=<https://w3id.org/biolink/vocab/opposite_of>}, axiom: AnnotationAssertion(<https://w3id.org/biolink/vocab/opposite_of> <https://w3id.org/biolink/vocab/has_input> "has output"), error: class uk.ac.manchester.cs.owl.owlapi.OWLLiteralImplNoCompression cannot be cast to class org.semanticweb.owlapi.model.IRI (uk.ac.manchester.cs.owl.owlapi.OWLLiteralImplNoCompression and org.semanticweb.owlapi.model.IRI are in unnamed module of loader 'app')
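
In every line above, the value given for opposite_of is a quoted string rather than an IRI, which appears to be what the cast error is complaining about. To see how many distinct axioms are affected, the log could be summarized with something like the sketch below. The regular expression is written against the lines shown here, and the function name is hypothetical:

```python
import re
from collections import Counter

# Matches: AnnotationAssertion(<.../property> <.../subject> "literal")
AXIOM_PATTERN = re.compile(
    r'AnnotationAssertion\(<[^>]*/(?P<prop>[^/>]+)> '
    r'<[^>]*/(?P<subject>[^/>]+)> "(?P<literal>[^"]*)"\)'
)


def summarize_axiom_errors(log_lines):
    """Count failed axioms, keyed by (property, subject, literal value)."""
    counts = Counter()
    for line in log_lines:
        match = AXIOM_PATTERN.search(line)
        if match:
            key = (match.group("prop"), match.group("subject"), match.group("literal"))
            counts[key] += 1
    return counts
```

Calling `summarize_axiom_errors(open(path))` on a saved log (where `path` is wherever the output was captured) would give one counter entry per distinct axiom.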

ecwood commented 1 year ago

Per @saramsey, validate_curies_to_urls_map_yaml.py validates only the CURIE prefixes, not the URLs. As an enhancement, we should have the script also compare our URLs to what is in Biolink.
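
A rough sketch of what that comparison could look like, assuming both files have already been loaded into plain prefix-to-URL dictionaries. The function name is hypothetical, and the local UniProtKB URL in the example is a placeholder rather than the value in curies-to-urls-map.yaml:

```python
def find_url_mismatches(local_map, biolink_prefixes):
    """Return {prefix: (local_url, biolink_url)} for prefixes whose URLs differ.

    Prefixes that Biolink does not define are skipped, since there is
    nothing to compare them against.
    """
    mismatches = {}
    for prefix, local_url in local_map.items():
        biolink_url = biolink_prefixes.get(prefix)
        if biolink_url is not None and biolink_url != local_url:
            mismatches[prefix] = (local_url, biolink_url)
    return mismatches


# The Biolink value is the one quoted earlier in this thread;
# the local value is a placeholder for illustration.
biolink = {"UniProtKB": "http://purl.uniprot.org/uniprot/"}
local = {"UniProtKB": "https://example.org/local-uniprot/", "LOCALONLY": "https://example.org/x/"}

for prefix, (ours, theirs) in find_url_mismatches(local, biolink).items():
    print(f"{prefix}: ours={ours} biolink={theirs}")
```

Whether a mismatch should fail the validation or only emit a warning is an open choice, since some of our URLs may differ from Biolink on purpose.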

ecwood commented 1 year ago

Now that #319 and #320 have been finished, tested, and are passing, do we need to do anything else to verify that we are Biolink 3.5.0 compliant? Do we need to do #306 to verify compliance?