mapping-commons / sssom-py

Python toolkit for SSSOM mapping format
https://mapping-commons.github.io/sssom-py/index.html#
MIT License

Bug?: Obographs parser: metadata not utilized; getting 0 mappings #456

Closed joeflack4 closed 11 months ago

joeflack4 commented 11 months ago

Overview

I'm working on a PR to add FHIR ConceptMaps to the TIMS OMOP/OWL to FHIR conversion tooling.

I am stuck on an issue where when I convert OMOP -> OWL -> Obographs -> SSSOM TSV, the TSV does not contain any mappings.

Command & main inputs / outputs of concern

Command: f'sssom parse {obograph_path} -I obographs-json -o {outpath_sssom} -m {metadata_path}' (sssom parse /Users/joeflack4/projects/owl-on-fhir/test/output/test_defaults/RxNorm.owl.obographs.json -I obographs-json -o /Users/joeflack4/projects/owl-on-fhir/test/output/test_defaults/RxNorm.sssom.tsv -m /Users/joeflack4/projects/owl-on-fhir/test/output/test_defaults/temp-metadata.sssom.yml)

Inputs:

  1. Obographs JSON: RxNorm.owl.obographs.json
  2. SSSOM Metadata YML: temp-metadata.sssom.yml
       curie_map:
         OMOP: https://athena.ohdsi.org/search-terms/terms/
         omoprel: https://w3id.org/cpont/omop/relations/

Output: RxNorm.sssom.tsv

Discussion

RxNorm.owl is a subset of the OMOP content. It's only 472 lines.

Includes namespaces:

     xmlns:OMOP="https://athena.ohdsi.org/search-terms/terms/"
     xmlns:omoprel="https://w3id.org/cpont/omop/relations/">

Example class snippet (truncated)

```owl hepatitis B immune globulin 200 UNT/ML Injectable Solution Clinical Drug 412674 Drug S 2099-12-31 1970-01-01 RxNorm OMOP:41148125 OMOP:19110172 OMOP:2026202 OMOP:2035461 OMOP:35139038 OMOP:501346 OMOP:501346 OMOP:19082103 OMOP:36223599 ```

I expect / want every object property declaration from the second half of the snippet to be a single row in my SSSOM TSV.

When I convert to Obographs, it looks like this:

Obographs snippet

```json
  }, {
    "id" : "https://athena.ohdsi.org/search-terms/terms/501346",
    "lbl" : "hepatitis B immune globulin 200 UNT/ML Injectable Solution",
    "type" : "CLASS",
    "meta" : {
      "basicPropertyValues" : [ {
        "pred" : "https://athena.ohdsi.org/search-terms/terms/concept_class_id",
        "val" : "Clinical Drug"
      }, {
        "pred" : "https://athena.ohdsi.org/search-terms/terms/concept_code",
        "val" : "412674"
      }, {
        "pred" : "https://athena.ohdsi.org/search-terms/terms/domain_id",
        "val" : "Drug"
      }, {
        "pred" : "https://athena.ohdsi.org/search-terms/terms/standard_concept",
        "val" : "S"
      }, {
        "pred" : "https://athena.ohdsi.org/search-terms/terms/valid_end_date",
        "val" : "2099-12-31"
      }, {
        "pred" : "https://athena.ohdsi.org/search-terms/terms/valid_start_date",
        "val" : "1970-01-01"
      }, {
        "pred" : "https://athena.ohdsi.org/search-terms/terms/vocabulary_id",
        "val" : "RxNorm"
      }, {
        "pred" : "https://w3id.org/cpont/omop/relations/Available_as_box",
        "val" : "OMOP:41148125"
      }, {
        "pred" : "https://w3id.org/cpont/omop/relations/Consists_of",
        "val" : "OMOP:19110172"
      }, {
        "pred" : "https://w3id.org/cpont/omop/relations/Has_marketed_form",
        "val" : "https://athena.ohdsi.org/search-terms/terms/2026202"
      }, {
        "pred" : "https://w3id.org/cpont/omop/relations/Has_marketed_form",
        "val" : "OMOP:2026526"
      }, {
        "pred" : "https://w3id.org/cpont/omop/relations/Has_quantified_form",
        "val" : "OMOP:42919946"
      }, {
        "pred" : "https://w3id.org/cpont/omop/relations/Has_tradename",
        "val" : "OMOP:2035461"
      }, {
        "pred" : "https://w3id.org/cpont/omop/relations/Mapped_from",
        "val" : "OMOP:501346"
      }, {
        "pred" : "https://w3id.org/cpont/omop/relations/Maps_to",
        "val" : "OMOP:501346"
      }, {
        "pred" : "https://w3id.org/cpont/omop/relations/RxNorm_has_dose_form",
        "val" : "OMOP:19082103"
      }, {
        "pred" : "https://w3id.org/cpont/omop/relations/RxNorm_is_a",
        "val" : "OMOP:36223599"
      } ]
    }
  }, {
```

In that snippet, you can see some of the relationship mappings:

          "pred" : "https://w3id.org/cpont/omop/relations/Has_marketed_form",
          "val" : "https://athena.ohdsi.org/search-terms/terms/2026202"

          "pred" : "https://w3id.org/cpont/omop/relations/Has_marketed_form",
          "val" : "OMOP:2026526"

In the first of these two mappings, I manually edited the JSON to replace the CURIE with a URI to test whether that would work, but it didn't.

These namespaces exist in my metadata YML, so why aren't any mappings being generated?

Additional info

For reference, I looked at how OMIM is being converted to Obographs and SSSOM to double check what I was doing was correct. The [PR](https://github.com/HOT-Ecosystem/owl-on-fhir/pull/7) has a test case for this, FYI.

matentzn commented 11 months ago

One terminological thing here: a mapping is not a relationship. It's a correspondence between two terms. The lines between the two are very hazy in practice, but in our minds we should draw them, because otherwise SSSOM becomes another ontology formalism (basically able to publish all possible triples). This must be avoided.

In your case, did you try to use the https://mapping-commons.github.io/sssom-py/cli_usage.html#sssom-parse method with the -F/--mapping-predicate-filter option?
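Before choosing values for the filter, it can help to see which predicates actually occur in the Obographs file. The following is only a rough sketch: it assumes the usual Obographs layout (`graphs` → `nodes` → `meta.basicPropertyValues`), and `count_predicates` is a hypothetical helper written for illustration, not part of sssom-py. The inline `doc` is a tiny stand-in for the real file.

```python
from collections import Counter


def count_predicates(obograph: dict) -> Counter:
    """Count how often each basicPropertyValues predicate occurs (hypothetical helper)."""
    counts = Counter()
    for graph in obograph.get("graphs", []):
        for node in graph.get("nodes", []):
            # Nodes without annotations may have no "meta" key at all.
            for bpv in node.get("meta", {}).get("basicPropertyValues", []):
                counts[bpv["pred"]] += 1
    return counts


# Tiny stand-in document shaped like the snippet in the issue.
# For a real file, load it with json.load(open(path)) instead.
doc = {
    "graphs": [{
        "nodes": [{
            "id": "https://athena.ohdsi.org/search-terms/terms/501346",
            "meta": {"basicPropertyValues": [
                {"pred": "https://w3id.org/cpont/omop/relations/Has_marketed_form",
                 "val": "OMOP:2026526"},
                {"pred": "https://w3id.org/cpont/omop/relations/Maps_to",
                 "val": "OMOP:501346"},
            ]},
        }]
    }]
}

counts = count_predicates(doc)
for pred, n in counts.most_common():
    print(n, pred)
```

Any predicate listed this way that really is a mapping predicate is a candidate value for the filter option.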

joeflack4 commented 11 months ago

Great point; this is another case where I missed the forest for the trees. It all makes perfect sense now. Should've been totally obvious to check the docs.

-F / --mapping-predicate-filter is working for me. Gonna close this issue, but I'm wondering about opening 1-3 more issues. What do you think?

New issues

1. Docs about the nature of mappings?

I glanced through the SSSOM docs but I didn't see anything about what a mapping is and isn't. I'm guessing it has something to do with the concept of "degree of equivalence / shared properties". Is this in scope for the docs? I'm not sure.

Could also include a short list of preds SSSOM considers by default (I'm guessing something like skos exact/broad/narrow/close/related & oio hasDbXref).

2. Bug?: --mapping-predicate-filter only accepting URI, not CURIE

I tried --mapping-predicate-filter omoprel:Mapped_from (using the metadata YML posted in the OP) and it did not work, but --mapping-predicate-filter https://w3id.org/cpont/omop/relations/Mapped_from did.

3. Which OMOP relationships are mappings?

Not really in scope for SSSOM, at least not at this point. Maybe for mapping-commons, or some of the OMOP or FHIR working groups. Not sure who I ought to engage with about this, but I can start with Davera Gabriel. OMOP has ~450 different relationships, so the task would be to go through them and determine which are mapping relationships.
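As a possible stopgap for item 2, the CURIE can be expanded to a full URI before it is passed to --mapping-predicate-filter. This is a minimal sketch, not sssom-py behaviour: `expand_curie` and `build_parse_args` are hypothetical helpers, the `CURIE_MAP` dict just mirrors the metadata YML from the OP, and the file names are placeholders.

```python
from __future__ import annotations

# Mirrors the curie_map in temp-metadata.sssom.yml from the original post.
CURIE_MAP = {
    "OMOP": "https://athena.ohdsi.org/search-terms/terms/",
    "omoprel": "https://w3id.org/cpont/omop/relations/",
}


def expand_curie(curie: str, curie_map: dict[str, str]) -> str:
    """Expand prefix:local into a full URI (hypothetical helper)."""
    if "://" in curie:
        return curie  # already a URI, leave it untouched
    prefix, sep, local = curie.partition(":")
    if not sep or prefix not in curie_map:
        raise KeyError(f"No prefix mapping for {curie!r}")
    return curie_map[prefix] + local


def build_parse_args(obograph_path: str, out_path: str,
                     metadata_path: str, predicate: str) -> list[str]:
    """Assemble the `sssom parse` call from the issue, with an expanded predicate."""
    return [
        "sssom", "parse", obograph_path,
        "-I", "obographs-json",
        "-o", out_path,
        "-m", metadata_path,
        "--mapping-predicate-filter", expand_curie(predicate, CURIE_MAP),
    ]


args = build_parse_args(
    "RxNorm.owl.obographs.json",   # placeholder paths
    "RxNorm.sssom.tsv",
    "temp-metadata.sssom.yml",
    "omoprel:Mapped_from",
)
print(" ".join(args))
```

The resulting list could be handed to `subprocess.run(args, check=True)`; building a list rather than an f-string avoids shell-quoting problems with long paths.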

matentzn commented 11 months ago

You should turn (1) into an issue on the sssom repo and (2) into an issue on sssom-py (both very good requests), and (3) could be made into an "example" paragraph of (1).

joeflack4 commented 11 months ago

Done!