Open matentzn opened 7 months ago
We will probably have to https://github.com/mapping-commons/sssom/issues/225
The problem we might run into with that is that, as far as I know (and as I have noted in the discussion about the extension slots), LinkML does not have a map type. We’d want to declare a field that could be used like this:
"curie_map": {
"FBbt": "http://purl.obolibrary.org/obo/FBbt_"
}
but unless I missed something in LinkML’s docs, this is not possible. All we can do is have a list (i.e. a “multi-valued” field) of custom “dictionary entry” types, like this:
"curie_map": [
{ "key": "FBbt",
"value": "http://purl.obolibrary.org/obo/FBbt_" }
]
which of course would work but would be… weird, at the very least.
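To make the difference between the two shapes concrete, here is a minimal Python sketch that converts between the plain map and the list of “dictionary entry” objects. The function names are hypothetical and are not part of sssom-py or LinkML; the only things taken from the examples above are the "key"/"value" field names and the FBbt prefix.

```python
from typing import Dict, List


def curie_map_to_entries(curie_map: Dict[str, str]) -> List[Dict[str, str]]:
    """Turn {"prefix": "url_prefix"} into a list of key/value entries."""
    return [{"key": prefix, "value": url} for prefix, url in curie_map.items()]


def entries_to_curie_map(entries: List[Dict[str, str]]) -> Dict[str, str]:
    """Turn a list of key/value entries back into a plain map.

    A list cannot enforce unique keys by itself, so conflicting
    duplicates are rejected here (an assumption of this sketch).
    """
    curie_map: Dict[str, str] = {}
    for entry in entries:
        prefix, url = entry["key"], entry["value"]
        if prefix in curie_map and curie_map[prefix] != url:
            raise ValueError(f"conflicting URL prefixes for {prefix!r}")
        curie_map[prefix] = url
    return curie_map


if __name__ == "__main__":
    as_map = {"FBbt": "http://purl.obolibrary.org/obo/FBbt_"}
    as_list = curie_map_to_entries(as_map)
    print(as_list)
    print(entries_to_curie_map(as_list) == as_map)
```

The round trip is lossless, but the list form pushes the uniqueness check onto every consumer, which is part of why it feels weird.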
My own solution (that nobody will like, I know) to that is simple: decide that CURIEfied identifiers are only for the TSV format (which is what the spec currently says, incidentally) and that JSON should only contain full-length identifiers. No CURIE map needed, problem solved.
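As a rough illustration of what “full-length identifiers only” would mean for a writer, here is a minimal Python sketch that expands CURIEs before serialising to JSON. The `expand_curie` helper and the identifier `FBbt:00000001` are made up for the example; this is not how sssom-py does it, just the idea under the assumption that a prefix-to-URL map is available at write time.

```python
from typing import Dict


def expand_curie(curie: str, curie_map: Dict[str, str]) -> str:
    """Expand "prefix:local" to a full identifier using curie_map.

    Values without a known prefix (including identifiers that are
    already full-length) are returned unchanged.
    """
    prefix, sep, local_id = curie.partition(":")
    if not sep or prefix not in curie_map:
        return curie
    return curie_map[prefix] + local_id


if __name__ == "__main__":
    curie_map = {"FBbt": "http://purl.obolibrary.org/obo/FBbt_"}
    # Hypothetical identifier, used only to show the expansion.
    print(expand_curie("FBbt:00000001", curie_map))
```

With this approach the JSON output carries no `curie_map` at all, at the cost of longer and less readable identifiers.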
This PR adds methods
Which are exactly analogous to what was there before for JSON.
Its actual purpose, however, is not so much to add those methods as to carefully review the format (to make sure we are happy with it) so we can start making headway on https://github.com/mapping-commons/sssom/issues/321.
Breaking changes
The `json` parameter now refers to `json`, but used to refer to `jsonld`. So anyone expecting `jsonld` will now be served `json`.
JSON Format
We need to make sure that the JSON format looks exactly as we envision it. Problems I see so far:
`curie_map`. We will probably have to https://github.com/mapping-commons/sssom/issues/225
Here is an example JSON file:
```
{
  "mapping_set_id": "https://w3id.org/sssom/mapping/tests/data/basic.tsv",
  "license": "https://creativecommons.org/publicdomain/zero/1.0/",
  "mappings": [
    {
      "subject_id": "a:something",
      "predicate_id": "rdfs:subClassOf",
      "object_id": "b:something",
      "mapping_justification": "semapv:LexicalMatching",
      "subject_label": "XXXXX",
      "subject_category": "biolink:AnatomicalEntity",
      "object_label": "xxxxxx",
      "object_category": "biolink:AnatomicalEntity",
      "subject_source": "a:example",
      "object_source": "b:example",
      "mapping_tool": "rdf_matcher",
      "confidence": 0.8,
      "subject_match_field": ["rdfs:label"],
      "object_match_field": ["rdfs:label"],
      "match_string": ["xxxxx"],
      "comment": "mock data"
    },
    {
      "subject_id": "a:something",
      "predicate_id": "owl:equivalentClass",
      "object_id": "c:something",
      "mapping_justification": "semapv:LexicalMatching",
      "subject_label": "XYXYX",
      "subject_category": "biolink:AnatomicalEntity",
      "object_label": "xyxyxy",
      "object_category": "biolink:AnatomicalEntity",
      "subject_source": "a:example",
      "object_source": "c:example",
      "mapping_tool": "rdf_matcher",
      "confidence": 0.83,
      "subject_match_field": ["rdfs:label"],
      "object_match_field": ["rdfs:label"],
      "match_string": ["xxxxx"],
      "comment": "mock data"
    }
  ],
  "creator_id": ["orcid:1234", "orcid:5678"],
  "mapping_tool": "https://github.com/cmungall/rdf_matcher",
  "mapping_date": "2020-05-30"
}
```
The two remaining errors are also exactly due to this problem: