mapping-commons / sssom

Simple Standard for Sharing Ontology Mappings
https://mapping-commons.github.io/sssom/
BSD 3-Clause "New" or "Revised" License

Mapping involving post-coordinated subjects or objects #108

Open tudorache opened 2 years ago

tudorache commented 2 years ago

Thank you for the SSSOM initiative!

I think I know the answer, but I would like to get your opinion on this topic: Is there a way to express in SSSOM the fact that one or both of subject/object are post-coordinated entities?

For example, in ontology one, there is a class: "Mild diabetic retinopathy" which should be mapped in ontology two to a post-coordinated entity: "Diabetic retinopathy and severity some Mild".

My guess is that this scenario is not supported in SSSOM. Do you envision some extensions to SSSOM in the future that would support post-coordinated entities, or what would you recommend as an alternative mapping language for this case?

My strong preference would be to use SSSOM if there is a way to express these types of more complex mappings. Thank you!

matentzn commented 2 years ago

Hey @tudorache

We have been debating about it here:

I have tried to push this through in our inaugural workshop, but we had some strong resistance against introducing this level of complexity into SSSOM. We agreed during the workshop to wait for a very strong use case and then put the proposal up for a vote.

Technically, it is not that difficult to realise this, but the feeling was that if we start doing that, it would lead to "abuse" of the standard for essentially encoding entire ontologies. But please raise the issue up again. I can also share the paper draft we are about to submit if you are interested.

tudorache commented 2 years ago

Thank you, @matentzn!

The use case I am working on is the mapping of ICD-11 to other terminologies/ontologies, for example to prior versions of ICD (which do not support post-coordination), and to potential other targets (other WHO classifications, SNOMED CT, MESH, etc.).

As a background, ICD-11 has a Foundational Component, which has an OWL representation. Postcoordination is one of the new features of ICD-11 (see here and here). Therefore, it is important to be able to represent "complex" mappings that would be able to link a postcoordinated entity in ICD-11 to a simple or complex entity in a target ontology.

I think the suggestion in #36 of using subject/object patterns or templates would work.

The postcoordinated entities in ICD-11 follow a clear pattern (which I suppose is also the case for other ontologies): some_superclass and (axis_1 some filler_1) and (axis_2 some filler_2) and ...
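
As a rough illustration of the pattern above, here is a minimal Python sketch that renders a superclass plus a list of (axis, filler) pairs as a Manchester-style string. The function name `post_coordinated_expression` is hypothetical and is not part of SSSOM or any ICD-11 tooling; the example values are taken from the "Mild diabetic retinopathy" case earlier in this thread.

```python
from typing import Iterable, Tuple


def post_coordinated_expression(
    superclass: str, axes: Iterable[Tuple[str, str]]
) -> str:
    """Render `superclass and (axis some filler) and ...` as plain text."""
    parts = [superclass]
    for axis, filler in axes:
        # Each post-coordination axis becomes one existential restriction.
        parts.append(f"({axis} some {filler})")
    return " and ".join(parts)


# Example from the first comment in this thread.
print(post_coordinated_expression("'Diabetic retinopathy'", [("severity", "Mild")]))
```

This only covers the flat, non-nested profile described here; unions, nesting or negation would need a real class-expression model.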

I guess in an ideal world, SSSOM should remain as simple as possible, and have these complex mappings as optional add-ons, with tools that can unambiguously interpret them. My feeling is that there will be other use cases in need of complex maps (as noted in the previous comment), and rather than have different groups come up with arbitrary workarounds, it would be great to have a uniform and unambiguous way of representing and interpreting them.

But, I do understand the reticence of adding complexity to an intentionally-simple representation.

@matentzn, I would be very interested in reading the paper draft, if you can share it. Thanks!

matentzn commented 2 years ago

I am very happy you brought this up again. I can't promise a fast turnaround on the implementation, but at least we should consider adding the relevant elements to the spec to allow this kind of expression.

The main issue right now is to change the subject_id and object_id fields to multivalued.

@cmungall - do you have a position on this?

cmungall commented 2 years ago

Sorry, I am not following the motivation for changing sub/obj to multivalued

The first thing to decide is, if we support mapping class expressions, what profile of OWL (or a more expressive logic) we support. While I agree with @tudorache that obo-format/GO annotation extension-style genus-differentia expressions with no nesting are sufficient for most use cases I'm aware of, if we hardcode a solution for that profile, it's guaranteed someone could come along later and want unions/nesting/QCRs/complementOf

I would also caution against assuming OWL is the right fit for all complex mapping issues. Many rules have a closed-world flavor, e.g. ICD

I think the broad approaches I can think of would be

  1. declare out of scope, but provide sssom-esque best practice to do this in OWL or other expressive formalisms
  2. Allow class expressions (Manchester) as subject_id/object_id
  3. Support EL-only and add subject/object differentia fields
    • 1a: expressions that must be parsed (see appendix of https://pubmed.ncbi.nlm.nih.gov/24885854/ for how this might work) (requires parsing, a bit icky)
    • 1b: multivalued fields {subj,obj}_{rel,filler}, with order-matching (a bit icky). Also prohibits nesting
      1. Allow blank nodes or some kind of local identifier or expression hashing in subject/object and either/both:
    • have a convention for a supplementary owl file with equivalence axioms
    • extend the header format to include equivalence axioms there, e.g. equiv: owlhex:<MD5hashOfExpr> "G and R some D"
  4. some kind of hybrid. E.g. have subject_id and object_id be the genus of the expressions, use a conservative predicate like closeMatch that is not wrong for the simple SPO interpretation, and then have an optional field that is a complete OWL axiom (including GCIs)
  5. use the singleton predicate pattern

I list 1 for completeness but I think we would all vote against this. Same for 5

2 has the usual problems that we already hashed out with negation: the differentia fields are non-ignorable, and it only supports the obo format profile

I like some parts of 3. It's backwards and forwards compatible, in fact there is nothing to stop me doing this right now with external OWL files that include the axioms. It is composable in an IMHO elegant way - SSSOM remains a simple format for mappings between named entities, we use OWL (or other formalisms) externally to map named entities to expressions.

For 3 it would be nice to have conventions for hashing expressions, maybe even register a bioregistry entry for this backed by some kind of distributed hash table

it's not clear what the overall benefit is though if someone needs the composition to be able to meaningfully interpret the files, in which case we're back to 0 or 2.
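
To make the `owlhex:<MD5hashOfExpr>` idea a bit more concrete, here is a minimal Python sketch. It assumes the hash is taken over the expression text after collapsing whitespace; that normalization step, the function names, and the exact header line layout are assumptions for illustration only, since no convention has been agreed in this thread.

```python
import hashlib


def normalize(expression: str) -> str:
    """Collapse runs of whitespace; no logical normalization is attempted."""
    return " ".join(expression.split())


def owlhex_curie(expression: str) -> str:
    """Return a hypothetical `owlhex:` CURIE built from the MD5 of the text."""
    digest = hashlib.md5(normalize(expression).encode("utf-8")).hexdigest()
    return f"owlhex:{digest}"


def equiv_header_line(expression: str) -> str:
    """Format a header line in the style sketched above (layout is assumed)."""
    return f'equiv: {owlhex_curie(expression)} "{normalize(expression)}"'


print(equiv_header_line("G and R some D"))
```

Note that a text hash like this treats logically equivalent but differently written expressions (for example, conjuncts in a different order) as different identifiers.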

matentzn commented 2 years ago

Alright, after careful consideration of the pros and cons of the issue, we propose the following:

  1. The main SSSOM datamodel stays simple without support for complex expressions in the subject and object fields. Chris' owlhex proposal above is still available, but this is IMO very... non-beautiful. It may grow on me though, but hopefully not too much.
  2. We abstract the SSSOM datamodel and then provide profiles. The above would be the "simple profile", the default. We would develop a complex profile, which could allow for additional modelling:
    • Complex class expressions modelled using:
      • Multivalued fields for subject, object ids
      • template fields to describe the associated templates
      • template formalism fields to tell tools how to interpret the templates
    • More deeply nested evidence structure, like multiple reviewers for different parts of the mapping process, multiple pieces of evidence like multiple match_fields and others (requires breaking the flat requirement).

We will work with @tudorache to ensure that her requirements are met to that end. I personally will be on extended leave until end of January (from this week on), but after that we can set this process in motion.

For now, I would suggest @tudorache you just collect the complex expressions you want to map, and we will work out together what exactly the "complex profile" will look like.

I hope this makes sense!

samsontu commented 2 years ago

Hi, I'd like to add to @tudorache's request for post-coordination mappings. Some people working with WHO are trying to harmonize terms in WHO classifications (ICD, ICF [for functioning and disability], and ICHI [for health interventions]). For example there are signs and symptoms terms in ICD that are equivalent to ICF body functions and some impairment qualifiers. There are also people working on mappings of various national intervention classifications to ICHI, that need to use conjunction, disjunction, and qualifier post-coordination. I can solicit examples of these mapping requirements.

matentzn commented 2 years ago

@samsontu thanks, some more concrete examples would be great. When you say "conjunction, disjunction, and qualifier post-coordination", do you explicitly mean those in the OWL sense? Or do you mean in a more abstract fashion, without any particular logical formalism in mind (i.e. some use cases would create some Common Logic, others OWL, others some other First Order Logic output), or can we assume that when you do post-composition, you are always talking in terms of OWL 2 class expressions?

matentzn commented 2 years ago

@cmungall just ran an idea by me which sounds extremely crazy at first, but would allow for a clear separation of concerns. I am not saying we go there, it sounds a bit crazy, but it allows us to 1) keep SSSOM simple and 2) offer maximal flexibility to the mapping process.

The idea is this.

  1. We add three fields to sssom: template_system, subject_id_template_data and subject_id_template.
  2. In the primary mapping set, we use the same ids as the identifier field in the subject_id_template.
  3. We recommend to version the three files (template, template_data, sssom mapping file) together, but we can also keep them apart.

There is some risk of the connection between the SSSOM mapping file and the two template files breaking, but we can require the use of versioned PURLs for this field to control the issue.

Advantage of this solution:

Disadvantages:

I think I can get behind a solution like that. Let me know what you all think!

cthoyt commented 2 years ago

Can you reformulate this so all of the extra metadata required to do this kind of transformation lives in a secondary configuration file? Like would it be possible to make a standardized way of doing this that doesn't touch the SSSOM standard at all?

matentzn commented 2 years ago

Less than these three simple mapping set level elements? You can make a proposal but I cannot think of any way that gives at least some kind of integrity to the mapping - template connection.

What is your concern?

graybeal commented 2 years ago

I like the innovation, but I have a concern that may be the same @cthoyt is getting at.

It's possible I've missed something along the way, but to say "System X supports the SSSOM standard", we need the functionality of the SSSOM standard to be clearly defined, understood, and easily implemented. (And my assumption is that the standard is defining a data file format, not a set of supported operational capabilities.)

While I was thinking of it as a table of triples with some prefixes in front of it, easily converted to RDF, I was very confident that BioPortal could take that information and convert it to BioPortal mappings. With each complexity that might get added (additional columns, annotational specifications, indirect automation of information construction), I'm less sure about what is involved. If/when I have a few hours I'll be able to go through the whole thing in detail and maybe it's still straightforward. But in this case, it's clear that supporting the specification would require implementing tooling to recognize and apply the transformations on the fly. Far simpler (for a 'data file standard') would be if all those transformations happened in order to create the SSSOM-compatible file, rather than as a step in interpreting it.

matentzn commented 2 years ago

Unfortunately we are in this situation:

  1. We urgently need a way to support complex mappings (composed subjects)
  2. We urgently do not want to complicate the SSSOM metadata standard

Both are diametrically opposed, and if we cannot agree on a standard way to do it, I will just promote the idea here as a non-standard way to deal with it (because I and my stakeholders need it) - outside the SSSOM standard. This means ad-hoc solutions for representing complex subjects will emerge, which may be fine, but may create difficulties later on.

Rather than opposing any solution for complex subjects (we do need one), I want to encourage you (@cthoyt and @graybeal) to present concrete concerns which can be addressed. I am fine to package the proposal here into an SSSOMC complex extension if I have to, but we need to really weigh the complexity of maintaining two models against the perceived benefits. Right now the proposal

  1. Has zero effect on the rest of the standard
  2. Introduces three simple optional elements which a client infrastructure may choose to ignore without any risk

Except for extremely specialised tooling, no one will ever need to deal with the complex mappings shared like this!

graybeal commented 2 years ago

So:

First you have to tell your ingest software to ignore those elements. Then if you ignore the elements, you don't get the mappings, right? So you think you have processed the mapping file successfully but really you missed some arbitrary (unknown) amount of the content. If I'm not getting that please clarify. (Of course what BioPortal does with a coordinated mapping statement like "Mild diabetic retinopathy" sameAs "Diabetic retinopathy and severity some Mild", I have no idea either. But I digress.)

I think this is a concrete concern: "You are requiring any adopting system to create or integrate complex algorithms to implement post-coordination, with all the potential challenges and inconsistencies that implies". I'm not sure it's a fair one because I'm not sure how complex the algorithm(s) will have to be, if they will have multiple sources of implementation, or require complex installation/integration procedures. But if we can agree this is an SSSOMC extension, what about the following to minimize the programming/devops required and the variability that might ensue?

  1. The additional components are fully expressed within the SSSOM file, not in a separate file. So you have an optional section (at the end of the mappings, let's say) that specifies a template_system (a controlled term I hope, or something equally well defined), a subject_id_template field that contains the entire template (I think it could be in one cell, but if that's too crazy then make it a section), and a subject_id_template_data section that looks like the example you've pointed to.
  2. There is a defined execution process (a command template) that says, for any template_system, the process can be executed by running template_system with the contents of subject_id_template against the subject_id_template_data, and the output that is produced will be the post-coordinated triples in SSSOM form.
  3. Any mapping file explicitly includes whether it is compliant SSSOM or an extended form, and optionally what minimum version of SSSOM is required to process it.

I'll stop here, I know I'm over-designing. I just want to show by example how everything could be tied together and examinable in a single file, eliminating all sorts of coupling and versioning issues; consistent results could be expected across any two systems that have installed the named template_system; the results could be recognizable as mappings; and there could be some control put in place up front that forces all the possible template_systems in the universe to satisfy a common specification for their operation (so, zero-coding installations).

Whatever you can do to minimize the cost of implementation (for systems having to process SSSOM) and the cost of understanding (for users staring at an SSSOM file and wondering what it means) will increase adoption.

cmungall commented 2 years ago

@matentzn I actually wasn't thinking of adding any fields to sssom. All I'm proposing is a simple standard way of serializing or hashing an expression as a CURIE.

there are multiple ways to distribute simple lookup tables alongside sssom - for example a simple templating system that maps tuple + pattern to the composed identifier. but sssom doesn't need to know about these. it's just another id as far as sssom is concerned.

matentzn commented 2 years ago

If you are proposing "a simple standard way of serializing or hashing an expression as a CURIE" then I totally misunderstood you in the call. Ok then, back to square one. What is your proposal then? How will we connect the "simple lookup table" to the sssom mapping file if not through metadata? Through file naming or packaging conventions? Since the subject_id values won't be connected to actual (resolvable) term IRIs, it will have to be linked somewhere to be interpretable - maybe subject_source?

matentzn commented 1 year ago

During a meeting I heard:

onset, disease, complex mappings -> is going to be a big issue moving forward. we want to capture that in SSSOM

Unfortunately I didn't document which meeting. Another reason to be better at recording provenance.

matentzn commented 1 year ago

Ok, so @cmungall correct me if I am wrong. Your proposal is this.

Rather than extending SSSOM, we define a convention based on URI parameters. (Standardising this with the Semantic Web folks around SSSOM will be virtually impossible, but we don't have to - it's just a convention that does not require any changes to SSSOM and everyone can feel free to ignore it)

A complex mapping is defined using a URL query pattern solution.

http://purl.obolibrary.org/obo/oba/patterns/anatomical_trait?anatomical_entity=UBERON:123&quality=PATO:123

That's it. The mapping provider may choose to bind http://purl.obolibrary.org/obo/oba/patterns/anatomical_trait to a service that will unfold the expression to RDF, for example:

curl -H 'Accept: text/turtle' 'http://purl.obolibrary.org/obo/oba/patterns/anatomical_trait?anatomical_entity=UBERON:123&quality=PATO:123'

Which would return:

owl:equivalentClass [
  owl:intersectionOf (
    PATO:123
    [
      a owl:Restriction ;
      owl:someValuesFrom UBERON:123 ;
      owl:onProperty RO:111
    ]
  )
]

If the service is self-describing (defined with LinkML, for example), you could even look up what it does in swagger or some such.

In SSSOM TSV it would look something like this:

subject_id predicate_id object_id mapping_justification
obo.pattern:anatomical_trait?anatomical_entity=UBERON:123&disease=PATO:123 skos:exactMatch OBA:123 semapv:ManualMappingCuration
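
A client that wants to look inside such a subject_id could split it with the standard library. The sketch below assumes the identifier has the shape `prefix:template?name=value&...` exactly as in the row above; the function name `split_pattern_curie` is hypothetical and not part of any SSSOM tooling.

```python
from typing import Dict, Tuple
from urllib.parse import parse_qsl


def split_pattern_curie(curie: str) -> Tuple[str, str, Dict[str, str]]:
    """Split `prefix:template?k=v&...` into prefix, template name and fillers."""
    # Only the first colon separates the prefix; filler CURIEs keep theirs.
    prefix, _, local = curie.partition(":")
    template, _, query = local.partition("?")
    fillers = dict(parse_qsl(query))
    return prefix, template, fillers


subject_id = "obo.pattern:anatomical_trait?anatomical_entity=UBERON:123&disease=PATO:123"
print(split_pattern_curie(subject_id))
```

Using a dict assumes each slot appears at most once; multivalued slots would need the raw list of pairs from `parse_qsl` instead.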

It's not a thing of extreme beauty, but it's practical and would help us serve two goals:

  1. Keep complex mappings away from SSSOM and keep Semantic Web folks happy
  2. Develop a practical system independent of any specific templating infrastructure (DOSDP, OTTR, ROBOT templates) which could resolve the complex subject to something (anything, even a JSON blob!).

jamesaoverton commented 1 year ago

I've been thinking about this kind of technique, largely in the context of https://units-of-measurement.org/.

Canonical Form

There should be only one way to write the pattern URL. I think it should be more opaque rather than more transparent.

obo.pattern:anatomical_trait?anatomical_entity=UBERON:123&disease=PATO:123 is very wordy, and in English, it seems to allow for a different CURIE that would be equivalent: obo.pattern:anatomical_trait?disease=PATO:123&anatomical_entity=UBERON:123.

We could use numeric IDs for patterns and fixed order for the slots: pattern:12345?x=UBERON:123,PATO:123

For even more opaqueness and brevity, we could compress/encode the arguments: pattern:12345#uweE31d. That may be going too far.

Equivalent to Pre-Composed?

What if pattern:12345?x=UBERON:123,PATO:123 is equivalent to an existing OBO term FOO:1234? People have been pre-composing terms for a long time.

Offline Processing

Nico gives the example of using curl to get some Turtle, but we would also need an offline tool to convert batches (potentially large batches) of these into Turtle (or whatever format).

matentzn commented 1 year ago

obo.pattern:anatomical_trait?anatomical_entity=UBERON:123&disease=PATO:123 is very wordy, and in English, it seems to allow for a different CURIE that would be equivalent: obo.pattern:anatomical_trait?disease=PATO:123&anatomical_entity=UBERON:123.

Excellent point about the order resulting in two distinct concepts... Didn't think of that..

I thought about compression (base64 encoding) but felt it took away too much transparency.

We could use numeric IDs for patterns

Hmmm.. I still prefer them to be a bit readable.. You are sacrificing readability for data integration precision here, which I am not too sure I like the balance of.. I would prefer a pattern registry where all the obo.patterns are registered with a readable id.

fixed order for the slots

This is the real elephant in the room. I proposed something like this in my previous suggestions about how to deal with post composition (having a specific field in sssom with the ordered list of fillers for the post composed expression), and @cmungall shot it down.. I am a bit on the fence; I don't like any of the proposals very much so far, so for me it's more about what I can tolerate the most :D

Offline processing 100%, great idea. It would need some knowledge of the patterns being processed (perhaps this obo wide pattern registry I hinted at) but we should provide it as a python library on pypi.. Nice point.

Let's see how @cmungall reacts to the "ordered parameter" suggestion. Your order sensitivity argument is compelling, but you could solve it by requiring alphanumeric ordering of query parameters..
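
The "alphanumeric ordering of query parameters" idea can be sketched in a few lines of Python. This assumes the named-parameter form used earlier in the thread and that filler values are plain CURIEs that need no percent-encoding; the function name `canonical_pattern_curie` is hypothetical.

```python
from urllib.parse import parse_qsl


def canonical_pattern_curie(curie: str) -> str:
    """Rewrite a pattern CURIE so its query parameters are sorted by name."""
    head, sep, query = curie.partition("?")
    if not sep:
        return curie  # no parameters, nothing to reorder
    pairs = sorted(parse_qsl(query, keep_blank_values=True))
    # Values are written back verbatim (assumed to be plain CURIEs).
    return head + "?" + "&".join(f"{name}={value}" for name, value in pairs)


a = "obo.pattern:anatomical_trait?anatomical_entity=UBERON:123&disease=PATO:123"
b = "obo.pattern:anatomical_trait?disease=PATO:123&anatomical_entity=UBERON:123"
print(canonical_pattern_curie(a) == canonical_pattern_curie(b))
```

This removes the ordering ambiguity only; it does nothing about distinct expressions that happen to be logically equivalent.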

cmungall commented 1 year ago

I think the canonical form issue is something that can be addressed:

but there will always be different expressions that have the same extent, including non-anonymous expressions (@jamesaoverton's example of an expression that is later assigned a named class).

This is all unavoidable, but fine. You just don't make the Unique Name Assumption. There can be surrounding tools that will infer equivalence and subsumption between expressions and either other expressions or named entities, and that can be used to normalize files and eliminate trivial matches, but these should be optional, and considered best-effort.

I admit this is a little unsatisfying. While OWL doesn't have a UNA, in most ontologies we have a convention of making a best effort to follow the UNA.

We could make the URLs a bit shorter by assuming a fixed order and treating as a tuple. Serializations like protobuf do this under the hood. For this to work you need guarantees that ordering does not change, you can't later change your mind and insert intermediates or flip things around. This opens up a lot of possibilities for errors.

Of course, named parameters are no cast iron guarantee against analogous errors, but it's unlikely that someone will invert the semantics of (disease,anatomical_entity).

And ultimately the gain is IMO a bit marginal, the URLs are still ugly, and don't follow normal linked data idioms.

I also think we may find tuples limiting. The tuple limitation in DOSDPs has led to pattern proliferation. I think if the system allows for both optional and multivalued (but no nesting) it will cover a broader range of use cases.

Base64 encoding: I am open to that (I originally discounted hashing as not reversible, but base64 or any reversible encoding would work). I take Nico's point about transparency. But it would be very easy to expand on the command line etc. And it's not like In fact I am rapidly warming to this suggestion.

Offline processing: definitely! There should be no need for a dependency on a server. There does need to be a way of resolving a pattern/template/class to some kind of computable description, but that can be entirely static. Of course having a lightweight service would be a nice thing to have, but not a necessity.

Thanks everyone for engaging. This is a hard problem, there are difficult tradeoffs either way. But if we get this right, this solution could work for post-composition in general, not just in SSSOM.

matentzn commented 1 year ago

Ok lets pull some of the pieces of the URL apart. This is the basic grammar:

{registry}/{template}(/{template_system})?[?/]{fillers}

We have the "template portion", which is actually the part that most resembles a proper IRI. The grammar of the template portion is {registry}/{template}, where registry is a service registry that can perform "template instantiation". The server can implement any pattern system they like (from dosdp, to linkml, to ROBOT, to OTTR) - we don't care from a user perspective, but if we ask for "content-type:rdf" the server should be able to return the correctly formatted instance in RDF.
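
One way to read the grammar {registry}/{template}(/{template_system})?[?/]{fillers} is as a regular expression. The Python sketch below assumes the caller already knows the registry base URL and a closed list of template system names; both assumptions, the `PatternRef` type and the function name are illustrative only and not an agreed specification.

```python
import re
from typing import NamedTuple, Optional, Sequence


class PatternRef(NamedTuple):
    registry: str
    template: str
    template_system: Optional[str]
    fillers: str


def parse_pattern_url(
    url: str,
    registry: str,
    template_systems: Sequence[str] = ("dosdp", "robot", "ottr"),
) -> PatternRef:
    """Split a pattern URL according to the grammar above."""
    base = registry.rstrip("/")
    if not url.startswith(base + "/"):
        raise ValueError(f"{url!r} is not under registry {base!r}")
    rest = url[len(base) + 1:]
    systems = "|".join(re.escape(name) for name in template_systems)
    # template, optional /template_system, then '?' or '/' and the fillers
    match = re.fullmatch(
        rf"(?P<template>[^/?]+)(?:/(?P<system>{systems}))?[?/](?P<fillers>.+)",
        rest,
    )
    if match is None:
        raise ValueError(f"{url!r} does not match the pattern grammar")
    return PatternRef(base, match["template"], match["system"], match["fillers"])


print(parse_pattern_url(
    "http://purl.obolibrary.org/obo/oba/patterns/anatomical_trait"
    "?anatomical_entity=UBERON:123&quality=PATO:123",
    "http://purl.obolibrary.org/obo/oba/patterns",
))
```

With "/" allowed as the filler separator, a filler segment that happens to equal a template system name would be misread, which is one reason the sketch needs the closed list.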

{registry}/{template} variants:

Maybe we can punt on the decision of which of the two is better and leave this up to the user - we can negotiate a convention for OBO/biolink etc in a smaller circle.

Now to the filler part:

[/?]{filler} variants:

EDIT 9. November 2022: Added (/{template_system})? as an optional parameter to capture template systems like dosdp, ottr, robot_template, etc.

matentzn commented 1 year ago

Ok, just as a warning, we will probably solve this issue as follows:

  1. No changes to SSSOM, only a documentation page that suggests to use the general architecture {registry}/{template}(/{template_system})?[?/]{fillers} for encoding post-composed expressions according to the above spec.
  2. We will likely not agree on the exact syntax on the {fillers} part, so we will, instead, create a project-specific proposal (Monarch) using either F1 or F1+F3 for the fillers. This is more @cmungall preference, for me F2 is still in the race, but I am happy to first trial this for Monarch specific use cases and see what happens.
  3. Chris suggested that you do not even have to provide a web-service. You could simply provide a static file structure like this:
-- 00001 # or abnormal_anatomical_entity, if you don't like numeric identifiers for templates
---- index.html
---- dosdp #the dosdp file representing the template
---- robot #the robot_template representing the template
---- ottr #the ottr template representing the template

The client can then call http://obofoundry.org/patterns/000001/robot/P2ZpbGxlcnM9VUJFUk9OOjEyMyxQQVRPOjEyMw== to refer to the pattern and instantiate it locally with, for example, robot CLI (or dosdp-tools or whatever). This puts a bit of work on the client side, but makes versioning of the patterns a bit easier (you can just store them in version control, and you do not need API versioning). It's 100% better than what we have right now, which is, no solution.
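
The last path segment of that example URL is plain base64, so a client can recover the fillers with the standard library. A minimal sketch follows; the helper names are hypothetical.

```python
import base64


def decode_fillers(segment: str) -> str:
    """Decode a base64 filler segment back into its readable form."""
    return base64.b64decode(segment).decode("utf-8")


def encode_fillers(fillers: str) -> str:
    """Encode a readable filler string as standard base64."""
    return base64.b64encode(fillers.encode("utf-8")).decode("ascii")


segment = "P2ZpbGxlcnM9VUJFUk9OOjEyMyxQQVRPOjEyMw=="
print(decode_fillers(segment))  # ?fillers=UBERON:123,PATO:123
```

Standard base64 can emit "+" and "/", which are awkward inside a URL path; `base64.urlsafe_b64encode` avoids that and might be the safer choice if this convention is adopted.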

matentzn commented 1 year ago

Anyways, none of this affects SSSOM directly, but it is hugely important to the mapping community to find some way to distribute these. I guess you could be a horrible person and do: http://obofoundry.org/patterns/000001/ofn/P2ZUHIUYAGUJHGUY where P2ZUHIUYAGUJHGUY is a valid OWL class expression in OWL functional syntax :D

jamesaoverton commented 1 year ago

Are you planning to use the OBO PURL system http://purl.obolibrary.org/? If not, why not?

I don't quite understand how the static files would work for a ROBOT template.

matentzn commented 1 year ago

The OBO purl was just an example. Of course I would use the OBO PURL system if it was something OBO related!

The point with the static files is that they would not "work" - they are like executable documentation. If I were to write a client for ROBOT template (which I would), I would basically

  1. Read the static file as a table
  2. Use row 1 (not 2) to find the correct column
  3. Paste the values in the template
  4. run robot template to generate some OWL
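
The four steps above could look roughly like this in Python. The sketch assumes the static file is a tab-separated ROBOT template whose first row holds column names and whose second row holds the template strings; the column names and template strings in the example are made up for illustration, and the function name is hypothetical.

```python
import csv
import io
from typing import Dict


def fill_robot_template(template_tsv: str, fillers: Dict[str, str]) -> str:
    """Append one data row to a ROBOT template, matching fillers by header name."""
    rows = list(csv.reader(io.StringIO(template_tsv), delimiter="\t"))
    header, template_row = rows[0], rows[1]  # row 1: names, row 2: template strings
    unknown = set(fillers) - set(header)
    if unknown:
        raise KeyError(f"no such column(s) in template: {sorted(unknown)}")
    data_row = [fillers.get(column, "") for column in header]
    out = io.StringIO()
    writer = csv.writer(out, delimiter="\t", lineterminator="\n")
    writer.writerows([header, template_row, data_row])
    return out.getvalue()


# Hypothetical template content, for illustration only.
template = "id\tanatomical_entity\tquality\nID\tEC %\tEC %\n"
filled = fill_robot_template(
    template,
    {"id": "EX:1", "anatomical_entity": "UBERON:123", "quality": "PATO:123"},
)
print(filled)
# The result would then be written to a file and passed to something like:
#   robot template --template filled.tsv --output result.owl
```

Matching on the header row rather than on position keeps the client independent of column order, at the cost of requiring stable column names in the published template.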

A webservice system would of course be able to hide the details of the templating system, but possibly at the cost of making versioning and maintenance harder.

callahantiff commented 1 year ago

@matentzn -- hoping to make some headway on this later this week via the examples we discussed for the proposal and here.

tayeb83 commented 1 year ago

Hi @matentzn, we would like to implement sssom as the main model to publish our mappings. As @tudorache wrote, we are going to have multiple mappings from simple entities to complex combinations (icd-11 postcoordination). Can we think of a solution where we can use a blank node, such as:

sssom:subject_id ICD10:J10.0 ;
sssom:object_id _:b .
_:b owl:equivalentClass [
  owl:intersectionOf (
    ICD11:1E30
    ICD11:XN5SG
  )
] .

As a concrete example we can obtain using owl :

mappingcim10_cim11:MappingJ1001E30XN5SG
  a sssom:Mapping ;
  sssom:mapping_justification "Manuel" ;
  sssom:object_id "1E30&XN5SG" ;
  sssom:object_source [
      a owl:Class ;
       owl:intersectionOf (
          <http://id.who.int/icd/release/11/mms/1418788600>
          <http://id.who.int/icd/release/11/mms/753780243>
        ) ;
    ] ;
  sssom:predicate_id "skos:exact" ;
  sssom:subject_id "J10.0" ;
  sssom:subject_source "http://data.esante.gouv.fr/atih/cim10/J10.0"^^xsd:anyURI ;
  rdfs:label "Mapping_J100_1E30XN5SG" ;
.

which can be viewed in TopBraid EDG, for example, like this:

image

Or do you have a better solution?

matentzn commented 1 year ago

@tayeb83 thanks for reaching out. @callahantiff and a few of us will propose a way to capture these kinds of mappings on the 23rd of April (during a SSSOM workshop on "non simple mappings") - if you are interested in working with us on our proposal, you can reach out via email or LinkedIn, and I will share the docs with you (we anticipate some resistance, so we have not shared it yet).

The question is mostly what exactly should go into the object_id slot. We won't reach universal agreement here on SSSOM level, but we hope for a nice convention that strikes a balance between interpretability and expressiveness. We won't suggest embedding the whole anonymous expression in the sssom file - we advocate for decoupling logical concerns from SSSOM in externally defined pattern files.

As an aside, you won't violate the SSSOM spec when base64-encoding a class expression in, say, owl functional syntax and sticking this into the object_id - we will discourage this in our proposal, but, you could, in theory, do this.

tayeb83 commented 1 year ago

@matentzn sure!! Very interested to work with you on it (with my team). It's urgent for us since we have made the choice to manage mappings in https://smt.esante.gouv.fr/ using sssom. My LinkedIn: https://www.linkedin.com/in/tayebmerabti/ Thanks!!

callahantiff commented 1 year ago

Looking forward to working on this with you both!

samsontu commented 1 year ago

Several countries (Canada, Australia, Germany, among them) have projects mapping their ICD-10 national modifications to ICD-11. Recently, a “mapping task force” was convened in the WHO Family of Classifications Network. In its initial meeting, I raised the possibility of using SSSOM to standardize the mapping format produced in these efforts. Post-coordination is a huge issue, as national modifications of ICD-10 are invariably more granular than the international version. I’d be interested in how your proposal can be used for these cases.

With best regards,

Samson

matentzn commented 1 year ago

@samsontu This is very relevant for us to inform our work. Could you share with us 10 mappings in whatever format which use post coordination, so we can make sure our proposal still works for these cases?

@tayeb83 Please let me know how we can drive your issue forward - happy to meet as well (next week)

samsontu commented 1 year ago

I don’t have the mappings myself. I'll forward your request to the relevant people in the WHO-FIC community.

With best regards, Samson

cmungall commented 1 year ago

this may be a good alternative to encoding param value pairs as http params: https://en.wikipedia.org/wiki/JSON%E2%86%92URL

cmungall commented 1 year ago

Linking to slides from SSSOM workshop which relate to this issue https://docs.google.com/presentation/d/1kFD33S_WMgEGmCnT7IjVCeEyKI7OpcUw1ZzRXGqt1hs/edit#slide=id.g22c799fa946_0_0