mapping-commons / sssom

Simple Standard for Sharing Ontology Mappings
https://mapping-commons.github.io/sssom/
BSD 3-Clause "New" or "Revised" License

Manage SSSOM CV as LinkML enums #65

Closed: matentzn closed this issue 2 years ago

matentzn commented 3 years ago

I am thinking of ditching https://github.com/mapping-commons/SSSOM/blob/master/sssom_vocab.tsv in favour of LinkML enums. That would make it easier to use simple strings such as HumanCurated rather than the annoying SSSOM:HumanCurated. There are so few terms in there, and even if we add 50, the schema can take it IMO. Any objections?

cthoyt commented 3 years ago

I kind of like having all of the different terms namespaced. Plus, making this rely more on LinkML means that fewer external programs will be able to use it easily.

matentzn commented 3 years ago

Intrinsically they could still be namespaced, via the default sssom namespace.

matentzn commented 3 years ago

So far, I have these arguments for using enums:

  1. For downstream tools to make proper sense of the match types, the types should be fixed, not extensible. It would be a major hassle to have to track down people's extensions and decide what to do with them and where they belong.
  2. These lists should be small to be useful.
  3. They should not blow up the size of the TSV needlessly. Why write SSSOMC:HumanCurated if HumanCurated means exactly the same thing, given that the column will be interpreted as being in the SSSOMC namespace?
  4. I consider these part of the SSSOM meta model. From a management perspective, I would not like to see it split over too many files.
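
To make point 3 concrete, here is a minimal sketch of how a tool could read a bare value as living in a default namespace. It is an illustration only: the function name `expand_term`, the constant names, and the term `Lexical` are assumptions made for this example (only `HumanCurated` and the `SSSOMC` prefix appear in this thread), and this is not how the SSSOM toolkit is implemented.

```python
# Hypothetical names: the term list is illustrative and not copied
# from the SSSOM schema. Only "HumanCurated" and "SSSOMC" come from the thread.
DEFAULT_PREFIX = "SSSOMC"
ALLOWED_TERMS = frozenset({"HumanCurated", "Lexical"})


def expand_term(value: str, prefix: str = DEFAULT_PREFIX) -> str:
    """Return the prefixed form of a bare or already-prefixed term.

    A bare value such as "HumanCurated" is interpreted in the default
    namespace. Unknown terms and foreign prefixes are rejected, which
    mirrors the "fixed, not extensible" argument in point 1.
    """
    head, sep, tail = value.partition(":")
    if sep:
        # The value carries an explicit prefix; it must be the default one.
        if head != prefix:
            raise ValueError(f"unexpected prefix {head!r}, expected {prefix!r}")
        local = tail
    else:
        local = value
    if local not in ALLOWED_TERMS:
        raise ValueError(f"{local!r} is not a permitted term")
    return f"{prefix}:{local}"
```

With this reading, `expand_term("HumanCurated")` and `expand_term("SSSOMC:HumanCurated")` give the same result, so the shorter form loses no information in the TSV.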

cmungall commented 3 years ago

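> Plus, making this rely more on linkml means that fewer external programs will be easily able to use it

How so? It would make it more usable. We could still generate whatever TSVs we have now. But in addition we would have

* a standard computable representation

* autogeneration of python enums

* constraints directly in the rendered json schema

* ...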

cmungall commented 3 years ago

The main argument against is that if I want to "locally extend" the set of enums, the toolchain will not work (an error will be thrown when trying to instantiate the Python model). But this could be seen as a benefit.
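
The trade-off can be sketched with Python's standard `enum.Enum` as a stand-in. This is not the code LinkML generates; the class name `MatchType`, the helper `parse_match_type`, and the member `Lexical` are assumptions for illustration. The point is only that a closed value set rejects a locally invented term when the value is parsed.

```python
from enum import Enum


class MatchType(Enum):
    """Closed set of match types (illustrative members only)."""

    HumanCurated = "HumanCurated"
    Lexical = "Lexical"


def parse_match_type(raw: str) -> MatchType:
    """Convert a raw string into a MatchType or fail loudly."""
    try:
        # Enum lookup by value raises ValueError for unknown values.
        return MatchType(raw)
    except ValueError as err:
        allowed = ", ".join(member.value for member in MatchType)
        raise ValueError(
            f"{raw!r} is not a permitted match type (allowed: {allowed})"
        ) from err
```

Here `parse_match_type("HumanCurated")` succeeds, while a local extension such as `parse_match_type("MyLabCurated")` raises `ValueError`. Whether that failure is a nuisance or a safeguard is exactly the question raised above.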

cthoyt commented 3 years ago

> > Plus, making this rely more on linkml means that fewer external programs will be easily able to use it
>
> How so? It would make it more usable. We could still generate whatever TSVs we have now. But in addition we would have
>
> * a standard computable representation
>
> * autogeneration of python enums
>
> * constraints directly in the rendered json schema
>
> * ...

If that's the case, then I don't have any concerns! Thanks for clarifying.
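
As a rough sketch of what "constraints directly in the rendered json schema" could look like for an external program, the fragment below builds a JSON Schema-style object that uses the standard `enum` keyword and checks records against it by hand. The schema is hand-written for illustration and is not the actual output of the LinkML generator; the field name `match_type`, the term `Lexical`, and both helper functions are assumptions.

```python
import json

# Illustrative value set; only "HumanCurated" is taken from the thread.
MATCH_TYPES = ["HumanCurated", "Lexical"]


def build_schema(values):
    """Build a small JSON Schema-style dict with an enum constraint."""
    return {
        "type": "object",
        "properties": {
            "match_type": {"type": "string", "enum": list(values)},
        },
        "required": ["match_type"],
    }


def is_valid(record, schema):
    """Hand-rolled check covering only 'required' and string 'enum'.

    A real consumer would use a full JSON Schema validator instead.
    """
    for name in schema["required"]:
        if name not in record:
            return False
    for name, rule in schema["properties"].items():
        if name in record:
            value = record[name]
            if not isinstance(value, str) or value not in rule["enum"]:
                return False
    return True


SCHEMA = build_schema(MATCH_TYPES)
# Serialised form that a non-Python tool could consume.
SCHEMA_TEXT = json.dumps(SCHEMA, indent=2)
```

Because the constraint travels inside the schema document itself, a program in any language with a JSON Schema validator could enforce the same value set without depending on LinkML.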

matentzn commented 2 years ago

We made a 360 on this issue and now have this managed here: https://github.com/mapping-commons/semantic-mapping-vocabulary