EBISPOT / DUO

Ontology for consent codes and data use requirements

Would it make sense to keep one DUO.owl source file per language #88

Open jimmyhli opened 3 years ago

jimmyhli commented 3 years ago

Right now, there's support for English and Japanese, and having them in the same OWL file is probably fine. What if we add support for more languages in the future? Would it make more sense for each language to have its own localized copy of the OWL file?

I understand that this might mean more logistical work, but keeping the source in English only might help keep the file lightweight in the long run.

mcourtot commented 3 years ago

The Japanese version is the only one we have for now, so we decided to merge it in at release time. However, the translation source itself is in its own file, see https://github.com/EBISPOT/DUO/blob/master/src/ontology/duo-japanese.owl

The merge happens in the makefile, https://github.com/EBISPOT/DUO/blob/718a288ef032a04297d142138a624654f4c90429/src/ontology/Makefile#L26

We'll do the same for future languages, exactly for the reason you mention, and also so that each translation can refer to a specific dated version of DUO.

jimmyhli commented 3 years ago

Cool, thanks! Is there an equivalent to https://github.com/EBISPOT/DUO/blob/master/src/ontology/duo.owl, but with only English labels? Thanks again

mcourtot commented 3 years ago

The process to create the OWL file is as follows:

  1. duo-edit.owl is the editing version
  2. duo.owl is built by triggering the makefile manually; it merges duo-edit.owl and the Japanese version (the only translation at the moment). The makefile is also used to generate the CSV files, duo-basic, etc.
  3. once ready for release, the files are copied over from /src/ontology to the top-level and tagged/pushed for release

When I edit duo-edit.owl, I usually trigger the makefile because it has a couple of checks built in (e.g. reasoning), but there is no guarantee that this is done. The safest option is therefore to use the duo.owl file at the root of the repository; the files under /src/ontology are working files and may change before release.

Is there a specific reason you want only the English labels?

jimmyhli commented 3 years ago

Oh cool, I see. The reason is the ontology parser I use: it only supports returning one definition per term, and it happens to return the last one, which is the Japanese definition. I don't have a way to make it return the English definition unless the source is in English only.

A related issue was opened at https://github.com/althonos/pronto/issues/104
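
For anyone hitting the same problem, here is a minimal sketch of selecting a definition by language tag with only the Python standard library. It assumes the ontology is serialized as RDF/XML, that each literal carries an `xml:lang` attribute, and that definitions are stored under the `IAO_0000115` annotation property. The sample fragment, the term IRI, and the function name `values_by_language` are all hypothetical and not copied from duo.owl.

```python
import xml.etree.ElementTree as ET

# Fully qualified names in ElementTree's "{namespace}local" notation.
XML_LANG = "{http://www.w3.org/XML/1998/namespace}lang"
RDF_ABOUT = "{http://www.w3.org/1999/02/22-rdf-syntax-ns#}about"
OWL_CLASS = "{http://www.w3.org/2002/07/owl#}Class"
# Assumption: definitions use the IAO "definition" annotation property.
DEFINITION = "{http://purl.obolibrary.org/obo/}IAO_0000115"

# Hypothetical fragment for illustration only.
SAMPLE = """<rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
         xmlns:owl="http://www.w3.org/2002/07/owl#"
         xmlns:obo="http://purl.obolibrary.org/obo/">
  <owl:Class rdf:about="http://example.org/TERM_1">
    <obo:IAO_0000115 xml:lang="en">An example definition.</obo:IAO_0000115>
    <obo:IAO_0000115 xml:lang="ja">定義の例です。</obo:IAO_0000115>
  </owl:Class>
</rdf:RDF>"""


def values_by_language(xml_text, prop=DEFINITION, lang="en"):
    """Map each class IRI to its first `prop` literal tagged with `lang`."""
    root = ET.fromstring(xml_text)
    result = {}
    for cls in root.iter(OWL_CLASS):
        for child in cls.findall(prop):
            # Compare the language tag instead of relying on document order.
            if child.get(XML_LANG) == lang:
                result[cls.get(RDF_ABOUT)] = child.text
                break
    return result


definitions_en = values_by_language(SAMPLE, lang="en")
definitions_ja = values_by_language(SAMPLE, lang="ja")
```

Because the lookup keys on the language tag, the result does not depend on which translation appears last in the file.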

mcourtot commented 3 years ago

Hi @haoyuanli - I have not forgotten about this, and the short answer is that I don't know how best to support your use case. We author DUO in OWL, which is the W3C standard for ontologies, and one of its advantages is multi-language support. It's unfortunate that the tool you are trying to use relies on OBO, but it may make more sense for them to upgrade (OBO is not really maintained very much, and OBO-Edit development has stopped) or for you to use another tool such as ROBOT or the OWL API directly, depending on your needs. The OBO to OWL mapping doesn't seem to cover multiple languages very well (and as far as I remember OBO is ASCII only, so it wouldn't work for Japanese anyway), but I seem to remember we were able to have a couple in the GO. Pinging @cmungall in case he has any insight.

What I would recommend in the meantime is to use the latest official release, which is English only. As we are now starting to think about additional languages, we may also want to produce an English-only version at release time alongside the version containing all languages. The latter would be the version linked from the PURL.
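
As a rough illustration of what producing an English-only file could involve, the sketch below drops every literal whose `xml:lang` tag is not in an allowed set and keeps untagged literals. It is not the DUO makefile's method; it assumes RDF/XML input, and the `MERGED` fragment and the function name `keep_languages` are hypothetical.

```python
import xml.etree.ElementTree as ET

XML_LANG = "{http://www.w3.org/XML/1998/namespace}lang"

# Register common prefixes so the serialized output stays readable.
ET.register_namespace("rdf", "http://www.w3.org/1999/02/22-rdf-syntax-ns#")
ET.register_namespace("rdfs", "http://www.w3.org/2000/01/rdf-schema#")
ET.register_namespace("owl", "http://www.w3.org/2002/07/owl#")

# Hypothetical merged fragment for illustration only.
MERGED = """<rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
         xmlns:rdfs="http://www.w3.org/2000/01/rdf-schema#"
         xmlns:owl="http://www.w3.org/2002/07/owl#">
  <owl:Class rdf:about="http://example.org/TERM_1">
    <rdfs:label xml:lang="en">example term</rdfs:label>
    <rdfs:label xml:lang="ja">用語の例</rdfs:label>
    <rdfs:comment>no language tag</rdfs:comment>
  </owl:Class>
</rdf:RDF>"""


def keep_languages(xml_text, keep=("en",)):
    """Return RDF/XML with literals in other languages removed."""
    root = ET.fromstring(xml_text)
    # Materialize the element list first so removals do not disturb iteration.
    for parent in list(root.iter()):
        for child in list(parent):
            tag = child.get(XML_LANG)
            # Untagged elements (tag is None) are always kept.
            if tag is not None and tag not in keep:
                parent.remove(child)
    return ET.tostring(root, encoding="unicode")


english_only = keep_languages(MERGED)
```

A dedicated ontology tool is likely a better fit for a real release, since a plain XML filter knows nothing about OWL semantics.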

jimmyhli commented 3 years ago

Hi Melanie, thanks again for the reply. I will try the latest official release, and I'm open to alternative tools as well.

mcourtot commented 3 years ago

@haoyuanli odd request, but I can't find your email address - can you please email me at mcourtot [at] ebi.ac.uk?

jimmyhli commented 3 years ago

just did! hehe :)