nfdi4plants / Swate

Excel Add-In for annotation of experimental data and computational workflows.
https://swate-alpha.nfdi4plants.org
MIT License

[BUG?] Does not parse underscores in ontology accessions. #210

Closed Freymaurer closed 2 years ago

Freymaurer commented 2 years ago

Describe the bug

I am currently working with some placeholder terms, and I named some of them following the scheme "KF_PH:001".

Then I noticed that Swate's ontology accession parsing does not work as expected with underscores. Now the question is: are underscores allowed in accessions? @Brilator @kdumschott @AngelaKranz

Reminder for me where the issue lies:

https://github.com/nfdi4plants/Swate/blob/f3be11c6955e512e8b0e7acfc3b13a2b2144f88c/src/Shared/Regex.fs#L45

[:_] will match the first _, and the letters that follow then match because of [a-zA-Z0-9]+ (emphasis on "a-zA-Z").
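To make the effect concrete, here is a minimal Python sketch. The pattern below is an assumption reconstructed only from the two fragments mentioned above ([:_] and [a-zA-Z0-9]+); it is not copied from Swate's Regex.fs, and the function name is hypothetical.

```python
import re

# ASSUMPTION: a simplified stand-in for the accession pattern, built only from
# the fragments quoted in the comment above. Not the actual Swate regex.
ASSUMED_PATTERN = re.compile(r"(?P<idspace>[a-zA-Z0-9]+)[:_](?P<localid>[a-zA-Z0-9]+)")

def first_accession(text):
    """Return (idspace, localid) of the first match, or None if nothing matches."""
    m = ASSUMED_PATTERN.search(text)
    return (m.group("idspace"), m.group("localid")) if m else None

print(first_accession("MS:1000031"))  # ('MS', '1000031')
print(first_accession("KF_PH:001"))   # ('KF', 'PH')
```

With a pattern of this shape, the underscore is consumed as the separator, so "KF_PH:001" is split into "KF" and "PH" and the numeric part after the colon is never reached.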

Brilator commented 2 years ago

Do not use CamelCase, do_not_use_underscores

https://obofoundry.org/principles/fp-012-naming-conventions.html

Brilator commented 2 years ago

Plus: I'd recommend opening the ontology file (obo or owl) in protege once in a while for (automatic) validation.

Freymaurer commented 2 years ago

@Brilator Thanks for the fast reply! I looked into the link you provided, but for me it reads as follows:

primary label/accession: rdfs:label. In my example case rdfs is KF_PH and label is 001. Both together are called primary label.

write labels, synonyms, etc as if writing in plain English text. ie use spaces to separate words, only capitalize proper names (e.g. Parkinson disease). Do not use CamelCase, do_not_use_underscores

does not seem to refer to primary labels, but to what we call "term names", for example "instrument model".

Could this be correct? Or could you please point me to a more clearly stated definition?

Brilator commented 2 years ago

Sorry, that was too quick a Google copy/paste.

I just tested adding a term with an ID containing an underscore, and Protégé didn't complain. I'm still not sure whether this is generally allowed; it might depend on context.

Now the question is: which part of "swate ontology accession parsing" does not work, and what (converter?) is involved?

Brilator commented 2 years ago

Maybe also check https://obofoundry.org/id-policy

Freymaurer commented 2 years ago

The issue is a bit too in-depth to explain here. I just need to know whether I have to rewrite the logic to support underscores 😅 Or, to rephrase: if underscores are allowed, I MUST rewrite it; if they are not allowed, I MUST NOT rewrite it 😄

Freymaurer commented 2 years ago

From the code shown here, s@(([A-Za-z_]*):(\d+))@http://purl.obolibrary.org/obo/$1_$2@, the ID space may contain only letters and underscores ([A-Za-z_]), not digits:

This does not look good for our nfdi4pso.
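A small Python sketch of why this matters. The character classes are taken from the substitution quoted above; the function name, the simplified grouping (two groups instead of the nested ones), the full-string anchoring, and the sample accession "nfdi4pso:0000001" are assumptions made for illustration only.

```python
import re

# Character classes as in the quoted substitution: the ID space allows letters
# and underscores, the local ID allows digits only.
OBO_CURIE = re.compile(r"([A-Za-z_]*):(\d+)")

def to_purl(curie):
    """Hypothetical helper: map 'IDSPACE:LOCALID' to an OBO PURL, or return None."""
    m = OBO_CURIE.fullmatch(curie)  # ASSUMPTION: require the whole string to match
    if m is None:
        return None
    return f"http://purl.obolibrary.org/obo/{m.group(1)}_{m.group(2)}"

print(to_purl("KF_PH:001"))         # http://purl.obolibrary.org/obo/KF_PH_001
print(to_purl("nfdi4pso:0000001"))  # None
```

Under these assumptions, an underscore in the ID space passes, but the digit in "nfdi4pso" does not.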

@muehlhaus maybe this should be a discussion point at a future meeting. For now I will keep support for numbers in Swate's internal parsing.

Freymaurer commented 2 years ago

I will close this and open a new issue in the ontology repo, because "nfdi4pso" contains a number.