nfdi4plants / Swate

Excel Add-In for annotation of experimental data and computational workflows.
https://swate-alpha.nfdi4plants.org
MIT License

[BUG] Link to ontology ref in anno headers. #100

Closed (Brilator closed this issue 3 years ago)

Brilator commented 3 years ago

Describe the bug

(Not sure whether this is a bug, feature request, misunderstanding or curiosity.)

How is the link between added annotation building blocks (header names) and their respective ontology refs maintained? This cannot be simply by name, right? Especially for those where "ontology" is picked as the filling variable in the validator.

To Reproduce

  1. Add an annotation building block with an ontology ref to a SWATE xlsx.
  2. Unzip the xlsx.
  3. Search for the added ontology ref in the extracted files.
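
The steps above can also be scripted. This is a minimal sketch, assuming the ontology ref would appear as plain UTF-8 text somewhere inside the workbook archive. The helper name `find_in_xlsx` and the in-memory stand-in workbook are made up for illustration; a real call would pass the path to the annotated xlsx instead.

```python
import io
import zipfile


def find_in_xlsx(xlsx, needle):
    """Return the names of archive members whose raw bytes contain `needle`."""
    hits = []
    with zipfile.ZipFile(xlsx) as archive:
        for name in archive.namelist():
            if needle.encode("utf-8") in archive.read(name):
                hits.append(name)
    return hits


# Stand-in workbook built in memory (hypothetical content, not a real Swate file).
demo = io.BytesIO()
with zipfile.ZipFile(demo, "w") as archive:
    archive.writestr(
        "xl/sharedStrings.xml",
        "<si><t>Parameter [instrument model]</t></si>",
    )
    archive.writestr("xl/styles.xml", "<styleSheet/>")

demo.seek(0)
print(find_in_xlsx(demo, "instrument model"))
```

A plain byte search like this will miss strings that are XML-escaped or split across elements, so an empty result is not proof that the information is absent.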

OS and framework information (please complete the following information):

OS: macOS Big Sur; OS Version: 11.1; MS Excel: Excel Desktop; MS Excel Version: 16.43

Freymaurer commented 3 years ago

I am not sure if I understand your question correctly. So you want to know how we can ensure that the term in the main column is correctly referenced in the hidden columns (Term Source Ref and Term Accession Number)?

Brilator commented 3 years ago

Basically, yes. I’m wondering where the ontology ref is stored. Is it possible (ISA compliant) to add the accession to the column headers?

Brilator commented 3 years ago

Actually, this goes somewhat in the same direction as #96. I just wanted to make sure, before SWATE-annotating large tables, that this info isn't lost.

Freymaurer commented 3 years ago

No problem, just to be sure:

image

In this example, are you wondering how we know that the term accession number is correct for the parameter (Bruker Daltonics HCT Series, in this case), or are you wondering where the information for the header (Parameter [instrument model], as instrument model is an ontology itself) is stored?

Brilator commented 3 years ago

Sorry, I could've shared a screenshot. The latter - where's the info stored?

Freymaurer commented 3 years ago

The only info stored is actually shown in the table, but:

image

In this example I first extended the table with two additional rows and pulled down the Bruker value. Then I removed some letters from the last term, changed one TSR value (MSS) and changed some numbers in the TAN (should be 1000697). Is this the case you worry about? To avoid creating too much of a maintenance overhead, we currently handle these kinds of errors via this function:

image

This will check all terms in the main column (here: Parameter [instrument model]) and, if a term exists, look up the related TSR and TAN in the database. After clicking, the table looks like this:

image

Does this answer your question?

Brilator commented 3 years ago

I didn't mean the values (i.e. info in the rows like Bruker, MS, etc.) but the keys (column headers like instrument model). Where are the TSR and TAN of "instrument model" stored? The "Use related term search" is not based only on the string ("instrument model"), but also on the TAN, or is it? If it were string only, this would have to be limited to one ontology to prevent duplicates.

Freymaurer commented 3 years ago

Ah sorry, currently we do not store TSR and TAN for ontologies in the column header. The "Use related term search" parses the header of the selected column to check for an ontology string. This string is then used for an is_a directed search.

E.g. if I change a letter in "instrument model" to "instrument moel", the search will likely return 0 results, as an entry for "instrument moel" does not exist in the database.
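
To illustrate the string-only behaviour described above, here is a rough sketch. It is not Swate's actual code: the regular expression, the function names and the one-entry `TOY_TERMS` table (with a placeholder accession) are all assumptions made up for this example.

```python
import re

# Assumed header shape: "<building block> [<ontology term>]".
HEADER_PATTERN = re.compile(
    r"^\s*(?P<kind>[^\[\]]+?)\s*\[(?P<term>[^\[\]]+)\]\s*$"
)

# Toy stand-in for the term database; the accession is a placeholder.
TOY_TERMS = {"instrument model": ["XX:0000001"]}


def header_term(header):
    """Return the term inside the square brackets, or None if there is none."""
    match = HEADER_PATTERN.match(header)
    return match.group("term").strip() if match else None


def lookup_by_name(term):
    """Exact-match lookup by name, so a single typo yields no results."""
    return TOY_TERMS.get(term, [])


print(lookup_by_name(header_term("Parameter [instrument model]")))
print(lookup_by_name(header_term("Parameter [instrument moel]")))
```

Because the lookup key is only the name, nothing in the header says which ontology the term came from.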

What are your opinions on having a TSR or TAN saved for these headers? We could add them as a (#tag) in the reference (hidden) columns.

Brilator commented 3 years ago

Oh! Yes, I would definitely suggest adding (tagging) them in the columns and then searching by TAN. How else would SWATE handle duplicates originating from different ontologies?

Freymaurer commented 3 years ago

Currently, if this occurs, both terms would be used for an is_a directed search.

But we should try to avoid duplicates, as this would also mess up a lot of other functions, e.g. the term search shown above: https://github.com/nfdi4plants/Swate/issues/100#issuecomment-775780152
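
The duplicate problem can be shown with a toy example. This is a sketch under made-up data: the two ontologies `AAA` and `BBB`, their accessions and the `Term` record are hypothetical and only serve to contrast a name-based lookup with an accession-based one.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Term:
    name: str
    source_ref: str  # TSR, e.g. the ontology prefix
    accession: str   # TAN, unique across ontologies


# Two hypothetical ontologies that both define a term with the same name.
TERMS = [
    Term("instrument model", "AAA", "AAA:0000001"),
    Term("instrument model", "BBB", "BBB:0000042"),
]


def find_by_name(name):
    """Name-based search: returns every term sharing the name."""
    return [term for term in TERMS if term.name == name]


def find_by_accession(accession):
    """Accession-based search: returns at most one term here."""
    return [term for term in TERMS if term.accession == accession]


print(len(find_by_name("instrument model")))
print(find_by_accession("BBB:0000042"))
```

With the name as key, both entries would feed the directed search; with the accession as key, only the intended one does.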

Freymaurer commented 3 years ago

The next release (probably 0.3.0) will now add a term accession value tag to reference columns: #tXX:aaaaaaa

This tag will be used for parentOntology term directed search and to fill in columns.
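
A sketch of how such a tag could be read back from a reference column header. The exact grammar of the tag is not spelled out here, so this assumes `#t` followed by an ontology prefix, a colon and a local id; the example header text and the function name are hypothetical as well.

```python
import re

# Assumed tag shape: "#t<prefix>:<local id>", e.g. "#tXX:0000001".
TAG_PATTERN = re.compile(r"#t(?P<prefix>[A-Za-z]+):(?P<local_id>\w+)")


def parse_accession_tag(header):
    """Return (prefix, local_id) from the first tag in `header`, or None."""
    match = TAG_PATTERN.search(header)
    if match is None:
        return None
    return match.group("prefix"), match.group("local_id")


# Hypothetical reference column header carrying a placeholder tag.
print(parse_accession_tag("Term Accession Number (#tXX:0000001)"))
```

Searching by the parsed accession instead of the header string would sidestep both the typo case and the duplicate-name case discussed earlier in the thread.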