PavlidisLab / Gemma

Genomics data re-analysis
Apache License 2.0

Origin of terms in the inferredTerms component #1125

Open oganm opened 4 months ago

oganm commented 4 months ago

In the current design, all inferred terms are returned in a single array, for example:

https://staging-gemma.msl.ubc.ca/rest/v2/datasets?filter=allCharacteristics.valueUri%20%3D%20http%3A%2F%2Fpurl.obolibrary.org%2Fobo%2FCL_0000540

This means it is not possible to identify which query term an inferred term originated from without parsing the generated filter, and even then we'd have to make assumptions about how the original filter was constructed. It would be helpful to see why a particular term shows up where it does.

arteymix commented 4 months ago

Terms are grouped by OR clauses for querying their children. The best I can do in this scenario is to indicate which subclause an inferred term was derived from.

Inferred terms could also overlap multiple subclauses.
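To illustrate the overlap case, here is a minimal sketch in Python. It is not how Gemma does it: the toy ontology, the `ex:` URIs and the function names are all made up for the example. It tags each inferred term with the index of every OR subclause it derives from, so a term reachable from two subclauses carries both indices.

```python
# Hypothetical toy ontology: parent term URI -> direct child URIs.
CHILDREN = {
    "ex:neuron": ["ex:motor_neuron", "ex:interneuron"],
    "ex:motor_neuron": ["ex:alpha_motor_neuron"],
    "ex:spinal_cell": ["ex:motor_neuron"],
}


def descendants(term):
    """Return all transitive children of a term, excluding the term itself."""
    found = set()
    stack = [term]
    while stack:
        current = stack.pop()
        for child in CHILDREN.get(current, []):
            if child not in found:
                found.add(child)
                stack.append(child)
    return found


def subclauses_by_inferred_term(subclauses):
    """Map each inferred term to the sorted indices of the subclauses it derives from."""
    origins = {}
    for index, clause_terms in enumerate(subclauses):
        for term in clause_terms:
            for inferred in descendants(term):
                origins.setdefault(inferred, set()).add(index)
    return {term: sorted(indices) for term, indices in origins.items()}


# Two subclauses, each an OR group of queried terms (assumed input shape).
result = subclauses_by_inferred_term([["ex:neuron"], ["ex:spinal_cell"]])
```

Here `ex:motor_neuron` ends up attached to both subclause 0 and subclause 1, while `ex:interneuron` only belongs to subclause 0.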

oganm commented 4 months ago

I assume the subclause indicator here would require one to have the original input at hand to make sense? What I was envisioning was something like

{"http://term_included_in_the_original_query": [{"valueUri": "---", "value": "---", ...}]}

How would you specify a subclause instead?
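For reference, a rough Python sketch of how the shape above could be built. The ontology fragment, the `http://example.org/...` URIs and the function name are placeholders invented for the example, not anything from Gemma. The idea is just that each term from the original query becomes a key, and its value is the list of terms inferred from it.

```python
# Hypothetical ontology fragment: term URI -> (label, direct child URIs).
ONTOLOGY = {
    "http://example.org/neuron": ("neuron", ["http://example.org/motor_neuron"]),
    "http://example.org/motor_neuron": ("motor neuron", []),
    "http://example.org/glial_cell": ("glial cell", ["http://example.org/astrocyte"]),
    "http://example.org/astrocyte": ("astrocyte", []),
}


def inferred_terms_by_origin(query_uris):
    """Return {original query term URI: [inferred term objects]}."""
    grouped = {}
    for origin in query_uris:
        inferred = []
        seen = set()
        # Start from the direct children so the origin itself is not listed.
        stack = list(ONTOLOGY[origin][1])
        while stack:
            uri = stack.pop()
            if uri in seen:
                continue
            seen.add(uri)
            label, children = ONTOLOGY[uri]
            inferred.append({"valueUri": uri, "value": label})
            stack.extend(children)
        grouped[origin] = inferred
    return grouped


response = inferred_terms_by_origin(
    ["http://example.org/neuron", "http://example.org/glial_cell"]
)
```

With this layout a client can look up `response["http://example.org/neuron"]` directly, without needing the original filter string.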

arteymix commented 4 months ago

I proposed exposing the filter as a VO, so that you do not need to parse it. I might include the inferred terms in that data structure instead of in an ad-hoc object.
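A possible shape for such a structure, sketched in Python with dataclasses. The class and field names (`FilterVO`, `SubClauseVO`, `inferredTerms`, and so on) are assumptions made for illustration and do not reflect the actual Gemma API; the inferred term URI is a placeholder. Only the filter property and the queried URI are taken from the example request above.

```python
from dataclasses import asdict, dataclass, field
from typing import List


@dataclass
class TermVO:
    # Field names mirror the JSON keys used earlier in this thread.
    valueUri: str
    value: str


@dataclass
class SubClauseVO:
    # One element of an OR group, with the terms inferred from it attached.
    propertyName: str
    operator: str
    requiredValue: str
    inferredTerms: List[TermVO] = field(default_factory=list)


@dataclass
class FilterVO:
    # Hypothetical: a flat list of OR-ed subclauses.
    subclauses: List[SubClauseVO]


filter_vo = FilterVO(
    subclauses=[
        SubClauseVO(
            propertyName="allCharacteristics.valueUri",
            operator="=",
            requiredValue="http://purl.obolibrary.org/obo/CL_0000540",
            # Placeholder inferred term, not a real ontology entry.
            inferredTerms=[TermVO("http://example.org/motor_neuron", "motor neuron")],
        )
    ]
)

# asdict() recurses into nested dataclasses, giving a JSON-ready dict.
payload = asdict(filter_vo)
```

Keeping the inferred terms next to the subclause that produced them means the origin is visible without parsing the filter string or matching indices back to the input.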