PavlidisLab / GemBrow

The Gemma Browser
Apache License 2.0
4 stars 1 forks source link

Support negation of entire categories #109

Open ppavlidis opened 3 months ago

ppavlidis commented 3 months ago

I'm not sure how we would present this to users, we've discussed handling negation better before, but it came up as a use case:

Find all experiments that don't involve a disease i.e. lack a disease annotation.

or

Find data sets that have control samples for a Disease factor

We usually hide annotations like "Reference role" from the browser and I think that is still a good call.

That kind of thing might make sense if you want "normal" samples / data sets, by some definition of "normal".

arteymix commented 2 months ago

We need to handle negative subqueries in the backend for this kind of feature to work.

Right now, for a filter that involve a subquery like (i.e. characteristics.category = disease), we do something like:

select * from datasets where id in (select id from datasets join datasets.characteristics  c where c.category = 'disease')

Using a negative clause (i.e. characteristics.category != disease) would result in:

select * from datasets where id in (select id from datasets join datasets.characteristics  c where c.category <> 'disease')

But that's not what is being expected: it will return datasets that have at least one characteristic that does not have the URI in question. Instead, we should do something like:

select * from datasets where id not in (select id from datasets join datasets.characteristics  c where c.category = 'disease')

For the syntax, I'm thinking of something like:

none(characteristics.category = disease) no characteristic match the predicate
any(characteristics.category = disease) at least one characteristic match the predicate (current behavior)
all(characteristics.category = disease) all characteristics match the predicate
arteymix commented 2 months ago

I would start only with any and none. The all can be obtained by negating the filter with none, but we do not support negating the in operator. This will require https://github.com/PavlidisLab/Gemma/issues/1192.

arteymix commented 2 months ago

The backend now supports quantifiers with the syntax above. It's now possible to implement the requested feature.