DigitalSlideArchive / digital_slide_archive

The official deployment of the Digital Slide Archive and HistomicsTK.
https://digitalslidearchive.github.io
Apache License 2.0
105 stars, 49 forks

Provision for search query with OR operation between metadata keys #253

Closed (aatif1992 closed this issue 1 year ago)

aatif1992 commented 1 year ago

Digital Slide Archive provides Quick Search over metadata fields, and a single query can span multiple keys.

As an example, the query key:study_id s123 key:tissue_type nervous searches for images whose metadata has study_id=s123 AND tissue_type=nervous. When multiple keys are specified, the search by default filters images by the conjunction (AND operator) of the metadata key values.

Does search support queries with an OR operator? E.g. "key:tissue_type nervous OR key:tissue_type muscle" to search for images matching either nervous or muscle tissue.

manthey commented 1 year ago

The database supports this, and there are some API endpoints that can have custom queries constructed to do so, but it isn't exposed in any manner to the UI.

We'd have to extend the DSL used for the query that gets parsed here.

aatif1992 commented 1 year ago

Thanks for the reply! Can you please provide a reference to those API endpoints for creating custom queries?

manthey commented 1 year ago

There are specific endpoints for passing a mongo query through to items or folders (still with full permission logic). For instance, the item endpoint is GET item/query (see code https://github.com/DigitalSlideArchive/HistomicsUI/blob/master/histomicsui/rest/system.py#L164). As an example: {"$or": [{"meta.tissue_type": "nervous"}, {"meta.tissue_type": "muscle"}]}.
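A minimal sketch of how such an OR filter could be assembled for any number of values of one metadata key. The helper name or_query is hypothetical (it is not part of Girder or HistomicsUI); only the resulting filter shape comes from the example above.

```python
import json


def or_query(key, values):
    """Build a Mongo $or filter matching any of `values` for one metadata key.

    Hypothetical helper for illustration; not part of any DSA/Girder API.
    """
    # Item metadata is stored under "meta", so the key gets that prefix.
    return {"$or": [{f"meta.{key}": value} for value in values]}


query = or_query("tissue_type", ["nervous", "muscle"])
# Serialize to a JSON string, which is the form the query parameter expects.
print(json.dumps(query))
```

The printed string has the same structure as the example filter above and can be passed as the query parameter of GET item/query.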

dgutman commented 1 year ago

Two key points to note here:

1) This uses Mongo queries against the database. Be careful about serializing/deserializing your query to/from JSON. Depending on whether you're using the API directly via the Swagger interface or using girder_client to hit the query endpoint, I've had some annoying (and initially confusing) cases where what I thought was a JSON object was interpreted as a string instead, and so the query did not work.

2) Note the meta.tissue_type, i.e. the meta prefix. We are querying what's attached to item.meta, so every query has to have the meta. prefix before the key. This is nice, though, as you can query meta.stain.antibody_vendor if you have more complicated types of metadata attached to an item.
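The serialization pitfall in the first point can be reproduced with the standard library alone. This is a sketch of two plausible ways a query ends up as "a string instead of a JSON object"; whether either matches the exact cases described above is an assumption, and the vendor value "ExampleVendor" is made up.

```python
import json

# Filter mixing a flat key and a nested key under item.meta.
query = {"$or": [{"meta.tissue_type": "nervous"},
                 {"meta.stain.antibody_vendor": "ExampleVendor"}]}

as_json = json.dumps(query)       # valid JSON text (double-quoted keys)
as_repr = str(query)              # Python repr (single-quoted): not JSON
double_encoded = json.dumps(as_json)  # JSON text encoded a second time


def is_valid_json(text):
    """Return True if `text` parses as JSON."""
    try:
        json.loads(text)
        return True
    except json.JSONDecodeError:
        return False


print(is_valid_json(as_json))
print(is_valid_json(as_repr))
# Decoding a double-encoded query yields a str, not a dict.
print(type(json.loads(double_encoded)).__name__)
```

Serializing exactly once with json.dumps, and never with str(), keeps the server-side decode a JSON object.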

-- David A Gutman, M.D. Ph.D. Associate Professor of Neurology Emory University School of Medicine

aatif1992 commented 1 year ago

Thanks @manthey for precisely pointing to the API for this use case. Thanks @dgutman for your point about meta subkeys.

For future reference:

  1. One needs to GET the JSON returned by the /item/query API call for the query with OR: param_query = {"$or": [{"meta.tissue_type": "nervous"}, {"meta.tissue_type": "muscle"}]} Example API call: "http://localhost:8080/api/v1/item/query?query=" + param_query + "&limit=50&sort=_id&sortdir=1", where param_query must first be serialized to a JSON string and URL-encoded.

  2. Parse the response JSON for item ids (key "_id"), which can be presented as item results: "http://localhost:8080/#item/" + item_id
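The two steps above can be sketched with the Python standard library. The host and paths are taken from the example URLs; the function names are hypothetical. It is assumed that the response body is a JSON list of item documents, and that no authentication is needed (no token is sent, so only items readable without login would be returned).

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen

API_ROOT = "http://localhost:8080/api/v1"  # from the example call above
UI_ROOT = "http://localhost:8080"


def build_query_url(query, limit=50):
    """Step 1: build the item/query URL with a JSON-encoded, URL-escaped filter."""
    params = {
        "query": json.dumps(query),  # serialize the Mongo filter exactly once
        "limit": limit,
        "sort": "_id",
        "sortdir": 1,
    }
    return f"{API_ROOT}/item/query?{urlencode(params)}"


def item_links(items):
    """Step 2: turn item documents into UI links using their "_id" field."""
    return [f"{UI_ROOT}/#item/{item['_id']}" for item in items]


def search(query, limit=50):
    """Run both steps against a live server (assumes a JSON list response)."""
    with urlopen(build_query_url(query, limit)) as response:
        return item_links(json.load(response))


query = {"$or": [{"meta.tissue_type": "nervous"},
                 {"meta.tissue_type": "muscle"}]}
print(build_query_url(query))
```

Calling search(query) against a running instance would return the list of item page URLs; it is left uncalled here because it needs a live server.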