Open will-moore opened 2 months ago
As indicated previously, the searchengine will not work for the generic mapr.
First bullet-point above: we need to know "How many values are there for key: Organism?" ("childCount": 72). And we may want to filter by e.g. value=Homo%20sapiens, which will likely return "childCount": 1.
For the 2nd bullet-point, we need to know "Give me all the values for key: Organism", and for each value we also want the number of containers and the total number of images in those containers.
I'm not sure whether the current searchengine API has endpoints/queries that can supply these data. cc @khaledk2
The following URL returns JSON that contains all the values, together with the number of images in each bucket (value): https://idr.openmicroscopy.org/searchengine//api/v1/resources/image/searchvaluesusingkey/?key=Organism
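A minimal sketch of calling that endpoint from Python with only the standard library. The function names are hypothetical, and nothing is assumed about the structure of the returned JSON: the caller simply gets the parsed document back.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen

# Endpoint copied verbatim from the comment above.
SEARCHVALUES_URL = (
    "https://idr.openmicroscopy.org/searchengine//api/v1/"
    "resources/image/searchvaluesusingkey/"
)


def searchvalues_url(key):
    """Return the URL that lists all values (buckets) for one key."""
    return SEARCHVALUES_URL + "?" + urlencode({"key": key})


def fetch_values_for_key(key, timeout=10):
    """Fetch and parse the JSON for one key. Performs a network request."""
    with urlopen(searchvalues_url(key), timeout=timeout) as response:
        return json.load(response)
```

For example, `fetch_values_for_key("Organism")` would request the URL quoted above.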
This PR https://github.com/ome/omero_search_engine/pull/63 adds an "in" operator, which allows the following query:
{
  "resource": "image",
  "query_details": {
    "and_filters": [
      {
        "name": "Organism",
        "operator": "in",
        "query_type": "keyvalue",
        "resource": "image",
        "value": "homo sapiens, scapania spitsbergensis, scapania compact, ....."
      }
    ],
    "case_sensitive": false,
    "or_filters": []
  }
}
Submitting it to this URL: https://idr.openmicroscopy.org/searchengine//api/v1/resources/submitquery/containers/ will return the required data for the second point.
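A rough sketch of building and submitting such a query from Python. The helper names are hypothetical, and it assumes the endpoint accepts the query as a JSON POST body, which is not confirmed in this thread.

```python
import json
from urllib.request import Request, urlopen

# Endpoint copied verbatim from the comment above.
SUBMITQUERY_URL = (
    "https://idr.openmicroscopy.org/searchengine//api/v1/"
    "resources/submitquery/containers/"
)


def build_in_query(key, values, case_sensitive=False, resource="image"):
    """Build the query body shown above for an "in" filter on one key."""
    return {
        "resource": resource,
        "query_details": {
            "and_filters": [
                {
                    "name": key,
                    "operator": "in",
                    "query_type": "keyvalue",
                    "resource": resource,
                    # The example above passes all values as one
                    # comma-separated string.
                    "value": ", ".join(values),
                }
            ],
            "case_sensitive": case_sensitive,
            "or_filters": [],
        },
    }


def submit_query(query, timeout=30):
    """POST the query as JSON and return the parsed response.

    Assumption: the endpoint expects a JSON body sent with POST.
    """
    request = Request(
        SUBMITQUERY_URL,
        data=json.dumps(query).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urlopen(request, timeout=timeout) as response:
        return json.load(response)
```

Keeping the body builder separate from the network call makes it easy to check the generated JSON without contacting the server.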
@khaledk2 Thanks. Is https://github.com/ome/omero_search_engine/pull/63 deployed somewhere? I can see apidocs at https://idr.openmicroscopy.org/searchengine/searchengine/apidocs/ but it's not there at idr-testing or idr-next?
One issue I noticed above is that everything in searchengine is lowercase, so there's no Homo sapiens etc. Not sure how to work around that?
@will-moore I have deployed https://github.com/ome/omero_search_engine/pull/63 on idr-testing.
The key/value pairs are case-sensitive; they are saved inside the Elasticsearch indices exactly as they are in the IDR database. The user has the option to make the query case-sensitive or not.
For example, the following query will not return any results because the case-sensitive attribute is set to true. It will return results if the attribute is set to false or if the value is set to Homo sapiens:
{
  "resource": "image",
  "query_details": {
    "and_filters": [
      {
        "name": "Organism",
        "operator": "in",
        "query_type": "keyvalue",
        "resource": "image",
        "value": "homo sapiens, scapania"
      }
    ],
    "case_sensitive": true,
    "or_filters": []
  }
}
The /searchvaluesusingkey endpoint returns the normalized values; we can modify it to return the actual values.
There are a handful of queries that attempt to load a large amount of data on a large server such as IDR.
These can be slow and need to be cached.
One option is to use the https://github.com/ome/omero_search_engine to perform these queries.
Currently, the searchengine is accessed via its web API only (it's not installed in the omero-web Python environment). We don't want to make omero-mapr dependent on the searchengine, since this would be a breaking change for all the mapr users who don't have searchengine installed. So we need to test for the availability of searchengine.
The first step is to list the API calls that are most problematic:
- /mapr/api/organism/count/?page=1&group=3 (e.g. used for root tree node at https://idr.openmicroscopy.org/mapr/organism/), possibly filtering by value: /mapr/api/organism/count/?value=Homo%20sapiens&page=1&group=3 (used by root tree node at https://idr.openmicroscopy.org/mapr/organism/?value=Homo%20sapiens). Both return a similar response.
- /mapr/api/organism/?case_sensitive=false&orphaned=true&experimenter_id=-1&page=1&group=3, possibly filtering by value: /mapr/api/organism/count/?value=Homo%20sapiens&page=1&group=3&_=1715773803746. Used for top-level nodes (children of root) at URLs above.
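The availability test mentioned above could look roughly like the sketch below. Everything here is hypothetical: the function names, the idea of probing a configured base URL with a short timeout, and the fallback callables are illustrations, not existing omero-mapr code.

```python
from urllib.request import urlopen


def searchengine_available(base_url, timeout=2, opener=urlopen):
    """Return True if a GET on base_url succeeds within the timeout.

    An empty base_url means no searchengine is configured.
    """
    if not base_url:
        return False
    try:
        with opener(base_url, timeout=timeout) as response:
            return 200 <= response.status < 300
    except (OSError, ValueError):
        # OSError covers URLError, HTTPError and socket timeouts;
        # ValueError covers malformed URLs.
        return False


def count_values(key, base_url, via_searchengine, via_database):
    """Use the searchengine when it is reachable, else the existing query.

    via_searchengine and via_database are caller-supplied callables
    that take the key and return the result.
    """
    if searchengine_available(base_url):
        return via_searchengine(key)
    return via_database(key)
```

With this shape, installations without a searchengine keep the current behaviour, because `count_values` falls through to the existing database query whenever the probe fails.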