nih-cfde / cfde-deriva

Collaboration point for miscellaneous CFDE-deriva scripts

Reenable search result counts in portal #217

Open ACharbonneau opened 3 years ago

ACharbonneau commented 3 years ago

Summary

For quite some time, the portal has disabled the calculation and display of search result counts to avoid resource exhaustion and timeout errors. We would like to restore the counts display as it is very helpful to the data exploration UX.

Status

Work continues on this, focusing on a different query and DB denormalization strategy. The work is broader than counts and should speed up recordset searches overall, with an initial goal of reenabling dynamic counts on top of these faster queries.

We're integrating under the umbrella of "array ops" at the engineering level:

The objective of these changes is to change the query regime used by the deriva stack to minimize (and in many cases eliminate) the need for table joins. Instead, facets will express existentially quantified value-list constraints against multiple array columns in the core C2M2 table (e.g. file or biosample), and PostgreSQL will be able to use a query plan that intersects the indexes on the per-facet array columns.
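
As a rough illustration of the query shape described above (not the actual deriva implementation), the sketch below builds a join-free count query in which each facet becomes one array-overlap (`&&`) test against an array column. The function name `build_count_query`, the column names `anatomy_ids` and `assay_ids`, and the facet values are hypothetical. The `%s` placeholders assume a DB-API driver such as psycopg2 that adapts Python lists to PostgreSQL arrays, and the index intersection assumes each array column has a GIN index.

```python
import re
from typing import Dict, List, Sequence, Tuple

# Plain SQL identifiers only; this sketch interpolates names into the query.
_IDENT = re.compile(r"^[A-Za-z_][A-Za-z0-9_]*$")


def build_count_query(
    table: str, facets: Dict[str, Sequence[str]]
) -> Tuple[str, List[List[str]]]:
    """Build a join-free count query over array columns (hypothetical helper).

    Each facet maps an array column name to the values selected in that
    facet. `col && %s` is PostgreSQL's array-overlap operator: it is true
    when the row's array shares at least one element with the parameter,
    i.e. "there exists a matching value" for that facet.
    """
    for name in (table, *facets):
        if not _IDENT.match(name):
            raise ValueError(f"unsafe identifier: {name!r}")

    clauses: List[str] = []
    params: List[List[str]] = []
    for column, values in facets.items():
        if not values:
            continue  # an empty facet selection constrains nothing
        clauses.append(f"{column} && %s")
        params.append(list(values))

    # Facets are ANDed together; with no facets the query counts every row.
    where = " AND ".join(clauses) if clauses else "TRUE"
    return f"SELECT count(*) FROM {table} WHERE {where}", params


# Hypothetical facet selection: one anatomy term, two assay terms.
sql, params = build_count_query(
    "file",
    {"anatomy_ids": ["anatomy-1"], "assay_ids": ["assay-1", "assay-2"]},
)
```

Because every constraint targets a column of the same table, the planner can combine the per-column indexes rather than joining through association tables, which is the behavior the paragraph above is aiming for.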

Original issue text

The portal does not tell the user a count of how many records match the current query criteria.

I suspect this was part of Karl's optimization work, and I vaguely remember discussing it, but I don't remember the details. Unfortunately, it makes it impossible to tell how many results I have when I'm searching unless I get it below 25. That's really a problem. We need to find a way to give the user an idea of what they've searched without making the portal crash :/

karlcz commented 3 years ago

That's right, this was to avoid the expensive full scans which generate counts. It is trivial to turn back on, but it's an all-or-nothing choice I am afraid. We do not have any cheaper way to do it on the drawing board...

ACharbonneau commented 3 years ago

Given that we don't have any real users right now, I think I'd err on the side of not crashing the portal, but I would like to try to get something on the drawing board :)

RLC-DCPPC commented 2 years ago

Need this if at all possible for April demo (maybe on collections page)

RLC-DCPPC commented 2 years ago

This is related to a deliverable due by April 30, 2023

karlcz commented 2 years ago

I've updated the issue description with a summary of the current work on this result counts topic.

karlcz commented 1 year ago

The initial work for this is deployed in the app-dev catalog "1" test environment. This uses the new "fast filter" queries for the main result set and the result set count in the recordset app.

We are continuing to investigate additional optimizations we might be able to apply in the chaise UI to make more use of the new "fast filter" query forms. This could reduce the cost of dynamic updates to the individual faceting controls and selection modals from the left filtering side-bar.

karlcz commented 1 year ago

Another round of optimizations was deployed to app-dev. This needs more real-world testing for performance with different usage of the recordset app, but looks good so far.