Semantic facet optimization idea: estimate for all possible facet values

Problem Description: Copied from #362

Semantic facet requests can time out, even with the 59 seconds they are allotted. It is unknown how likely users are to use these facets if they take more than n seconds to appear.

An extreme example takes at least 4.4 minutes: the responsible collections facet for an estimated 15 million items curated by Yale University Library (YUL), a boil-the-ocean search that LUX is asked to calculate facets for. Using the techniques described below, this facet request completed in 10 seconds.

Expected Behavior/Solution: For the two semantic facets we have today (responsible units and collections):

1. Start with all possible facet values. Collections is the larger of the two, with 64 values.
2. Incorporating the user's search criteria, calculate the estimate for each facet value.
3. Filter out facet values whose estimate is zero.
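
The three steps above can be sketched roughly as follows. This is only an illustration, not the code in q17-cts-speed-demon.js.txt: `facetValuesByEstimate`, `buildFacetValueQuery`, `fakeEstimate`, and the sample data are all hypothetical names. Within MarkLogic, the injected `estimate` function would presumably wrap `cts.estimate` over a query that combines the user's search criteria with the per-value query; it is injected here so the sketch runs in any JavaScript runtime.

```javascript
// Hypothetical sketch: estimate every known facet value against the user's
// search, then drop the values whose estimate is zero.
function facetValuesByEstimate(allFacetValues, baseQuery, buildFacetValueQuery, estimate) {
  return allFacetValues
    .map((value) => ({
      value,
      // One estimate per candidate facet value.
      count: estimate(baseQuery, buildFacetValueQuery(value)),
    }))
    // Step 3: keep only the values that match at least one item.
    .filter((entry) => entry.count > 0)
    // Most frequent values first (an assumption about the desired order).
    .sort((a, b) => b.count - a.count);
}

// Stand-in data and estimator so the sketch is runnable outside MarkLogic.
const sampleCounts = { "collection-a": 120, "collection-b": 0, "collection-c": 7 };
const fakeEstimate = (baseQuery, valueQuery) => sampleCounts[valueQuery.value] ?? 0;

const facet = facetValuesByEstimate(
  Object.keys(sampleCounts), // step 1: all possible facet values
  { keyword: "journals" },   // placeholder for the user's search criteria
  (value) => ({ value }),    // placeholder per-value query builder
  fakeEstimate               // step 2: estimator, injected
);
console.log(facet);
```

Because the number of candidate values is small and fixed (64 at most today), the cost is bounded by the number of estimates rather than by the number of matching items.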

Results for the extreme example using the above technique:

11.3.0-q17-cts-speed-demon firstRun=2815 warmRuns=10 warmMin=238 warmMax=973 warmAvg=332 stddev=215 totalItemsRead=198

All durations are in milliseconds, so the slowest run (the first one, at 2815 ms) took 2.815 seconds to calculate the responsible collections facet for ~15 million items curated by YUL 💥
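
For anyone comparing runs, the result line above can be converted to seconds with a small helper. This is a sketch that assumes the line is always a label followed by space-separated `key=value` pairs, as in the output shown; `parseBenchmarkLine` and `msToSeconds` are hypothetical names and not part of the benchmark template.

```javascript
// Parse a benchmark result line ("label key=value key=value ...") into an
// object with numeric metric fields.
function parseBenchmarkLine(line) {
  const [label, ...pairs] = line.trim().split(/\s+/);
  const metrics = {};
  for (const pair of pairs) {
    const [key, raw] = pair.split("=");
    metrics[key] = Number(raw);
  }
  return { label, metrics };
}

// Durations are reported in milliseconds.
const msToSeconds = (ms) => ms / 1000;

const result = parseBenchmarkLine(
  "11.3.0-q17-cts-speed-demon firstRun=2815 warmRuns=10 warmMin=238 warmMax=973 warmAvg=332 stddev=215 totalItemsRead=198"
);
console.log(msToSeconds(result.metrics.firstRun)); // 2.815
console.log(msToSeconds(result.metrics.warmAvg)); // 0.332
```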

Results for the responsible collections facet for 675K items matching the "journals" keyword search:

11.3.0-q17-cts-speed-demon firstRun=1684 warmRuns=10 warmMin=149 warmMax=166 warmAvg=154 stddev=5 totalItemsRead=572

Yep, 1.7 seconds with empty caches :)

Once implemented, I don't believe a semantic facet request would ever time out.

Implementation, within the context of the CTS benchmark template: q17-cts-speed-demon.js.txt. CTS queries for both examples are included. Set baseSearchCriteria to journalsBaseSearchCriteria or curatedByYulBaseSearchCriteria.

To implement within LUX, we'd need to update the search criteria within facetsViaSearchConfig.mjs and the approach within facetsLib.mjs.

Requirements: See above.

Needed for promotion: If an item on the list is not needed, it should be crossed off but not removed.

~- [ ] Wireframe/Mockup - Mike~
- [ ] Committee discussions - Sarah
- [ ] Feasibility/Team discussion - Sarah
- [ ] Backend requirements - See the "Expected Behavior/Solution" section.
~- [ ] Frontend requirements - TBD~
- [ ] Are new regression tests required for QA - Amy
- [ ] Questions
List of questions for discussions. Answers should be documented within the issue.

UAT/LUX Examples:

The following are searches. If you are testing this ticket, monitor how long it takes for the responsible units and collections facets to appear (before and after the implementation).

Dependencies/Blocks:

This ticket neither blocks nor is blocked by another ticket.

Related Github Issues:

#362: a slower optimization of the semantic facets, and one that would impose implementation complexities.

Related links:

These links can consist of resources, bugherds, etc.

Wireframe/Mockup:

N/A

project-lux / lux-marklogic

Semantic facet optimization idea: estimate for all possible facet values #365
