project-lux / lux-marklogic

Code, issues, and resources related to LUX MarkLogic

Fix Returning Dates For Facets (from 1102) #43

Open gigamorph opened 4 months ago

gigamorph commented 4 months ago

Current state: If you use date facets, the list of years you get back can end too early. Facets only returns start dates.

Expected behavior: Return both start dates and end dates so the results are more accurate.

Note:

We will also need to add a new index field for all dates; right now they only have a begin string, not an end string.

Dependency: THIS TICKET MUST BE DEPLOYED WITH https://github.com/project-lux/lux-frontend/issues/54

Old ticket for 54: https://git.yale.edu/lux-its/lux-web/issues/1878

Next steps

brent-hartwig commented 3 months ago

This Ticket

Facets only returns start dates.

Very true. We could address this by:

  1. Identify the end dates and ensure their string values are indexed.
  2. Modify facetConfig.mjs to support multiple indexes per facet, whether through a single property or a second one. In other contexts, we renamed the indexReference string property to the indexReferences string array property, although the next item may encourage us to go with a second property instead.
  3. If necessary, update generateRemainingSearchTerms.mjs' use of facetConfig.mjs.
  4. Update facetLib.mjs' call to cts.fieldValues to use all indexes associated with a facet.
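
Steps 2 and 4 could be sketched roughly as follows. This is a minimal illustration, not the actual facetConfig.mjs or facetLib.mjs code: the config object shape, the facet names, the end-date index name (agentBornEndDateStr), the agentGenderId index name, and the getFacetIndexReferences helper are all assumptions made for the example. Only the indexReference and indexReferences property names and agentBornStartDateStr come from this thread.

```javascript
// Hypothetical facet configuration; the real facetConfig.mjs may be shaped differently.
const FACETS_CONFIG = {
  // Assumed facet name and assumed end-date index name.
  agentBornDate: {
    indexReferences: ['agentBornStartDateStr', 'agentBornEndDateStr'],
  },
  // A facet still using the single-string property (assumed names).
  agentGender: { indexReference: 'agentGenderId' },
};

// Hypothetical helper: return every index associated with a facet as an array,
// accepting either the array property or the older single-string property.
function getFacetIndexReferences(facetName, config = FACETS_CONFIG) {
  const facet = config[facetName];
  if (!facet) {
    throw new Error(`Unknown facet: ${facetName}`);
  }
  if (Array.isArray(facet.indexReferences)) {
    return facet.indexReferences;
  }
  return facet.indexReference ? [facet.indexReference] : [];
}
```

Inside MarkLogic, the returned array could then be passed as the first argument to cts.fieldValues, which accepts multiple field names.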

Optimization

Whether as part of this ticket's scope or another...

Thinking about the above has led to an optimization idea as well. If the frontend only needs the first and last dates, the facets endpoint could offer a parameter saying as much, yet not be limited to dates. The following illustrates what the implementation could look like where q would be the search query.

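// Example search query: restrict the search to a single person document.
const q = cts.documentQuery([
  'https://lux.collections.yale.edu/data/person/098de228-41f2-404e-83e2-e48b2bd632f8'
]);
// Field indexes to draw values from.
const fields = ['agentBornStartDateStr', 'agentDiedStartDateStr'];
// Return a single value, skip scoring, and evaluate lazily.
const baseOptions = ['limit=1', 'score-zero', 'lazy'];
const results = {
  // First value in ascending order, i.e. the earliest date.
  earliest: cts.fieldValues(fields, null, baseOptions.concat(['ascending']), q),
  // First value in descending order, i.e. the latest date.
  latest: cts.fieldValues(fields, null, baseOptions.concat(['descending']), q),
};
results;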

If we didn't want the endpoint consumer to control this behavior, we could extend the facet configuration. My preference is a new endpoint parameter that defaults to returning the first and last values for date facets. Here is how the remaining search terms generator already identifies date facets: const isDate = facetIndexReference.endsWith('DateStr'). If we add a parameter, it should be documented in the API usage doc.
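
The defaulting described above could look something like this. It is only a sketch under assumptions: the parameter name (rangeOnly here) and both helper functions are hypothetical, and only the endsWith('DateStr') check is taken from the remaining search terms generator.

```javascript
// Same check the remaining search terms generator uses to identify date facets.
function isDateFacet(facetIndexReference) {
  return facetIndexReference.endsWith('DateStr');
}

// Hypothetical helper: decide whether to return only the first and last values.
// rangeOnly stands in for an assumed endpoint parameter; anything other than
// true or false is treated as "not specified".
function shouldReturnRangeOnly(facetIndexReference, rangeOnly) {
  if (rangeOnly === true || rangeOnly === false) {
    // An explicit choice by the consumer wins and is not limited to dates.
    return rangeOnly;
  }
  // Default: first and last values only for date facets.
  return isDateFacet(facetIndexReference);
}
```

With this shape, a date facet returns only its first and last values unless the consumer opts out, and a non-date facet can opt in.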

I would expect the facet to be calculated faster and, for some searches, the response to be much smaller.

cc: @clarkepeterf