I implemented an initial version of the sort feature that used to exist in the database.
The sort query fails with this error:
RequestError(400, 'search_phase_execution_exception', 'Text fields are not optimised for operations that require per-document field data like aggregations and sorting, so these operations are disabled by default. Please use a keyword field instead. Alternatively, set fielddata=true on [resource.title] in order to load field data by uninverting the inverted index. Note that this can use significant memory.')
Probably the most efficient way to solve this is to also store a length-limited "sort" version of each field we want to sort on. This field would be of keyword type rather than text, since sorting on a text field requires enabling fielddata, which (per the error above) can use significant memory.
My initial proposal is that we only support sorting on the 6 or so columns that the search results actually display as sortable column headings. (Right now the backend code for the DB allows sorting on a large number of fields, around 15-20, which I think is overkill.)
[x] Title
[x] Creator
[x] Identifier
[x] Object Publisher
[x] Publication date (already works)
[x] Object type
This would mean adding special length-limited fields for these. How much precision do we really need for sorting? Maybe 10-50 characters? A limit like that would also keep the index from growing too much.
"Manage IDs" seems to present a few more columns that could be sortable. I believe these are the extra 4:
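As a rough sketch of the length-limited sort field idea, assuming we build the sort value ourselves at index time: the helper below normalizes and truncates a value before it is stored in a keyword field. The names `make_sort_key`, `SORT_KEY_MAX_LEN`, and the `*_sort` field names are hypothetical, and 50 is just the upper end of the range floated above. Truncating in our own code is a deliberate choice here, because the keyword `ignore_above` mapping parameter skips indexing over-long values rather than shortening them.

```python
import unicodedata

# Assumption: 50 characters, the upper end of the 10-50 range discussed above.
SORT_KEY_MAX_LEN = 50


def make_sort_key(value, max_len=SORT_KEY_MAX_LEN):
    """Return a case-insensitive, whitespace-collapsed, truncated sort key."""
    if value is None:
        return ""
    # Normalize Unicode so visually identical strings sort together.
    normalized = unicodedata.normalize("NFKC", str(value))
    # Collapse runs of whitespace and strip leading/trailing whitespace.
    collapsed = " ".join(normalized.split())
    # Casefold for case-insensitive ordering, then cut to the length limit.
    return collapsed.casefold()[:max_len]


# Hypothetical mapping fragment: one keyword field per sortable text column.
# The field names are illustrative, not taken from the existing index.
SORT_FIELD_MAPPING = {
    "properties": {
        "title_sort": {"type": "keyword"},
        "creator_sort": {"type": "keyword"},
        "publisher_sort": {"type": "keyword"},
    }
}
```

With this approach the original text fields stay untouched for full-text search, and only the short keyword copies are used for ordering.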
[x] ID owner
[x] Created Date
[x] ID date last modified (what is this field? I'm not sure we have it in the OpenSearch data)
[x] ID Status
(I also believe the date-style fields do not need additional indexing.)
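To illustrate restricting sorting to the short list of columns above, here is a minimal sketch of an allow-list that maps UI column names to index fields and builds the `sort` part of a search body. Every key and field name in `SORTABLE_COLUMNS` is a placeholder I made up for illustration (the real index field names may differ), and `build_sort_clause` is a hypothetical helper, not existing code. The date-style columns point straight at their date fields, on the assumption that they need no extra keyword copy.

```python
# Hypothetical allow-list: UI column name -> index field used for sorting.
# Text columns point at keyword "sort" copies; date columns are used directly.
SORTABLE_COLUMNS = {
    "title": "title_sort",
    "creator": "creator_sort",
    "identifier": "identifier_sort",
    "publisher": "publisher_sort",
    "publication_date": "publication_date",
    "object_type": "object_type_sort",
    "owner": "owner_sort",
    "created": "created",
    "status": "status_sort",
}


def build_sort_clause(column, descending=False):
    """Return a search-body `sort` list for an allowed column.

    Raises ValueError for columns that are not in the allow-list, so
    arbitrary request parameters never reach the search backend.
    """
    field = SORTABLE_COLUMNS.get(column)
    if field is None:
        raise ValueError(f"Sorting on {column!r} is not supported")
    order = "desc" if descending else "asc"
    return [{field: {"order": order}}]


# Example: the body fragment for sorting by title, descending.
example_body = {"sort": build_sort_clause("title", descending=True)}
```

Rejecting unknown columns up front also keeps a text field from being sorted on by accident, which is what triggered the error above.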