Closed: gigamorph closed this 3 months ago
This is the issue that was discussed as going to @brent-hartwig
vote to prioritize this if at all possible, thanks!
@jffcamp, @prowns, @azaroth42, @kkdavis14, and @clarkepeterf, three implementation choices come to mind:
/json/identified_by/content[../assigned_by/motivated_by/id]
for each unique motivated_by ID. The search request would then need to provide enough information that the backend could figure out which range index to use. I'm not sure the preceding XPath is supported for indexing content, meaning a data change might be required to support this option.
I've been wondering why we're accounting for multiple archival sorting numbers when I don't see that possibility in this use case (i.e. why in no. 2 you would need to index each unique motivated_by ID). I checked the data, and for the 2.2 million recs with this pattern there is only ever one assigned_by/motivated_by per rec. Because anything in an archive is only in its one hierarchy, it can't be in multiple (e.g. Peter's example from the original ticket can't ever occur for Archival things).
In current data only Archival things are getting a sorting identifier, and they only ever have one motivating assignor at a time (the thing that's directly above them):
https://linked-art.library.yale.edu/node/d899a9c6-d814-4058-9b58-6ef7c68b536f has sorting id assigned_by Magazine advertisements for cigarettes by brand
https://linked-art.library.yale.edu/node/342d6b38-fc7d-4683-a076-dbec43ad5e73 has sorting id assigned_by Series I: Regular Size
https://linked-art.library.yale.edu/node/849284f0-9d36-4d6e-9723-0d296559202c has sorting id assigned_by William Van Duyn Tobacco Advertisement Collection
https://linked-art.library.yale.edu/node/523cebc7-5a18-4fa9-ae65-3c6ec6c72048
@azaroth42 Is this to account for some future where something else would be leveraging this pattern? Right now it's solving a need that doesn't exist.
@kkdavis14 - RS mentioned this use case in a different convo about this topic yesterday: The only time it would ever matter is if the same thing were in two different sets for the purposes of sorting by different sort identifiers; you'd need to know which one to use. Which is possible in the future ... e.g. sort within archive vs sort within personal collection vs sort within exhibition ... but for today, only archives need the explicit sort id
ok, I would just want to not delay fixing this for the existing use case while we figure out how it works for future use cases we don't have yet. I don't know if that's the situation or not.
that being said, to answer no.3 of Brent's question of "how many search results", with the theoretical situations from Sarah's comment, the answer could be infinite.
No longer blocked since we have a SOW from @brent-hartwig
The following was written under the impression that an item could be part of multiple archives, rather than multiple collections, i.e. a hierarchy of collections within a single archive.
Two approaches have been functionally proven out.
Opening notes:
Via XPath, custom JavaScript is able to retrieve the archive-specific value to sort by. This approach requires the documents to be pulled from disk; however, LUX's CTS search implementation already does so. As such, the additional time may be limited to a) executing the XPath in each search result, b) sorting those values, and c) selecting the subset of results per pagination parameters.
This approach is not expected to scale as well as the Optic approach; however, the archive with the most items directly associated to it has about 6,000 items. While the system was otherwise quiet, a search for all items in this archive plus sorting by this archive's sort values took just under 1.5 seconds.
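The three steps listed above (extract the sort value from each result, sort, then paginate) can be sketched in plain JavaScript. This is only an illustration under assumptions, not the code that was tested: the function names `sortValueFor` and `sortAndPaginate`, the sample documents, and the example set ID are hypothetical, and the document shape is inferred from the XPath `/json/identified_by/content[../assigned_by/motivated_by/id]`. It also assumes sort values compare correctly with `<` (for example, zero-padded strings).

```javascript
// Assumption: properties may hold a single object or an array; normalize to an array.
const asArray = (v) => (v === undefined || v === null ? [] : Array.isArray(v) ? v : [v]);

// Hypothetical helper: return the identifier content whose assignment was
// motivated by the given set, or null when the item has no such identifier.
function sortValueFor(doc, setId) {
  for (const identifier of asArray(doc.json && doc.json.identified_by)) {
    const motivatedByIds = asArray(identifier.assigned_by)
      .flatMap((assignment) => asArray(assignment.motivated_by))
      .map((set) => set.id);
    if (motivatedByIds.includes(setId)) return identifier.content;
  }
  return null;
}

function sortAndPaginate(docs, setId, page, pageLength) {
  // (a) compute the set-specific sort value for every search result
  const keyed = docs.map((doc) => ({ doc, key: sortValueFor(doc, setId) }));
  // (b) sort by that value; results without a value go last
  keyed.sort((x, y) => {
    if (x.key === null) return y.key === null ? 0 : 1;
    if (y.key === null) return -1;
    return x.key < y.key ? -1 : x.key > y.key ? 1 : 0;
  });
  // (c) keep only the requested page (pages are 1-based here)
  const start = (page - 1) * pageLength;
  return keyed.slice(start, start + pageLength).map((entry) => entry.doc);
}

// Made-up sample data; the set ID is a placeholder, not a real LUX identifier.
const archive = "https://example.org/set/archive-1";
const docs = [
  { json: { id: "item-b", identified_by: [{ content: "0002", assigned_by: [{ motivated_by: [{ id: archive }] }] }] } },
  { json: { id: "item-c", identified_by: [{ content: "Untitled" }] } },
  { json: { id: "item-a", identified_by: [{ content: "0001", assigned_by: [{ motivated_by: [{ id: archive }] }] }] } },
];
const firstPage = sortAndPaginate(docs, archive, 1, 2).map((d) => d.json.id);
console.log(firstPage); // ["item-a", "item-b"]
```

Because every result has to be visited before the first page can be returned, the cost grows with the size of the result set, which is consistent with the scaling concern noted above.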
LUX implements search via CTS; however, the generated CTS query can be given to Optic's op.fromSearch. Combined with one additional triple per item and archive pair, the results can be sorted by the search result's sort value for a specified archive.
The code requires triples whereby the subject is the search result IRI/URI, the predicate is the archive ID, and the object is the value to sort by. The predicate may justifiably be questioned, potentially resulting in a different triple pattern(s).
This approach was only functionally tested. If this approach is pursued, it will need to be tested at scale after the triples are added to the full dataset.
Optic search results are not filtered, which can lead to false positives when the search criteria cannot be resolved via indexes alone. An example is punctuation. While LUX's CTS search implementation presently filters search results, its unfiltered contexts (estimates and facets) have the same limitations.
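To make the triple pattern from the Optic approach concrete (subject = search result IRI, predicate = archive ID, object = sort value), here is a small plain-JavaScript stand-in for the join and sort. It deliberately does not call the Optic API; `sortByArchive`, the IRIs, and the archive IDs are hypothetical, and an inner join (dropping results with no triple for the archive) is an assumption about the intended behavior.

```javascript
// Plain-JavaScript illustration only; not Optic and not the tested code.
function sortByArchive(resultIris, triples, archiveId) {
  // Index the sort value of each subject for the requested archive (predicate).
  const sortValueBySubject = new Map();
  for (const { subject, predicate, object } of triples) {
    if (predicate === archiveId) sortValueBySubject.set(subject, object);
  }
  return resultIris
    .filter((iri) => sortValueBySubject.has(iri)) // inner join: results lacking a triple are dropped
    .sort((a, b) => {
      const x = sortValueBySubject.get(a);
      const y = sortValueBySubject.get(b);
      return x < y ? -1 : x > y ? 1 : 0;
    });
}

// Made-up data: the same items carry different sort values per archive.
const resultIris = ["iri:1", "iri:2", "iri:3"];
const triples = [
  { subject: "iri:1", predicate: "archive:A", object: "0002" },
  { subject: "iri:2", predicate: "archive:A", object: "0001" },
  { subject: "iri:1", predicate: "archive:B", object: "0001" },
  { subject: "iri:2", predicate: "archive:B", object: "0003" },
  { subject: "iri:3", predicate: "archive:B", object: "0002" },
];
const byA = sortByArchive(resultIris, triples, "archive:A");
const byB = sortByArchive(resultIris, triples, "archive:B");
console.log(byA); // ["iri:2", "iri:1"]
console.log(byB); // ["iri:1", "iri:3", "iri:2"]
```

Using the archive ID as the predicate keeps the lookup to a single pattern per archive; if the predicate choice changes, as the comment above allows, only the indexing step in this sketch would need to change.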
Research complete. Team decided #90 is the next step for this.
Problem Description: Currently we cannot sort by the sort_id that matches a specific set of IDs.
Per this Teams thread, an object can have a sort key for each set it's in. Figure out how to sort by [SORT_ID which matches specific set ID]
Expected Behavior/Solution: To sort by the sort_id that matches a specific set of IDs
Requirements:
Needed for promotion: If an item on the list is not needed, it should be crossed off but not removed.
- [ ] Wireframe/Mockup - MikeUAT/LUX Examples:
Dependencies/Blocks:
Related Github Issues: resources:
Related links:
Wireframe/Mockup: Place wireframe/mockup for the proposed solution at end of ticket.