project-lux / lux-marklogic

Code, issues, and resources related to LUX MarkLogic

Research sorting search results by linked data (from 559) #10

Open gigamorph opened 4 months ago

gigamorph commented 4 months ago

Problem Description: This is the backend ticket to support the implementation of https://github.com/project-lux/lux-frontend/issues/32. The use case is sorting by artist name.

Expected Behavior/Solution:

TBD (research results)

Question: Is this blocked by or dependent on https://github.com/project-lux/lux-frontend/issues/80?

Requirements:

UAT/LUX Examples:


Related Github Issues:

brent-hartwig commented 2 months ago

@clarkepeterf, you and I both proved out an approach that puts the generated CTS search (with or without semantic search criteria) in Optic's op.fromSearch, then uses op.fromTriples or op.fromSPARQL to select a column containing data from related documents, and finally sorts on that column. You took this a step further and began adding support for it within the search endpoint. I believe that work was paused as priorities changed. When this becomes a priority again, would the first step be to bring over the edits from https://git.yale.edu/lux-its/marklogic/tree/feature/599-sort-by-linked-data? I suggest that testing include comparing the performance of a few unsorted and sorted searches, as the time added by the sort will count against the search's timeout setting.
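To make the shape of that approach concrete, here is a minimal in-memory sketch in plain Python. It is not MarkLogic's Optic API and not the code on the feature branch: the `Hit` and `Triple` types, the `sort_by_linked_value` helper, the predicate names, and the sample URIs are all hypothetical. It only illustrates the idea of taking search hits, joining them through triples to a value held on a related document, and sorting on that value.

```python
from typing import NamedTuple


class Hit(NamedTuple):
    uri: str      # URI of a search result document
    score: float  # relevance score assigned by the search


class Triple(NamedTuple):
    subject: str
    predicate: str
    obj: str


def sort_by_linked_value(hits, triples, link_pred, value_pred, descending=False):
    """Sort hits by a value stored on a document related to each hit."""
    links = {}   # hit URI -> URIs of related documents
    values = {}  # related document URI -> candidate sort values
    for t in triples:
        if t.predicate == link_pred:
            links.setdefault(t.subject, []).append(t.obj)
        elif t.predicate == value_pred:
            values.setdefault(t.subject, []).append(t.obj)

    def sort_value(hit):
        # Collect every value reachable from this hit; use the smallest one.
        candidates = [
            v.casefold()
            for related in links.get(hit.uri, [])
            for v in values.get(related, [])
        ]
        return min(candidates) if candidates else None

    keyed = [(sort_value(h), h) for h in hits]
    with_value = [kh for kh in keyed if kh[0] is not None]
    without_value = [h for k, h in keyed if k is None]
    # list.sort is stable, so hits with equal values keep their search order.
    with_value.sort(key=lambda kh: kh[0], reverse=descending)
    # Hits with no related value are placed last, in their original order.
    return [h for _, h in with_value] + without_value


# Hypothetical sample data: two objects linked to artists, one with no link.
hits = [Hit("/object/1", 0.9), Hit("/object/2", 0.8), Hit("/object/3", 0.7)]
triples = [
    Triple("/object/1", "producedBy", "/agent/monet"),
    Triple("/object/2", "producedBy", "/agent/cassatt"),
    Triple("/agent/monet", "name", "Monet, Claude"),
    Triple("/agent/cassatt", "name", "Cassatt, Mary"),
]
sorted_hits = sort_by_linked_value(hits, triples, "producedBy", "name")
print([h.uri for h in sorted_hits])  # ['/object/2', '/object/1', '/object/3']
```

Placing hits without a related value last is a choice made for this sketch; the thread does not say how the actual implementation treats them.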

cc: @prowns, @jffcamp, @roamye

jffcamp commented 1 month ago

@brent-hartwig - should researching this more wait until we are working with 11.2? cc: @prowns

brent-hartwig commented 1 month ago

@jffcamp, I think we're good with how this can be implemented, both now and in 11.2. I think the question is: do we want to proceed with that approach before the CTS/Optic comparison effort is complete and before an associated optimization becomes available? Answering yes would mean we could deliver a solution sooner. We do not know when the search API comparison will end, and the associated optimization is slated for 11.3; here too, the timeline is not known. cc: @prowns, @clarkepeterf

roamye commented 1 week ago

This is on deck before Feb 22, but it has no milestone?

Is there a response for Brent's question above?

@prowns @clarkepeterf @jffcamp

clarkepeterf commented 6 days ago

@brent-hartwig do you have any updates on the status of the CTS/Optic Comparison or ML 11.3?

@jffcamp do we want to wait for the CTS/Optic comparison or ML 11.3 (Optic optimization)?

We don't have to wait - we can implement this ticket based on the changes in https://git.yale.edu/lux-its/marklogic/tree/feature/599-sort-by-linked-data. But a CTS/Optic comparison or Optic optimization may lead us to further changes to our search code that would impact this ticket.

This sorting is likely to be slow sometimes. We should test how much this kind of sorting affects response times.
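One way such a comparison could be set up is sketched below in Python. The `median_seconds` and `sort_overhead` helpers are hypothetical, and the two callables stand in for whatever actually issues the unsorted and sorted requests against the search endpoint; nothing here reflects the project's real test tooling. The report expresses the sorted time as a fraction of the timeout, since the time added by sorting counts against it.

```python
import statistics
import time


def median_seconds(run_search, repeats=5):
    """Run a search callable several times and return the median wall time."""
    timings = []
    for _ in range(repeats):
        start = time.perf_counter()
        run_search()
        timings.append(time.perf_counter() - start)
    return statistics.median(timings)


def sort_overhead(run_unsorted, run_sorted, timeout_seconds, repeats=5):
    """Compare an unsorted and a sorted run of the same search."""
    unsorted_s = median_seconds(run_unsorted, repeats)
    sorted_s = median_seconds(run_sorted, repeats)
    return {
        "unsorted_s": unsorted_s,
        "sorted_s": sorted_s,
        "added_s": sorted_s - unsorted_s,                 # time attributable to the sort
        "timeout_fraction": sorted_s / timeout_seconds,   # share of the timeout consumed
    }


# Placeholder callables; real ones would call the search endpoint.
report = sort_overhead(
    run_unsorted=lambda: None,
    run_sorted=lambda: time.sleep(0.01),
    timeout_seconds=20.0,  # assumed value, not the project's actual setting
)
print(report)
```

Using the median rather than the mean keeps a single slow run (for example, a cold cache) from dominating the comparison.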

@prowns @roamye

brent-hartwig commented 6 days ago

> Brent, do you have any updates on the status of the CTS/Optic Comparison or ML 11.3?

@clarkepeterf, there has been progress on both, but I do not anticipate our team deciding whether we can switch to Optic for at least another month. Query pair batch nos. 1 and 2 are wrapping up. I anticipate updating and sharing an analysis doc within two weeks. In parallel, I am to start batch no. 3, focusing on false positives we may expect due to unfiltered results, large indexes, and sort. Engineering will not be able to look at that batch before July. I don't have an anticipated 11.3.0 GA release date but believe it will be in July; definitive plans should not be based on my speculation. As with at least one other ticket, I believe this boils down to balancing when to provide this functionality against a willingness to reimplement.

cc: @jffcamp, @prowns, @roamye

roamye commented 5 days ago

Unrelated, and maybe this was discussed before, but is this ticket blocked by or dependent on https://github.com/project-lux/lux-frontend/issues/80? This is listed as a question above with no answer.

Issue 80 is still in forming, and it is unclear whether it should be prepped before July (depending on whether 80 blocks this ticket).

@brent-hartwig @clarkepeterf @jffcamp

brent-hartwig commented 5 days ago

@roamye, this backend ticket is about figuring out a way to implement the ability to sort by data in documents that are related to the search result documents. We have figured out a relatively low-effort means that we could implement before the Optic/CTS decision. The only disruptive requirement https://github.com/project-lux/lux-frontend/issues/80 could come up with is sorting by data that is unrelated to the search result documents.