project-lux / lux-marklogic

Code, issues, and resources related to LUX MarkLogic

Avoid IRI lexicon join when Hop Inverse is directly within Hop with Field #304

Open brent-hartwig opened 2 months ago

brent-hartwig commented 2 months ago

Problem Description: On 27 Aug 2024, a particular search led to a v8 engine crash. The incident is being tracked under MarkLogic Support ticket no. 37229 and https://git.yale.edu/lux-its/ml-cluster-formation/issues/52. This ticket tracks an optimization found while investigating the crash. The optimization may also reduce the load on the v8 engine.

Original search criteria:

{
  "_scope":"event",
  "used":{
    "containingItem":{
      "memberOf":{
        "name":"peabody"
      }
    }
  }
}

This ticket focuses on the used.containingItem portion of the query. The search pattern of used is Hop with Field. The search pattern of containingItem is Hop Inverse. Hop with Field needs IRIs and uses the IRI lexicon to get them. Yet the Hop Inverse pattern already produces IRIs, so Hop with Field's IRI lexicon join is wasted work when the input to Hop with Field is the output of Hop Inverse.
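To make the redundancy concrete, here is a minimal conceptual sketch in plain JavaScript. It is not the LUX implementation and does not call any MarkLogic API: the in-memory arrays and maps stand in for triples, the IRI lexicon, and the field searched by Hop with Field, and every name and data value is hypothetical. The triple direction is arbitrary in this model. It assumes the pre-optimized path resolves the child's matches back to IRIs through the lexicon, while the post-optimized path passes Hop Inverse's IRIs straight through.

```javascript
// Conceptual model only: in-memory data stands in for MarkLogic indexes.
// All names and data are hypothetical and not taken from the LUX code base.

// Stand-in for triples: [subject IRI, predicate, object IRI].
const triples = [
  ["iri:item1", "contains", "iri:item2"],
  ["iri:item1", "contains", "iri:item3"],
  ["iri:item4", "contains", "iri:item5"],
];

// Stand-in for the IRI lexicon: document URI -> IRI of that document.
const iriLexicon = new Map([
  ["/item2.json", "iri:item2"],
  ["/item3.json", "iri:item3"],
  ["/item5.json", "iri:item5"],
]);

// Stand-in for the field that Hop with Field searches: event URI -> IRIs in the field.
const usedField = new Map([
  ["/event1.json", ["iri:item2"]],
  ["/event2.json", ["iri:item5"]],
  ["/event3.json", ["iri:item3", "iri:item5"]],
]);

// Hop Inverse: follow triples from the matched IRIs. The output is already a list of IRIs.
function hopInverse(matchedIris) {
  const wanted = new Set(matchedIris);
  return triples.filter(([s]) => wanted.has(s)).map(([, , o]) => o);
}

// Hop with Field: find documents whose field holds any of the given IRIs.
function hopWithField(iris) {
  const wanted = new Set(iris);
  return [...usedField]
    .filter(([, values]) => values.some((v) => wanted.has(v)))
    .map(([uri]) => uri)
    .sort();
}

// Pre-optimized (assumed shape): treat the child's output as matching documents,
// then join against the IRI lexicon to recover the IRIs it already had.
function preOptimized(matchedIris) {
  const childIris = new Set(hopInverse(matchedIris));
  const docUris = [...iriLexicon]
    .filter(([, iri]) => childIris.has(iri))
    .map(([uri]) => uri);
  const irisFromLexicon = docUris.map((uri) => iriLexicon.get(uri)); // redundant join
  return hopWithField(irisFromLexicon);
}

// Post-optimized: pass Hop Inverse's IRIs straight to Hop with Field.
function postOptimized(matchedIris) {
  return hopWithField(hopInverse(matchedIris));
}

const before = preOptimized(["iri:item1"]);
const after = postOptimized(["iri:item1"]);
console.log(before, after);
```

In this toy model both paths return the same documents, and the only difference is the extra round trip through the lexicon. Whether the real saving is significant depends on the size of the lexicon join in MarkLogic, which this sketch does not measure.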

Here's an excerpt of the above search criteria's query, broken down into bits where the left side is the pre-optimized version and the right is the post-optimized version:

[image: side-by-side comparison of the pre-optimized (left) and post-optimized (right) query excerpts]

Both versions return the same results, whether filtered or not.

This optimization was the second idea. The first idea was to add a type constraint to the cts.triples call. I thought it might help because la('member_of') can be defined in both Items and Sets. It did not improve performance. It is unknown whether it would help with the v8 crash.

All versions of the full query: query-versions-for-ticket-304.zip

Expected Behavior/Solution: Implement the optimization described above. Searches should return the same results, but faster. It is unknown whether the gain scales linearly, but the optimization should help every search that has a Hop Inverse term as a direct child of a Hop with Field term.
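As a rough illustration of which searches qualify, the sketch below walks search criteria and reports every place where a Hop Inverse term is a direct child of a Hop with Field term. It is plain JavaScript and not LUX code: the PATTERNS map and the findOptimizableTerms function are hypothetical, and only the used and containingItem entries come from this ticket. Terms missing from the map are treated as unknown.

```javascript
// Hypothetical term-to-pattern map. Only "used" and "containingItem" are taken
// from this ticket; any term not listed is treated as unknown.
const PATTERNS = { used: "hopWithField", containingItem: "hopInverse" };

// Walk the criteria and collect dotted paths where a Hop Inverse term sits
// directly inside a Hop with Field term.
function findOptimizableTerms(criteria, path = []) {
  const found = [];
  for (const [term, child] of Object.entries(criteria)) {
    // Skip leaf values such as "_scope" or "name".
    if (child === null || typeof child !== "object") continue;
    if (PATTERNS[term] === "hopWithField") {
      for (const childTerm of Object.keys(child)) {
        if (PATTERNS[childTerm] === "hopInverse") {
          found.push([...path, term, childTerm].join("."));
        }
      }
    }
    // Keep descending so nested occurrences are reported as well.
    found.push(...findOptimizableTerms(child, [...path, term]));
  }
  return found;
}

// The original search criteria from this ticket.
const criteria = {
  _scope: "event",
  used: { containingItem: { memberOf: { name: "peabody" } } },
};
const optimizable = findOptimizableTerms(criteria);
console.log(optimizable);
```

For the criteria in this ticket the only reported path is used.containingItem. A real implementation would presumably read the patterns from the search term configuration rather than from a hard-coded map.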

Scope includes determining if the optimization reduces the ability to reproduce the v8 engine crash, and reporting back to https://git.yale.edu/lux-its/ml-cluster-formation/issues/52.

Requirements: Nothing additional to add here.

Needed for promotion: If an item on the list is not needed, it should be crossed off but not removed.

~- [ ] Wireframe/Mockup - Mike~

UAT/LUX Examples:

Dependencies/Blocks:

This issue neither blocks nor is blocked by another ticket.

Related Github Issues:

Related links:

Wireframe/Mockup:

N/A

brent-hartwig commented 2 months ago

@roamye, I believe this ticket is ready for prioritization. I associated it with the 9 Sep milestone and provided a default assignee.

roamye commented 2 months ago

@brent-hartwig - thanks Brent! I moved it to the 09-23 milestone as the next UAT 09/12 will happen after the 09-09 milestone is deployed.

roamye commented 2 weeks ago

from team meeting 10/25:

from team meeting 11/01: input from @brent-hartwig and @clarkepeterf is needed.