Open roamye opened 7 months ago
[!NOTE] The optimization identified herein was invalidated for this query (and YCBA's IRI was accidentally used twice). For more info, see this comment.
@roamye and @clarkepeterf,
Two responses:
https://lux.collections.yale.edu/data/
{
"AND":[
{
"produced":{
"memberOf":{
"curatedBy":{
"memberOf":{
"id":"group/0e8bc04a-6538-4792-b82a-9e0751857e7d"
}
}
}
}
},
{
"produced":{
"memberOf":{
"curatedBy":{
"memberOf":{
"id":"group/6086b58d-941d-41e3-87c1-e00e96952ffb"
}
}
}
}
}
]
}
Here's my first rewrite of the query. Near-instant response of zero results:
{
"produced":{
"memberOf":{
"curatedBy":{
"memberOf":{
"AND":[
{
"id":"https://lux.collections.yale.edu/data/group/0e8bc04a-6538-4792-b82a-9e0751857e7d"
},
{
"id":"https://lux.collections.yale.edu/data/group/6086b58d-941d-41e3-87c1-e00e96952ffb"
}
]
}
}
}
}
}
However, when I take the AND up a level, I quickly get the first 10 of an estimated 5,217 results:
{
"produced":{
"memberOf":{
"curatedBy":{
"AND":[
{
"memberOf":{
"id":"https://lux.collections.yale.edu/data/group/0e8bc04a-6538-4792-b82a-9e0751857e7d"
}
},
{
"memberOf":{
"id":"https://lux.collections.yale.edu/data/group/0e8bc04a-6538-4792-b82a-9e0751857e7d"
}
}
]
}
}
}
}
So maybe the search makes sense after all and we identified another optimization: push AND and OR down to the extent the triple paths / search terms match.
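For reference, the mechanical rewrite described above can be sketched in a few lines of Python. This is only an illustration of the transformation on the JSON search grammar shown in this thread; `push_down` is a hypothetical helper, not part of the LUX backend. A later comment in this thread concludes that the rewrite changes the meaning of this particular query, so treat it as a sketch of the idea rather than a safe optimization.

```python
def push_down(query):
    """Factor the path shared by every child of a top-level AND/OR group.

    {"AND": [{"a": {"b": X}}, {"a": {"b": Y}}]} -> {"a": {"b": {"AND": [X, Y]}}}
    """
    op = next((k for k in ("AND", "OR") if set(query) == {k}), None)
    if op is None:
        return query  # not a single AND/OR group; nothing to do

    children = query[op]
    path = []
    # Descend while every child is a one-key dict and all children share that key.
    while children and all(isinstance(c, dict) and len(c) == 1 for c in children):
        keys = {next(iter(c)) for c in children}
        if len(keys) != 1:
            break
        key = keys.pop()
        values = [c[key] for c in children]
        if not all(isinstance(v, dict) for v in values):
            break  # reached leaf terms such as "id"
        path.append(key)
        children = values

    # Re-wrap the operator group in the shared path, innermost key first.
    result = {op: children}
    for key in reversed(path):
        result = {key: result}
    return result


if __name__ == "__main__":
    import json

    original = {
        "AND": [
            {"produced": {"memberOf": {"curatedBy": {"memberOf": {"id": "A"}}}}},
            {"produced": {"memberOf": {"curatedBy": {"memberOf": {"id": "B"}}}}},
        ]
    }
    print(json.dumps(push_down(original), indent=2))
```

Applied to the reported query shape, this pushes the AND all the way down to the two `id` terms, which matches the first rewrite above rather than the one-level-up variant.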
@brent-hartwig
cc: @prowns
It probably was and the user just didn't include the full URI/IRI. It was probably one of us! If this was commonplace, the UI could provide a visual cue when the provided ID doesn't resolve and/or the backend could replace the associated CTS query containing an invalid ID with cts.falseQuery, thereby avoiding searching the triples store for something that will never exist.
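As a rough illustration of the visual-cue idea, a client-side check could flag `id` terms that are not full IRIs before the search is submitted. The sketch below is an assumption-laden example: `find_unqualified_ids` is a hypothetical helper, and it only tests for the `https://lux.collections.yale.edu/data/` prefix seen in the IRIs in this thread. It does not confirm that an ID actually resolves, which would still need a backend lookup.

```python
DATA_PREFIX = "https://lux.collections.yale.edu/data/"  # prefix taken from the IRIs in this thread


def find_unqualified_ids(query, path=()):
    """Return (location, value) pairs for every "id" term lacking the data prefix."""
    found = []
    if isinstance(query, dict):
        for key, value in query.items():
            if key == "id" and isinstance(value, str):
                if not value.startswith(DATA_PREFIX):
                    found.append((".".join(path + (key,)), value))
            else:
                found.extend(find_unqualified_ids(value, path + (key,)))
    elif isinstance(query, list):
        # Record list positions so the caller can point at the offending term.
        for i, item in enumerate(query):
            found.extend(find_unqualified_ids(item, path + (f"[{i}]",)))
    return found


if __name__ == "__main__":
    reported = {
        "AND": [
            {"produced": {"memberOf": {"curatedBy": {"memberOf": {
                "id": "group/0e8bc04a-6538-4792-b82a-9e0751857e7d"}}}}},
        ]
    }
    for location, value in find_unqualified_ids(reported):
        print(f"{location}: {value!r} is not a full IRI")
```

A backend variant of the same walk could swap the flagged term for a query that matches nothing, as suggested above, instead of reporting it.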
@roamye, do we want to close this under the suspicion it was user error?
Likely, but first I am validating my search result with YCBA: https://lux-front-tst.collections.yale.edu/view/results/people?q=%7B%22AND%22%3A%5B%7B%22produced%22%3A%7B%22memberOf%22%3A%7B%22curatedBy%22%3A%7B%22memberOf%22%3A%7B%22id%22%3A%22https%3A%2F%2Flux.collections.yale.edu%2Fdata%2Fgroup%2F0e8bc04a-6538-4792-b82a-9e0751857e7d%22%7D%7D%7D%7D%7D%5D%7D
Closing - works like a charm now. Here it is with bibliographic and rare book materials as well.
@prowns, I just tried your link from 8 Aug and the query timed out. It is different than the one reported in the description.
Agents that produced objects found in sets curated by YCBA or YUAG. IRIs updated to be compatible with the 2024-09-04 database.
This is the query that may benefit from the optimization idea identified on 7 May; although, I see I used YCBA's ID twice back then.
{
"AND":[
{
"produced":{
"memberOf":{
"curatedBy":{
"memberOf":{
"id":"https://lux.collections.yale.edu/data/group/0e8bc04a-6538-4792-b82a-9e0751857e7d"
}
}
}
}
},
{
"produced":{
"memberOf":{
"curatedBy":{
"memberOf":{
"id":"https://lux.collections.yale.edu/data/group/41310ca5-8137-45fe-ac2c-a6a04e2235f1"
}
}
}
}
}
]
}
Agents that either created works about objects or produced objects that are curated by YCBA.
This is the query that resulted in a v8 engine crash, which is described more in the following comment.
While memberOf.curatedBy.memberOf.id repeats, I didn't witness an optimization by only executing that code once and plugging it into both spots: ticket-131-8-Aug-example-optimization-attempt.js.txt
{
"OR":[
{
"created":{
"carriedBy":{
"memberOf":{
"curatedBy":{
"memberOf":{
"id":"https://lux.collections.yale.edu/data/group/0e8bc04a-6538-4792-b82a-9e0751857e7d"
}
}
}
}
}
},
{
"produced":{
"memberOf":{
"curatedBy":{
"memberOf":{
"id":"https://lux.collections.yale.edu/data/group/0e8bc04a-6538-4792-b82a-9e0751857e7d"
}
}
}
}
}
]
}
@prowns, I'm reopening this ticket as this search also crashes the v8 engine --at least with the concurrent facet requests that are only asking for the first 20 values ...of a search that times out after 20 seconds. What follows are some log excerpts along with a zip of the logs.
The crash occurred on node 217. Opening lines:
2024-09-18 12:09:01.468 Info: Memory 66% phys=127449 size=90542(71%) rss=41052(32%) huge=43904(34%) anon=31635(24%) file=14517(11%) forest=17299(13%) cache=44755(35%) registry=1(0%) join=1738(1%)
2024-09-18 12:09:09.021 Debug: v8 fatal error - location: v8::Object::SetAlignedPointerInInternalField(), message: Internal field out of bounds
2024-09-18 12:09:09.022 Debug: v8 fatal error - location: v8::Object::SetAlignedPointerInInternalField(), message: Internal field out of bounds
2024-09-18 12:09:09.022 Debug: v8 fatal error - location: v8::Object::SetAlignedPointerInInternalField(), message: Internal field out of bounds
2024-09-18 12:09:09.023 Debug: v8 fatal error - location: v8::Object::SetAlignedPointerInInternalField(), message: Internal field out of bounds
2024-09-18 12:09:24.704 Critical: Segmentation fault in thread 0x7efbbf0f5700 addr 0xfffffffffffffff8
2024-09-18 12:09:24.704 Critical:+Thread 117 (Thread 0x7efd01dff700 (LWP 27330)):
2024-09-18 12:09:24.704 Critical:+#0 0x00007f09ab23edfb in nanosleep () from /lib64/libpthread.so.0
2024-09-18 12:09:24.704 Critical:+#1 0x0000555678386c59 in svc::Thread::sleep () at Thread.cpp:778
2024-09-18 12:09:24.704 Critical:+#2 0x0000555675a7209b in xdmp::ForestRebalancerThread::init () at Forest.cpp:519
2024-09-18 12:09:24.704 Critical:+#3 0x0000555675b9bf83 in xdmp::ForestRebalancerThread::run () at Forest.cpp:656
2024-09-18 12:09:24.704 Critical:+#4 0x0000555678387d11 in svc::Thread::top () at Thread.cpp:384
2024-09-18 12:09:24.704 Critical:+#5 0x0000555678388769 in runThread () at Thread.cpp:421
2024-09-18 12:09:24.704 Critical:+#6 0x00007f09ab23544b in start_thread () from /lib64/libpthread.so.0
2024-09-18 12:09:24.704 Critical:+#7 0x00007f09aa5fb52f in clone () from /lib64/libc.so.6
2024-09-18 12:09:24.704 Critical:+Thread 116 (Thread 0x7efbd1279700 (LWP 27305)):
2024-09-18 12:09:24.704 Critical:+#0 0x00007f09ab23da46 in do_futex_wait.constprop () from /lib64/libpthread.so.0
...
The v8 crash captured by https://git.yale.edu/lux-its/ml-cluster-formation/issues/52 (and duplicate #302) also includes "Internal field out of bounds"
Here are the failed facet requests. There were no successful ones. One of the five was also on node 217.
20240918-blue-as-tst-node-111-8003-ErrorLog-trimmed.txt:2024-09-18 12:09:08.227 Info: [Event:id=LuxFacets] Failed to calculate the following facet after 19591 milliseconds: agentRecordType (page: 1; pageLength: 20; filterResults: n/a)
20240918-blue-as-tst-node-111-8003-ErrorLog-trimmed.txt:2024-09-18 12:09:08.229 Info: [Event:id=LuxFacets] Failed to calculate the following facet after 19562 milliseconds: agentStartPlaceId (page: 1; pageLength: 20; filterResults: n/a)
20240918-blue-as-tst-node-111-8003-ErrorLog-trimmed.txt:2024-09-18 12:09:08.231 Info: [Event:id=LuxFacets] Failed to calculate the following facet after 19566 milliseconds: agentActivePlaceId (page: 1; pageLength: 20; filterResults: n/a)
20240918-blue-as-tst-node-111-8003-ErrorLog-trimmed.txt:2024-09-18 12:09:08.264 Info: [Event:id=LuxFacets] Failed to calculate the following facet after 19600 milliseconds: agentHasDigitalImage (page: 1; pageLength: 20; filterResults: n/a)
20240918-blue-as-tst-node-217-8003-ErrorLog-trimmed.txt:2024-09-18 12:09:08.164 Info: [Event:id=LuxFacets] Failed to calculate the following facet after 19503 milliseconds: agentMemberOfId (page: 1; pageLength: 20; filterResults: n/a)
Logs through ~12:20p server time: 20240918-1220-Wed.zip
Optimization debunked, at least for the UAT example found in the description. The query is for agents that have at least one object curated by YCBA and another by YUAG, not for agents that have objects co-curated by YCBA and YUAG. To correctly resolve the criteria, we need to get two lists of agents (via produced.memberOf.curatedBy.memberOf.[unit]) and find their intersection. The triple path requires three semantic hops and is being traversed twice. The CTS version takes about 3 seconds. Including as query 14 in the Optic/CTS comparison.
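The difference between the two readings can be shown on a small made-up dataset. Everything in the sketch below is hypothetical (agent, object, set and department names, and the helper functions); it only models the produced → memberOf → curatedBy → memberOf path with plain dictionaries to show why the top-level AND and the pushed-down AND return different agents.

```python
# Hypothetical toy data modelling agent -> object -> set -> department -> unit.
PRODUCED = {"agent_a": ["obj1", "obj2"], "agent_b": ["obj3"], "agent_c": ["obj4"]}
OBJECT_SETS = {"obj1": ["set_x"], "obj2": ["set_y"], "obj3": ["set_x"], "obj4": ["set_z"]}
SET_CURATORS = {"set_x": ["dept_1"], "set_y": ["dept_2"], "set_z": ["dept_1", "dept_2"]}
GROUP_UNITS = {"dept_1": ["YCBA"], "dept_2": ["YUAG"]}


def units_of_set(set_id):
    """Units reached from a set via curatedBy -> memberOf."""
    return {unit for dept in SET_CURATORS[set_id] for unit in GROUP_UNITS[dept]}


def agents_for_unit(unit):
    """Agents that produced at least one object in a set curated by the unit."""
    return {
        agent
        for agent, objects in PRODUCED.items()
        if any(unit in units_of_set(s) for o in objects for s in OBJECT_SETS[o])
    }


def top_level_and(unit_1, unit_2):
    """Original query: intersect two independently resolved agent lists."""
    return agents_for_unit(unit_1) & agents_for_unit(unit_2)


def pushed_down_and(unit_1, unit_2):
    """Rewritten query: a single set must be curated by both units."""
    return {
        agent
        for agent, objects in PRODUCED.items()
        if any({unit_1, unit_2} <= units_of_set(s) for o in objects for s in OBJECT_SETS[o])
    }


if __name__ == "__main__":
    print(sorted(top_level_and("YCBA", "YUAG")))
    print(sorted(pushed_down_and("YCBA", "YUAG")))
```

In this toy data, `agent_a` has one object in a YCBA-curated set and another in a YUAG-curated set, so it satisfies the original query but not the pushed-down one; only `agent_c`, whose object sits in a co-curated set, satisfies both.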
As for the 8 Aug example that times out after 20 seconds and contributed to a v8 engine crash, it is being included as query 15 in the Optic/CTS comparison. Need @jffcamp or @prowns to weigh in on whether there should be additional follow-up for this particular v8 engine crash scenario. The crash log messages include "Internal field out of bounds". The only other ticket I found with that message is ML 304 --> CF 52 --> Support 37229, which is an active support ticket @clarkepeterf and @xinjianguo are working.
@prowns @jffcamp - was there a discussion about the follow-up procedure for this v8 engine crash scenario?
Should it be brought to the IT team meeting?
Problem Description: In advanced search, a search query was done where one of the expected results should be "Trumbull & Wiley". Instead of the expected results the query generated the "Your search yielded no results. Please try another search." message. If we change the query to go down to the departmental level the search results work. However, the results should also work at the non-departmental level.
This ticket serves as a research ticket to understand why this is happening and how we can fix it.
Expected Behavior/Solution: TBD - research
Requirements: TBD - research
Needed for promotion: If an item on the list is not needed, it should be crossed off but not removed.
UAT/LUX Examples:
Dependencies/Blocks:
Related Github Issues:
Related links:
Wireframe/Mockup: Place wireframe/mockup for the proposed solution at end of ticket.