LD4P / qa_server

A rails engine with questioning authority gem installed to serve as an authority search server with normalized results.
Apache License 2.0

Failing Connection Test: LIGATUS_NEW_LD4L_CACHE q=binding #403

Closed elrayle closed 3 years ago

elrayle commented 3 years ago

Browser Access for Test:

https://lookup-int.ld4l.org/authorities/search/linked_data/ligatus_new_ld4l_cache?q=binding&maxRecords=4&context=false

{
  "errors": "Internal Server Error - Search query binding unsuccessful for authority LIGATUS_NEW_LD4L_CACHE"
}

Monitor Status results in UI:

[image: Monitor Status results]

CURL access directly to wintermute

Selection of n-triples...

$ curl -L -D - -H 'Accept: application/n-triples' 'http://wintermute.slis.uiowa.edu:8081/ld4l_services/ligatus_batch.jsp?query=binding&maxRecords=4&startRecord=1&lang=en'
HTTP/1.1 200 
Set-Cookie: JSESSIONID=E347B389E42E239F166F9339CCE49BD9;path=/ld4l_services;HttpOnly
Content-Type: application/n-triples;charset=UTF-8
Transfer-Encoding: chunked
Date: Thu, 10 Dec 2020 18:01:38 GMT

<http://w3id.org/lob/concept/1509> <http://vivoweb.org/ontology/core#rank> "1" .
<http://w3id.org/lob/concept/1509> <http://www.w3.org/2004/02/skos/core#inScheme> <http://w3id.org/lob/> .
<http://w3id.org/lob/concept/1509> <http://www.w3.org/2004/02/skos/core#broader> <http://w3id.org/lob/concept/2279> .
<http://w3id.org/lob/concept/1509> <http://www.w3.org/2004/02/skos/core#scopeNote> "Small books bound so as to be conveniently carried in a person&#039;s pocket, usually by means of foredge flaps to protect the bookblock. They vary greatly in the extent to which they were decorated but some binders from the 18th century onwards specialised in this type of work, and described themselves as &#039;pocket book binders&#039;, or even &#039;fancy pocket book binders&#039;.@en"@en .
<http://w3id.org/lob/concept/1509> <http://www.w3.org/2004/02/skos/core#prefLabel> "pocket-book bindings@en"@en .
<http://w3id.org/lob/concept/1509> <http://www.w3.org/1999/02/22-rdf-syntax-ns#type> <http://www.w3.org/2004/02/skos/core#Concept> .
<http://w3id.org/lob/concept/2279> <http://www.w3.org/2004/02/skos/core#prefLabel> "bokbind@nb"@nb .

Running all the n-triples through a validator shows 0 errors.

elrayle commented 3 years ago

ACTION: @elrayle will look to see where in the process on the QA side this fails (e.g. loading of the graph vs. creation of the JSON for output).

elrayle commented 3 years ago

The data returned from services.ld4l.org is different from that returned by wintermute.

Results requested with...

curl -L -D - -H 'Accept: application/n-triples' 'http://services.ld4l.org/ld4l_services/ligatus_batch.jsp?query=case%20binding&maxRecords=10&startRecord=1&lang=en'
curl -L -D - -H 'Accept: application/n-triples' 'http://wintermute.slis.uiowa.edu:8081/ld4l_services/ligatus_batch.jsp?query=case%20binding&maxRecords=10&startRecord=1&lang=en'

Examples...

Language tagged literals

services

<http://w3id.org/lob/concept/1195> <http://www.w3.org/2004/02/skos/core#prefLabel> "adhesive-case bindings"@en .

wintermute

<http://w3id.org/lob/concept/1195> <http://www.w3.org/2004/02/skos/core#prefLabel> "adhesive-case bindings@en"@en .

NOTE: In the wintermute version, the language tag @en appears twice: once inside the string and again outside it.
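
A quick way to spot this pattern across a whole response is sketched below. This is only an illustration, not code from qa_server or the indexer; the function name `has_doubled_language_tag` is made up for this example, and the check is a heuristic that would also flag a literal whose text legitimately ends in `@en`.

```python
import re

# Heuristic: a language tag that appears at the end of the quoted text and
# again right after the closing quote, e.g. "label@en"@en
DOUBLED_TAG = re.compile(
    r'@([A-Za-z]+(?:-[A-Za-z0-9]+)*)"@\1(?![A-Za-z0-9-])',
    re.IGNORECASE,
)


def has_doubled_language_tag(line):
    """Return True if the N-Triples line carries the same tag inside and outside the literal."""
    return DOUBLED_TAG.search(line) is not None


services = '<http://w3id.org/lob/concept/1195> <http://www.w3.org/2004/02/skos/core#prefLabel> "adhesive-case bindings"@en .'
wintermute = '<http://w3id.org/lob/concept/1195> <http://www.w3.org/2004/02/skos/core#prefLabel> "adhesive-case bindings@en"@en .'

print(has_doubled_language_tag(services))
print(has_doubled_language_tag(wintermute))
```

Run against the two lines above, only the wintermute line is flagged.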

Fewer triples returned

Example predicates for <http://w3id.org/lob/concept/1195>

| predicate | services | wintermute | comments |
| --- | --- | --- | --- |
| prefLabel | 1 | 2 | 2nd prefLabel in wintermute results is a duplicate |
| altLabel | 4 | 2 | extra 2 altLabels in service are duplicates |
| narrower | 2 | 1 | 2nd narrower in service is a duplicate |
| broader | 1 | 1 | |
| inScheme | 1 | 1 | |
| scopeNote | 1 | 1 | |
| type | 1 | 1 | |
| rank | 1 | 1 | |
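
Counts like these can be produced with a few lines of Python. This is a rough sketch, not the method actually used for the table: it splits each line on whitespace instead of parsing N-Triples properly, the helper names are invented, and the sample input is illustrative (the duplicated prefLabel line is repeated by hand to show the effect).

```python
from collections import Counter


def predicate_counts(ntriples, subject):
    """Count how many triples each predicate has for one subject."""
    counts = Counter()
    for line in ntriples.splitlines():
        parts = line.split(None, 2)  # subject, predicate, rest of the line
        if len(parts) == 3 and parts[0] == subject:
            counts[parts[1]] += 1
    return counts


def duplicate_triples(ntriples):
    """Return triples that appear more than once, compared as whole lines."""
    seen = Counter(line.strip() for line in ntriples.splitlines() if line.strip())
    return [triple for triple, n in seen.items() if n > 1]


# Illustrative input only, not a real response from either server.
sample = """\
<http://w3id.org/lob/concept/1195> <http://www.w3.org/2004/02/skos/core#prefLabel> "adhesive-case bindings"@en .
<http://w3id.org/lob/concept/1195> <http://www.w3.org/2004/02/skos/core#prefLabel> "adhesive-case bindings"@en .
<http://w3id.org/lob/concept/1195> <http://www.w3.org/2004/02/skos/core#inScheme> <http://w3id.org/lob/> .
"""

for predicate, n in predicate_counts(sample, "<http://w3id.org/lob/concept/1195>").items():
    print(n, predicate)
print(duplicate_triples(sample))
```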

Different set of results

Results are identified in the table as id (rank)

| services | wintermute |
| --- | --- |
| 1195 (3) | 1195 (3) |
| 1413 (7) | 1413 (2) |
| 1414 (8) | |
| 1415 (9) | |
| | 1665 (6) * |
| 3035 (6) | |
| 3061 (10) | 3061 (5) |
| | 3300 (4) |
| 3796 (1) | |
| 4103 (5) | |
| 4165 (4) | 4165 (1) |
| 4549 (2) | |

elrayle commented 3 years ago

I confirmed that the problem is an actual (unescaped) line break inside a string literal in the data. The escaped sequence \n is not a problem.

PASSES:

<http://w3id.org/lob/concept/1665> <http://www.w3.org/2004/02/skos/core#altLabel> "label with \n in the middle"@en .

FAILS: with error Expected object (found: "\"label with")

<http://w3id.org/lob/concept/1665> <http://www.w3.org/2004/02/skos/core#altLabel> "label with
\n and an actual new line"@en .
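
For anyone who wants to scan a response for this condition, here is a minimal sketch. It assumes that each triple should sit on one physical line, so any line with an odd number of unescaped double quotes is suspect. The function name is made up for this example and this is not how QA or the parser detects the error.

```python
import re


def lines_with_unbalanced_quotes(ntriples):
    """Return 1-based numbers of physical lines with an odd number of unescaped quotes."""
    suspect = []
    for number, line in enumerate(ntriples.split("\n"), start=1):
        # Drop escape pairs such as \n, \" and \\ so only real quote characters remain.
        without_escapes = re.sub(r"\\.", "", line)
        if without_escapes.count('"') % 2 == 1:
            suspect.append(number)
    return suspect


PASSES = r'<http://w3id.org/lob/concept/1665> <http://www.w3.org/2004/02/skos/core#altLabel> "label with \n in the middle"@en .'
FAILS = (
    r'<http://w3id.org/lob/concept/1665> <http://www.w3.org/2004/02/skos/core#altLabel> "label with'
    "\n"  # the actual line break inside the literal
    r'\n and an actual new line"@en .'
)

print(lines_with_unbalanced_quotes(PASSES))
print(lines_with_unbalanced_quotes(FAILS))
```

The passing triple yields no suspect lines. The failing one flags both physical lines, since the literal opens on the first and closes on the second.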
elrayle commented 3 years ago

ACTION: @eichmann will run indexer in debug mode to see if line breaks are escaped. If needed, escaping of line breaks will be re-established.

This triple store is small and easy to repair, but it seems like this would be a problem for any authority that has significant quantities of text (e.g. source notes).
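
For reference, the escaping in question is small. The sketch below is not the indexer's code (its implementation is not shown in this thread); it only illustrates the four characters that the N-Triples grammar does not allow unescaped inside a double-quoted literal: backslash, double quote, line feed, and carriage return. The function name is hypothetical.

```python
def escape_ntriples_literal(text):
    """Escape the characters that may not appear raw inside an N-Triples string literal."""
    replacements = [
        ("\\", "\\\\"),  # backslash first, so later escapes are not doubled
        ('"', '\\"'),
        ("\n", "\\n"),
        ("\r", "\\r"),
    ]
    for raw, escaped in replacements:
        text = text.replace(raw, escaped)
    return text


label = "label with\nan actual new line"
print('"' + escape_ntriples_literal(label) + '"@en .')
```

With this applied before serialization, a note containing a line break stays on one physical line in the output.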

elrayle commented 3 years ago

ACTION: @sfolsom will look for similar examples in LOC data.

sfolsom commented 3 years ago

After scanning 1,000 source notes in LCNAF, none of them seem to contain line breaks: http://services.ld4l.org/fuseki/dataset.html?tab=query&ds=/loc_names#query=PREFIX+madsrdf%3A+%3Chttp%3A%2F%2Fwww.loc.gov%2Fmads%2Frdf%2Fv1%23%3E%0ASelect+%3Fnote%0A%0AWhere+%7B%3Fs+madsrdf%3AhasSource+%3Fp+.%0A++%3Fp+madsrdf%3Acitation-note+%3Fnote+.%0A%0A%0A%7D%0ALimit+1000%0A

Same for the note-like citation source in LCNAF: http://services.ld4l.org/fuseki/dataset.html?tab=query&ds=/loc_names#query=PREFIX+madsrdf%3A+%3Chttp%3A%2F%2Fwww.loc.gov%2Fmads%2Frdf%2Fv1%23%3E%0ASelect+%3Fnote%0A%0AWhere+%7B%3Fs+madsrdf%3AhasSource+%3Fp+.%0A++%3Fp+madsrdf%3Acitation-source+%3Fnote+.%0A%0A%0A%7D%0ALimit+1000%0A.

I'll keep looking.

eichmann commented 3 years ago

Patch in place.

elrayle commented 3 years ago

Confirmed