Closed elrayle closed 3 years ago
ACTION: @elrayle will look to see where in the process on the QA side this fails (e.g. loading of graph vs. creation of json for output)
The data returned from services.ld4l.org is different from that returned by wintermute.
Results requested with...
curl -L -D - -H 'Accept: application/n-triples' 'http://services.ld4l.org/ld4l_services/ligatus_batch.jsp?query=case%20binding&maxRecords=10&startRecord=1&lang=en'
curl -L -D - -H 'Accept: application/n-triples' 'http://wintermute.slis.uiowa.edu:8081/ld4l_services/ligatus_batch.jsp?query=case%20binding&maxRecords=10&startRecord=1&lang=en'
Examples...
services
<http://w3id.org/lob/concept/1195> <http://www.w3.org/2004/02/skos/core#prefLabel> "adhesive-case bindings"@en .
wintermute
<http://w3id.org/lob/concept/1195> <http://www.w3.org/2004/02/skos/core#prefLabel> "adhesive-case bindings@en"@en .
NOTE: The wintermute version has an extra @en inside the string, in addition to the language tag outside the string.
Example predicates for <http://w3id.org/lob/concept/1195>
predicate | services | wintermute | comments |
---|---|---|---|
prefLabel | 1 | 2 | 2nd prefLabel in wintermute results is a duplicate |
altLabel | 4 | 2 | extra 2 altLabels in service are duplicates |
narrower | 2 | 1 | 2nd narrower in service is a duplicate |
broader | 1 | 1 | |
inScheme | 1 | 1 | |
scopeNote | 1 | 1 | |
type | 1 | 1 | |
rank | 1 | 1 | |
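Per-predicate counts like those in the table can be reproduced from the curl output with a short script. This is a minimal sketch, not the procedure actually used for the table: `predicate_counts` and `duplicate_triples` are hypothetical helper names, and the parsing assumes one triple per line, which does not hold once a literal contains a raw line break.

```python
from collections import Counter


def predicate_counts(ntriples: str, subject: str) -> Counter:
    """Count triples per predicate (by local name) for one subject IRI.

    Assumes one triple per line; this is a line-based heuristic,
    not a full N-Triples parser.
    """
    counts = Counter()
    for line in ntriples.splitlines():
        parts = line.strip().split(None, 2)
        if len(parts) == 3 and parts[0] == f"<{subject}>":
            iri = parts[1].strip("<>")
            # Keep only the part after the last '#' or '/'.
            local_name = iri.rsplit("#", 1)[-1].rsplit("/", 1)[-1]
            counts[local_name] += 1
    return counts


def duplicate_triples(ntriples: str) -> list:
    """Return the lines that occur more than once, verbatim."""
    seen = Counter(line.strip() for line in ntriples.splitlines() if line.strip())
    return [line for line, n in seen.items() if n > 1]
```

Saving each curl response to a file and passing its contents to `predicate_counts` with `http://w3id.org/lob/concept/1195` as the subject gives one column of the table per server.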
Results are identified in the table as id (rank)
services | wintermute |
---|---|
1195 (3) | 1195 (3) |
1413 (7) | 1413 (2) |
1414 (8) | |
1415 (9) | |
1665 (6) * | |
3035 (6) | |
3061 (10) | 3061 (5) |
3300 (4) | |
3796 (1) | |
4103 (5) | |
4165 (4) | 4165 (1) |
4549 (2) | |
I confirmed that the problem with the data is that it has an actual line break. The \n is not a problem.
PASSES:
<http://w3id.org/lob/concept/1665> <http://www.w3.org/2004/02/skos/core#altLabel> "label with \n in the middle"@en .
FAILS: with error Expected object (found: "\"label with")
<http://w3id.org/lob/concept/1665> <http://www.w3.org/2004/02/skos/core#altLabel> "label with
\n and an actual new line"@en .
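Because N-Triples is line-based, a raw line break inside a literal splits one triple into two incomplete lines. The sketch below is a rough heuristic for spotting such lines in a saved response. It is not the validator or the QA parser, and `find_broken_lines` is a hypothetical name.

```python
from typing import List


def find_broken_lines(ntriples: str) -> List[int]:
    """Return 1-based numbers of lines that do not look like a complete triple.

    Heuristic only: a complete line is assumed to start with an IRI ('<')
    or a blank node ('_:') and to end with '.'.
    """
    broken = []
    for number, line in enumerate(ntriples.splitlines(), start=1):
        stripped = line.strip()
        if not stripped or stripped.startswith("#"):
            continue  # skip blank lines and comments
        starts_ok = stripped.startswith(("<", "_:"))
        ends_ok = stripped.endswith(".")
        if not (starts_ok and ends_ok):
            broken.append(number)
    return broken
```

For the FAILS example above, both halves of the split triple are reported: the first half lacks the terminating `.`, and the second half does not start with a subject.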
ACTION: @eichmann will run indexer in debug mode to see if line breaks are escaped. If needed, escaping of line breaks will be re-established.
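For reference, escaping a literal before it is written as N-Triples could look like the sketch below. This is not the indexer's code. `escape_ntriples_literal` and `label_triple` are hypothetical names, and the input is assumed to be the raw, not yet escaped label text; running it on already escaped text would double the backslashes.

```python
def escape_ntriples_literal(value: str) -> str:
    """Escape raw text for use inside a double-quoted N-Triples literal."""
    replacements = {
        "\\": "\\\\",  # backslash
        '"': '\\"',    # double quote
        "\n": "\\n",   # line feed
        "\r": "\\r",   # carriage return
    }
    return "".join(replacements.get(ch, ch) for ch in value)


def label_triple(subject: str, predicate: str, label: str, lang: str = "en") -> str:
    """Serialize one language-tagged literal triple on a single line."""
    return f'<{subject}> <{predicate}> "{escape_ntriples_literal(label)}"@{lang} .'
```

With this, a label containing an actual line break is emitted on one line with `\n` in place of the break, which is the form shown under PASSES.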
This triple store is small and easy to repair, but it seems like this would be a problem for any authority that has significant quantities of text (e.g. source notes).
ACTION: @sfolsom will look for similar examples in LOC data.
After scanning 1,000 source notes in LCNAF, none of them seem to have line breaks: http://services.ld4l.org/fuseki/dataset.html?tab=query&ds=/loc_names#query=PREFIX+madsrdf%3A+%3Chttp%3A%2F%2Fwww.loc.gov%2Fmads%2Frdf%2Fv1%23%3E%0ASelect+%3Fnote%0A%0AWhere+%7B%3Fs+madsrdf%3AhasSource+%3Fp+.%0A++%3Fp+madsrdf%3Acitation-note+%3Fnote+.%0A%0A%0A%7D%0ALimit+1000%0A
The same holds for the note-like citation source in LCNAF: http://services.ld4l.org/fuseki/dataset.html?tab=query&ds=/loc_names#query=PREFIX+madsrdf%3A+%3Chttp%3A%2F%2Fwww.loc.gov%2Fmads%2Frdf%2Fv1%23%3E%0ASelect+%3Fnote%0A%0AWhere+%7B%3Fs+madsrdf%3AhasSource+%3Fp+.%0A++%3Fp+madsrdf%3Acitation-source+%3Fnote+.%0A%0A%0A%7D%0ALimit+1000%0A
I'll keep looking.
Patch in place.
Confirmed
Browser Access for Test:
https://lookup-int.ld4l.org/authorities/search/linked_data/ligatus_new_ld4l_cache?q=binding&maxRecords=4&context=false
Monitor Status results in UI:
CURL access directly to wintermute
Selection of n-triples...
Running all the n-triples through a validator shows 0 errors.