NationalMuseumAustralia / Collection-API

The public web API of the National Museum of Australia

Solr not converting XML to JSON during load #76

Closed staplegun closed 5 years ago

staplegun commented 5 years ago

When loading object record 228418 into Solr, the raw XML is being loaded into the simple field instead of being converted to JSON (the conversion into the json-ld field is fine).

The cause may be the source field, which carries an empty namespace declaration (xmlns="") and so falls outside the namespace of the surrounding map.

http://13.54.240.226/object/228418?apikey=...

{"data": [

<map xmlns="http://www.w3.org/2005/xpath-functions"
     xmlns:xmljson="tag:conaltuohy.com,2018:nma/xml-to-json"
     xmlns:c="http://www.w3.org/ns/xproc-step"
     xmlns:path="tag:conaltuohy.com,2018:nma/trix-path-traversal"
     xmlns:f="http://www.w3.org/2005/xpath-functions"
     xmlns:map="http://www.w3.org/2005/xpath-functions/map"
     xmlns:trix="http://www.w3.org/2004/03/trix/trix-1/">
   <string key="id">228418</string>
   <string key="type">object</string>
   <array key="additionalType">
      <string>Canoes</string>
   </array>
   <string key="title">Two dowel rods from the double outrigger canoe</string>
   <string key="identifier">IL 2011/0033.0001.001</string>
   <string key="physicalDescription">Two pieces of wooden dowel. One has a blackened end and is shorter than the other.</string>
   <string xmlns="" key="source">Inward loan</string>
   <map key="_meta">
      <string key="modified">2018-06-18</string>
      <string key="hasFormat">http://collectionsearch.nma.gov.au/object/228418</string>
   </map>
</map>
]}
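A quick way to test that suspicion is to list every element that is not in the http://www.w3.org/2005/xpath-functions namespace. The sketch below is an illustration in Python, not part of the project's pipeline; the function name and the abbreviated sample are assumptions made for the example.

```python
import xml.etree.ElementTree as ET

# Namespace that every element of the JSON XML is expected to be in.
FN_NS = "http://www.w3.org/2005/xpath-functions"


def elements_outside_fn_namespace(xml_text):
    """Return (tag, key) pairs for elements not in the XPath functions namespace.

    Hypothetical helper for illustration only.
    """
    root = ET.fromstring(xml_text)
    offenders = []
    for el in root.iter():
        # ElementTree writes namespaced tags as "{namespace}localname";
        # an element reset with xmlns="" has a bare local name instead.
        if not el.tag.startswith("{" + FN_NS + "}"):
            offenders.append((el.tag, el.get("key")))
    return offenders


# Abbreviated copy of the record above (only two of its fields).
sample = """<map xmlns="http://www.w3.org/2005/xpath-functions">
   <string key="id">228418</string>
   <string xmlns="" key="source">Inward loan</string>
</map>"""

print(elements_outside_fn_namespace(sample))  # [('string', 'source')]
```

Only the source element is reported, which is consistent with the empty namespace being what sets this record apart.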
Conal-Tuohy commented 5 years ago

That'll be it

staplegun commented 5 years ago

Actually, this may be a wider issue with the internal API, e.g. http://nma-dev.conaltuohy.com/narrative/?text=defining&apikey=... also returns XML.

Conal-Tuohy commented 5 years ago

Yes, the fallback when the JSON XML is invalid is to store it as is, purely to facilitate debugging. The root cause is presumably in the stylesheet that generates the JSON XML.
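For readers unfamiliar with that pattern, here is a minimal sketch of a "convert, or store as is" fallback. It is written in Python purely for illustration; it is not the loader's actual code, the function names are hypothetical, and the converter handles only the map, array and string elements that appear in this issue.

```python
import json
import xml.etree.ElementTree as ET

FN = "{http://www.w3.org/2005/xpath-functions}"


def to_python(el):
    """Convert a map/array/string element to a Python value (illustrative subset)."""
    if el.tag == FN + "map":
        return {child.get("key"): to_python(child) for child in el}
    if el.tag == FN + "array":
        return [to_python(child) for child in el]
    if el.tag == FN + "string":
        return el.text or ""
    # Anything else, including an element in no namespace, counts as invalid.
    raise ValueError(f"unexpected element {el.tag!r}")


def value_for_simple_field(xml_text):
    """Return JSON text if the input converts cleanly, else the raw XML unchanged."""
    try:
        return json.dumps(to_python(ET.fromstring(xml_text)))
    except (ET.ParseError, ValueError):
        # Fallback: keep the untransformed XML so the failure stays visible.
        return xml_text


good = '<map xmlns="http://www.w3.org/2005/xpath-functions"><string key="id">228418</string></map>'
bad = ('<map xmlns="http://www.w3.org/2005/xpath-functions">'
       '<string xmlns="" key="source">Inward loan</string></map>')

print(value_for_simple_field(good))         # {"id": "228418"}
print(value_for_simple_field(bad) == bad)   # True
```

The trade-off of this design is that an invalid record still loads, but it surfaces later as raw XML in a field that should hold JSON.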

staplegun commented 5 years ago

Solved the odd search results: using /? confuses the API. It sees the slash and assumes the request is for a single object, so it returns all the search results lumped together as a single item, without separating commas.

Added as a separate issue #78

Confirmed that the record containing the inward loan field is an ingest issue: the data in the simple field in Solr should be JSON, but it is untransformed XML:

http://nma-dev.conaltuohy.com/solr/core_nma_internal/select?q=id:object/228418&wt=xml

<response>
<result name="response" numFound="1" start="0">
<doc>
<str name="id">object/228418</str>
...
<arr name="simple">
<str>
<map xmlns="http://www.w3.org/2005/xpath-functions" xmlns:xmljson="tag:conaltuohy.com,2018:nma/xml-to-json" xmlns:c="http://www.w3.org/ns/xproc-step" xmlns:path="tag:conaltuohy.com,2018:nma/trix-path-traversal" xmlns:f="http://www.w3.org/2005/xpath-functions" xmlns:map="http://www.w3.org/2005/xpath-functions/map" xmlns:trix="http://www.w3.org/2004/03/trix/trix-1/"> <string key="id">228418</string> <string key="type">object</string> <array key="additionalType"> <string>Canoes</string> </array> <string key="title">Two dowel rods from the double outrigger canoe</string> <string key="identifier">IL 2011/0033.0001.001</string> <string key="physicalDescription">Two pieces of wooden dowel. One has a blackened end and is shorter than the other.</string> <string xmlns="" key="source">Inward loan</string> <map key="_meta"> <string key="modified">2018-06-18</string> <string key="hasFormat">http://collectionsearch.nma.gov.au/object/228418</string> </map> </map>
</str>
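The same check can be scripted across many records. The sketch below, again in Python and only an illustration, takes the text of a Solr wt=xml response and reports the ids of documents whose simple field does not parse as JSON. The function name and the two-document sample response (with made-up ids) are assumptions for the example.

```python
import json
import xml.etree.ElementTree as ET


def non_json_simple_values(solr_response_xml):
    """Return ids of docs whose 'simple' field holds something other than JSON.

    Hypothetical helper for illustration only.
    """
    root = ET.fromstring(solr_response_xml)
    bad_ids = []
    for doc in root.iter("doc"):
        doc_id = doc.findtext("str[@name='id']")
        for value in doc.findall("arr[@name='simple']/str"):
            try:
                json.loads(value.text or "")
            except json.JSONDecodeError:
                bad_ids.append(doc_id)
    return bad_ids


# Made-up sample: the first doc holds JSON, the second holds escaped XML.
sample_response = """<response>
<result name="response" numFound="2" start="0">
<doc>
<str name="id">object/1</str>
<arr name="simple"><str>{"id": "1"}</str></arr>
</doc>
<doc>
<str name="id">object/2</str>
<arr name="simple"><str>&lt;map&gt;&lt;/map&gt;</str></arr>
</doc>
</result>
</response>"""

print(non_json_simple_values(sample_response))  # ['object/2']
```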
staplegun commented 5 years ago

Closing as the conversion to JSON is fixed, e.g:

The usage of /? is still broken (raised in #78).