NationalMuseumAustralia / Collection-API

The public web API of the National Museum of Australia

Incomplete records in API #84

Closed SimmoK closed 5 years ago

SimmoK commented 5 years ago

I've noticed some incomplete records in the API, like https://data.nma.gov.au/object/45666. It seems to be missing most of the info, and appears the same in the API Explorer. See the JSON response and the source EMu XML for the record, attached.

4566emu 45666api

SimmoK commented 5 years ago

Pls ignore, I note it's restricted now! Will close.

SimmoK commented 5 years ago

Though will this be excluded from public search results?

staplegun commented 5 years ago

Records are OK to appear in the public API if they contain an AcsAPI value of Public or Public Restricted (the image is removed if AcsCCStatus isn't a CC licence or Public Domain).

So this record should be in the public API, but not sure why the other fields are missing (description, materials, parties, measurements, etc.). Maybe it accidentally got caught in the redaction process.
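The access rule described above could be sketched in Python roughly as follows. This is only an illustration, not the ETL's actual code: the record is modelled as a plain dict, the `image` key is hypothetical, and treating any `AcsCCStatus` value that starts with "CC" as a Creative Commons licence is an assumption about how EMu stores that field.

```python
# Minimal sketch of the publication rule, under the assumptions stated above.
PUBLIC_ACCESS_VALUES = {"Public", "Public Restricted"}


def is_public(record: dict) -> bool:
    """A record may appear in the public API if AcsAPI allows it."""
    return record.get("AcsAPI") in PUBLIC_ACCESS_VALUES


def image_allowed(record: dict) -> bool:
    """Assumption: CC licence values start with 'CC'; otherwise Public Domain."""
    status = record.get("AcsCCStatus") or ""
    return status.startswith("CC") or status == "Public Domain"


def publish(record: dict):
    """Return the public view of a record, or None if it must be withheld."""
    if not is_public(record):
        return None
    public_view = dict(record)  # copy so the source record is untouched
    if not image_allowed(record):
        public_view.pop("image", None)  # hypothetical key for the image
    return public_view
```

Under this sketch, a Public Restricted record with a non-CC status keeps all of its descriptive fields and loses only the image.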

Conal-Tuohy commented 5 years ago

Check the SPARQL DESCRIBE query result to see if the data's there
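A check like this can be scripted with the standard SPARQL protocol (a GET request with a `query` parameter). The sketch below is a minimal Python example; the endpoint URL and dataset name are placeholders, not the museum's real Fuseki address, and the object IRI follows the form used later in this thread.

```python
from urllib.parse import urlencode
from urllib.request import Request, urlopen

# Placeholder endpoint: the real Fuseki host and dataset name are not given here.
ENDPOINT = "http://localhost:3030/dataset/query"


def describe_query(iri: str) -> str:
    """Build a SPARQL DESCRIBE query for a single resource."""
    return f"DESCRIBE <{iri}>"


def describe_url(endpoint: str, iri: str) -> str:
    """Encode the query as a SPARQL-protocol GET URL."""
    return f"{endpoint}?{urlencode({'query': describe_query(iri)})}"


def fetch_description(endpoint: str, iri: str) -> str:
    """Fetch the description as Turtle (requires a reachable endpoint)."""
    request = Request(describe_url(endpoint, iri), headers={"Accept": "text/turtle"})
    with urlopen(request) as response:
        return response.read().decode("utf-8")
```

Calling `fetch_description(ENDPOINT, "http://data.nma.gov.au/object/45666#")` against a real endpoint would return the stored triples, which can then be compared with what the API serves.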

staplegun commented 5 years ago

The full record is stored correctly in Fuseki (DESCRIBE <http://data.nma.gov.au/object/45666#>), including production, materials, image, etc. This object record does NOT have a Piction image (just the above EMu image).

However, the record is missing these fields in the internal API in both simple and JSON-LD formats:

https://data.nma.gov.au/object/45666?apiKey=XXX&format=json-ld
https://data.nma.gov.au/object/45666?apiKey=XXX&format=simple

(The public API should at least include materials, etc. anyway.)

So it would appear to be an issue in the TriX redaction step.

Conal-Tuohy commented 5 years ago

The bug was indeed in the redaction step. We were trying to slim down the description of objects which were aggregated in a narrative, leaving behind only a few key properties of those objects. However, when the main resource being described was itself an object, and it was part of a narrative, then that object's properties were being cut down, too. In reality we only ever wanted to redact the properties of objects if they were secondary objects which were just part of a narrative, and we never wanted to redact the properties of the primary object itself.

Fixed in https://github.com/NationalMuseumAustralia/Collection-API-ETL/commit/9bc498d62ccd1ffc30e0826510637cdaae780e43
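To illustrate the distinction described above, here is a small Python sketch of the corrected rule. It is not the code from the linked commit: the dict-based resource model, the `KEY_PROPERTIES` set and the `narrative_members` argument are all hypothetical stand-ins for the real redaction step.

```python
# Hypothetical set of properties kept for slimmed-down secondary objects.
KEY_PROPERTIES = {"id", "type", "title"}


def slim(resource: dict) -> dict:
    """Keep only a few key properties of a resource."""
    return {k: v for k, v in resource.items() if k in KEY_PROPERTIES}


def redact(resources: list, primary_id: str, narrative_members: set) -> list:
    """Slim objects aggregated in a narrative, but never the primary resource.

    The bug described above corresponds to omitting the is_secondary check,
    so a primary object that belonged to a narrative was slimmed as well.
    """
    redacted = []
    for resource in resources:
        in_narrative = resource["id"] in narrative_members
        is_secondary = resource["id"] != primary_id
        if in_narrative and is_secondary:
            redacted.append(slim(resource))
        else:
            redacted.append(resource)
    return redacted
```

With this rule, requesting an object that is part of a narrative returns its full description, while the same object appearing inside a narrative's description is still cut down.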

staplegun commented 5 years ago

Works in production now

f27wood commented 5 years ago

Example record tested AOK in prod, waiting for full ETL load to check other records.

f27wood commented 5 years ago

All seems AOK in prod. Is there a way to automate this testing to confirm?
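One possible way to automate such a check is sketched below in Python. It assumes the API responses have already been parsed into dicts, and the field names in `EXPECTED_FIELDS` are illustrative guesses based on the fields mentioned earlier in this thread, not the API's confirmed JSON keys.

```python
# Illustrative field names only; the real JSON keys may differ.
EXPECTED_FIELDS = ("description", "materials", "parties", "measurements")


def missing_fields(record: dict, expected=EXPECTED_FIELDS) -> list:
    """List expected fields that are absent or empty in a record."""
    return [field for field in expected if not record.get(field)]


def incomplete_records(records: list) -> dict:
    """Map record id to its missing fields, for incomplete records only."""
    report = {}
    for record in records:
        missing = missing_fields(record)
        if missing:
            report[record.get("id", "<no id>")] = missing
    return report
```

Not every object legitimately has every field, so a report like this would flag candidates for manual review rather than prove a bug.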

f27wood commented 5 years ago

As discussed, closing this as we are confident it has been fixed for objects with narratives.