Closed shelleydoljack closed 1 month ago
If getting JSON data from purl fails or writing to the table fails, then add the druid to a failures dict for retry:
"failures": [{"druid": "cannot fetch"}, {"druid": "cannot insert into table (missing req'd field)"}]
Successful writes to the table should report out new and updated data, added to XCOM with something like this:
{
  "successes": {
    "new": [
      {
        "fund_name": "ABBOTT",
        "druid": "ab123cd4567",
        "filename": "image_ab123cd4567.jp2",
        "title": "Title"
      }
    ],
    "updated": [
      {
        "fund_name": "ABBOTT",
        "druid": "ab123cd4567",
        "filename": "image_ab123cd4567.jp2",
        "title": "Changed Title",
        "reason": "title changed"
      }
    ]
  }
}
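A minimal sketch of how a task might assemble that payload, assuming the parsed rows arrive as plain dicts and the current table contents are available as a dict keyed by druid. The function name `build_report`, the `REQUIRED` tuple, and the choice to key each failure entry by the actual druid are all assumptions for illustration, not part of the existing code.

```python
from typing import Any

# Hypothetical: which fields must be present before a row can be inserted.
REQUIRED = ("fund_name", "druid", "filename")
# Fields compared to decide whether an existing row counts as "updated".
COMPARED = ("fund_name", "druid", "filename", "title")


def build_report(
    parsed: list[dict[str, Any]],
    existing: dict[str, dict[str, Any]],
    fetch_failures: dict[str, str],
) -> dict[str, Any]:
    """Sort parsed rows into new/updated successes and collect failures."""
    report: dict[str, Any] = {
        "successes": {"new": [], "updated": []},
        "failures": [{druid: reason} for druid, reason in fetch_failures.items()],
    }
    for row in parsed:
        missing = [field for field in REQUIRED if not row.get(field)]
        if missing:
            reason = f"cannot insert into table (missing {', '.join(missing)})"
            report["failures"].append({row.get("druid") or "unknown": reason})
            continue
        old = existing.get(row["druid"])
        if old is None:
            report["successes"]["new"].append(dict(row))
            continue
        changed = [f for f in COMPARED if old.get(f) != row.get(f)]
        if changed:
            updated = {**row, "reason": f"{', '.join(changed)} changed"}
            report["successes"]["updated"].append(updated)
    return report
```

Returning this dict from a TaskFlow task would push it to XCOM as the task's return value; rows that match the table exactly are left out of both lists in this sketch.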
I asked Andrew if the first entry in the structural.contains.structural.contains list in the public Cocina JSON was equivalent to the XML path contentMetadata/resource[@sequence='1']/file/@id (where we used to get the image filename), and his response was:
Yes, that’s correct. The JSON doesn’t store the sequence number as a specific field but the order in the JSON is equivalent to the sequence.
So we should get the image filename from that part of the JSON.
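A rough sketch of that lookup, assuming the public Cocina JSON has already been loaded into a dict and that each file entry carries a `filename` key (the key name is an assumption here and should be checked against a real purl document). The helper name `first_image_filename` is hypothetical.

```python
from typing import Any, Optional


def first_image_filename(cocina: dict[str, Any]) -> Optional[str]:
    """Return the filename of the first file in the first file set, if any.

    JSON order stands in for the old XML sequence number, so index 0
    plays the role of resource[@sequence='1'].
    """
    try:
        first_file_set = cocina["structural"]["contains"][0]
        first_file = first_file_set["structural"]["contains"][0]
        return first_file["filename"]
    except (KeyError, IndexError, TypeError):
        # Missing or empty structure: let the caller report it as a failure.
        return None
```

Returning `None` rather than raising keeps the decision about reporting a missing filename with the calling task.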
Blocked by #1175

For each member druid of the digital bookplate collection, look up the public JSON document at purl.stanford.edu and parse out the fields we will store in our bookplates table. This Bookplates class from our old process should be used as a guide to the fields we need from the JSON. The output should be parsed data sent to a task that will store it in the bookplates table. If the public JSON document does not contain all the fields necessary for us to create 979's (a missing image filename, for instance), pass this along via XCOM to be reported in an email.

Example JSON for the ABBOTT bookplate. The fields we need:

fund name: description.identifier.value where displayLabel="Symphony Fund Name"
druid: the externalIdentifier field: "externalIdentifier": "druid:ws066yy0421",
image filename: This appears in the description.identifiers list and the structural.contains list. Per Andrew, from our previous process we used the <contentMetadata> to get to the image filename. We should pull from the JSON equivalent, so from the structural.contains.structural.contains list: I think we will always take the first in those lists. I will confirm.
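As an illustration of the fund name and druid lookups, here is a small sketch that works on an already-loaded JSON dict. It assumes the identifiers live in a list at `description.identifier` and that the druid should be stored without its `druid:` prefix (as in the XCOM example); the function name `parse_identifiers` is hypothetical, and the image filename lookup is left out.

```python
from typing import Any, Optional

FUND_LABEL = "Symphony Fund Name"


def parse_identifiers(
    cocina: dict[str, Any],
) -> tuple[dict[str, Optional[str]], list[str]]:
    """Pull fund name and druid from public Cocina JSON.

    Returns the parsed record plus the names of any fields that were
    not found, so the caller can pass them along for the email report.
    """
    fund_name = None
    # Assumed path: description.identifier is a list of identifier dicts.
    for ident in cocina.get("description", {}).get("identifier", []):
        if ident.get("displayLabel") == FUND_LABEL:
            fund_name = ident.get("value")
            break

    external = cocina.get("externalIdentifier")
    # "druid:ws066yy0421" -> "ws066yy0421"; a bare druid passes through.
    druid = external.split(":", 1)[-1] if external else None

    record = {"fund_name": fund_name, "druid": druid}
    missing = [name for name, value in record.items() if not value]
    return record, missing
```

A caller could treat a non-empty `missing` list as the signal to add the druid to the failures reported by email instead of sending the record on to the table task.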