Orbis-Cascade-Alliance / harvester

XForms-based OAI-PMH harvester for Orbis Cascade. Metadata are transformed into RDF and posted into a triplestore for access from finding aids.

Reed thumbnails #93

Closed jallibunn closed 7 years ago

jallibunn commented 7 years ago

I am not seeing thumbnails for Reed in any of the Primo versions of their sets. They came late on the scene with a brand-new DAM; email threads show that we minted their system URL in the harvester, but the thumbnails are not coming through.

They have three sets in:

- Thomas Lamb Eliot Papers
- Antiquarian Maps
- Nicholas Wheeler Physics Lectures (this set is Primo only)

@cwyant-alliance

ewg118 commented 7 years ago

ORPR is listed as an "other" DAMS in the config, but there's no XSLT in the OAI-PMH->RDF to handle their image URLs.

jallibunn commented 7 years ago

OK, what do you need from Reed to complete the config? @cwyant-alliance

ewg118 commented 7 years ago

Here's what I've found: the dc:description contains a link to the thumbnail, e.g., https://rdc.reed.edu/v1/resources/0bd37d70-07da-40f4-8412-1855738352f9/thumb/128.jpg

If you crop off the part from /thumb onward, you'll get the master TIF file. If you then append .jpg, you'll get a full-resolution compressed JPEG: https://rdc.reed.edu/v1/resources/0bd37d70-07da-40f4-8412-1855738352f9.jpg. This can be the reference edm:WebResource.
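To make the URL pattern concrete, here is a minimal Python sketch. It is an illustration only, not the harvester's actual transformation code, and the function name `derive_urls` and the dictionary keys are hypothetical.

```python
def derive_urls(thumbnail_url: str) -> dict:
    """Derive the master and full-resolution JPEG URLs from a thumbnail URL.

    Assumes the thumbnail URL contains '/thumb', as in the Reed example.
    """
    base, sep, _ = thumbnail_url.partition("/thumb")
    if not sep:
        raise ValueError("not a thumbnail URL: " + thumbnail_url)
    return {
        "thumbnail": thumbnail_url,
        "master": base,              # everything before /thumb
        "full_jpeg": base + ".jpg",  # compressed full-resolution JPEG
    }


urls = derive_urls(
    "https://rdc.reed.edu/v1/resources/"
    "0bd37d70-07da-40f4-8412-1855738352f9/thumb/128.jpg"
)
print(urls["master"])
print(urls["full_jpeg"])
```

The sketch raises an error for URLs without /thumb rather than guessing, since the pattern has only been observed for image records.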

Can you check with Reed and ask if they have anything other than image files? This pattern will likely not work for audio, video, PDF, etc. The precise format of the master digital file cannot be derived from the OAI-PMH feed.

jallibunn commented 7 years ago

You bet!

jallibunn commented 7 years ago

Response from Reed: We might have added the thumbnail capability after I submitted the sets, and I never updated. I just resubmitted all of the sets now--I don't know if that would make a difference.

I'm not certain what the ideal mechanism for providing thumbnails would be. Anneliese said to put a direct link to a thumbnail in the description field, but it sounds like Ethan is expecting something else. Here are the directions we were following: "this is my recommendation: for each record, include the full thumbnail path in its own dc:description field. If the thumbnail url is the only value in a description field, the url will not be mapped to dcterms:description nor dc:description. Instead it will be extracted and mapped to edm:WebResource for DPLA and to dc:relation.hasVersion for Primo."
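As a rough illustration of the quoted rule (a description field whose only value is a URL is pulled out instead of being kept as descriptive text), here is a small Python sketch. The function name, the regular expression, and the sample values are assumptions for illustration; the harvester's real logic may differ.

```python
import re

# Matches a value that consists of a single http(s) URL and nothing else.
URL_ONLY = re.compile(r"^\s*https?://\S+\s*$")


def split_descriptions(descriptions):
    """Separate URL-only dc:description values from ordinary text values."""
    thumbnails, texts = [], []
    for value in descriptions:
        if URL_ONLY.match(value):
            thumbnails.append(value.strip())
        else:
            texts.append(value)
    return thumbnails, texts


thumbs, texts = split_descriptions([
    "Hand-colored map of the Pacific Northwest.",  # hypothetical text value
    "https://rdc.reed.edu/v1/resources/"
    "0bd37d70-07da-40f4-8412-1855738352f9/thumb/128.jpg",
])
print(thumbs)
print(texts)
```

A description that mixes prose with a URL stays in the text list, which matches the "only value" wording of the recommendation.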

So, to answer the question: yes, we would have resources other than images, but I suppose we're assuming we would always generate a jpg thumbnail. For example, the Wheeler set consists of PDFs, but the path in the description is for a jpg.

Ethan, where does this information leave us?

@cwyant-alliance

jallibunn commented 7 years ago

Ethan, please address this question. Thanks! @cwyant-alliance

ewg118 commented 7 years ago

Okay, here's what I've done.

  1. The dc:description that contains a .jpg URL is always the thumbnail, and its format is always image/jpeg.
  2. They seem consistent in using dc:type across sets to link to DCMI Type. If the dc:type is Image or StillImage, or if Image or StillImage has been selected as the type in the harvester drop-down menu, the edm:object is the dc:description URI up to (but not including) /thumb, with .jpg appended. Its format is always image/jpeg.
  3. The edm:isShownAt for the full file, regardless of format, is the dc:description URI before /thumb. The format cannot be ascertained from the OAI-PMH feed, so one is applied only if a format has been selected in the drop-down menu or a dc:format in the OAI-PMH record contains the MIME type (there is no dc:format in these feeds).
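The three rules above can be sketched in Python as follows. This is an illustrative approximation under stated assumptions, not the harvester's code: `map_record`, its parameters, and the returned dictionary layout are all hypothetical, and dc:type values are assumed to be either plain labels or DCMI Type URIs ending in the label.

```python
IMAGE_TYPES = {"Image", "StillImage"}


def map_record(descriptions, dc_types, selected_type=None,
               selected_format=None, dc_format=None):
    """Apply the three mapping rules to one record (hypothetical layout)."""
    # Rule 1: the description ending in .jpg is the thumbnail (image/jpeg).
    thumb = next((d.strip() for d in descriptions
                  if d.strip().endswith(".jpg")), None)
    if thumb is None:
        return {}
    base = thumb.partition("/thumb")[0]

    # Rule 3: the full file is the URI before /thumb; a format is set only
    # if it was chosen in the drop-down menu or supplied via dc:format.
    result = {
        "thumbnail": {"uri": thumb, "format": "image/jpeg"},
        "isShownAt": {"uri": base, "format": selected_format or dc_format},
    }

    # Rule 2: image types also get an edm:object (full-resolution JPEG).
    types = {t.rstrip("/").rsplit("/", 1)[-1] for t in dc_types}
    if selected_type:
        types.add(selected_type)
    if types & IMAGE_TYPES:
        result["object"] = {"uri": base + ".jpg", "format": "image/jpeg"}
    return result


thumb_url = ("https://rdc.reed.edu/v1/resources/"
             "0bd37d70-07da-40f4-8412-1855738352f9/thumb/128.jpg")
image_record = map_record([thumb_url],
                          ["http://purl.org/dc/dcmitype/StillImage"])
text_record = map_record([thumb_url], ["http://purl.org/dc/dcmitype/Text"])
print("object" in image_record, "object" in text_record)
```

In this sketch a record typed as Text gets a thumbnail and an edm:isShownAt but no edm:object, and its format stays empty unless one is passed in.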

I have found that they have selected dc:Text in cases where they have scanned manuscripts. The file format is still a TIFF or JPEG, but there's no way to predict this without passing in a format from a drop-down menu in the harvester interface.

ewg118 commented 7 years ago

The sets will have to be re-harvested for the images to appear.

jallibunn commented 7 years ago

Thank you! J

On Sep 22, 2017, at 4:00 PM, Ethan Gruber notifications@github.com wrote:

The sets will have to be re-harvested for the images to appear.
