Princeton-CDH / derrida-django

Derrida's Margins - Python/Django web application
https://derridas-margins.princeton.edu
Apache License 2.0

As a researcher, I want access to images and annotation zones in the intervention dataset so that I can see the context and do more interesting work. #263

Closed: kmcelwee closed this issue 3 years ago

kmcelwee commented 3 years ago

Dev Notes

add these fields to the intervention_data export:

kmcelwee commented 3 years ago

Questions

TODO

@kmcelwee

@rlskoeser

rlskoeser commented 3 years ago

let's use this syntax for iiif image url:

https://derridas-margins.princeton.edu/library/abraham-oeuvres-completes-1966/gallery/images/pp-312-313-insertion-a-verso/iiif/full/full/0/default.jpg

We must have had the wrong syntax when we were trying the percent region before (is this a bug in piffle?).

Here's a test region specified by percentages that returned an image for me: https://derridas-margins.princeton.edu/library/abraham-oeuvres-completes-1966/gallery/images/pp-312-313-insertion-a-verso/iiif/pct:10,10,20,20/full/0/default.jpg

If you can test using that syntax with some actual annotation zones from the db and get the region you expect then I think you can proceed with that.
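For reference, the two URLs above follow the IIIF Image API path order of region, size, rotation, then quality and format. A minimal sketch of a builder for that pattern is below; the function name `iiif_image_url` and its parameter names are hypothetical and not taken from the derrida-django codebase.

```python
SITE = "https://derridas-margins.princeton.edu"


def iiif_image_url(work_slug, image_slug, region="full", size="full",
                   rotation=0, quality="default", fmt="jpg"):
    """Build an image URL in the syntax proposed in this thread.

    Hypothetical helper: the path layout is copied from the example
    URLs above, not from the site's urls.py.
    """
    return (
        f"{SITE}/library/{work_slug}/gallery/images/{image_slug}"
        f"/iiif/{region}/{size}/{rotation}/{quality}.{fmt}"
    )


# Full page image
full_page = iiif_image_url(
    "abraham-oeuvres-completes-1966", "pp-312-313-insertion-a-verso"
)

# Region given as percentages: x, y, width, height
test_region = iiif_image_url(
    "abraham-oeuvres-completes-1966",
    "pp-312-313-insertion-a-verso",
    region="pct:10,10,20,20",
)
```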

kmcelwee commented 3 years ago

I was able to use the existing syntax in urls.py to generate a /large/ mode. For the provided IIIF syntax, I tested about 90 URLs to check their status codes (I didn't want to go through all 1k). Of the 90 tested, the following six returned server errors:

https://derridas-margins.princeton.edu/library/jakobson-fundamentals-of-language-1963/gallery/images/p-116/iiif/full/full/0/default.jpg
https://derridas-margins.princeton.edu/library/derathe-le-rationalisme-de-j-j-rousseau-1948/gallery/images/p-18/iiif/full/full/0/default.jpg
https://derridas-margins.princeton.edu/library/hegel-enzyklopadie-der-philosophischen-wissenschaften-im-grundrisse-1952/gallery/images/p255/iiif/full/full/0/default.jpg
https://derridas-margins.princeton.edu/library/bloch-lecriture-et-la-psychologie-des-peuples-actes-de-colloque-1963/gallery/images/p-110/iiif/full/full/0/default.jpg
https://derridas-margins.princeton.edu/library/hegel-enzyklopadie-der-philosophischen-wissenschaften-im-grundrisse-1952/gallery/images/p257/iiif/full/full/0/default.jpg
https://derridas-margins.princeton.edu/library/descartes-discours-de-la-methode-1961/gallery/images/p-116/iiif/full/full/0/default.jpg

The same links with a /large/ suffix, however, provided 200 responses:

https://derridas-margins.princeton.edu/library/jakobson-fundamentals-of-language-1963/gallery/images/p-116/large/
https://derridas-margins.princeton.edu/library/derathe-le-rationalisme-de-j-j-rousseau-1948/gallery/images/p-18/large/
https://derridas-margins.princeton.edu/library/hegel-enzyklopadie-der-philosophischen-wissenschaften-im-grundrisse-1952/gallery/images/p255/large/
https://derridas-margins.princeton.edu/library/bloch-lecriture-et-la-psychologie-des-peuples-actes-de-colloque-1963/gallery/images/p-110/large/
https://derridas-margins.princeton.edu/library/hegel-enzyklopadie-der-philosophischen-wissenschaften-im-grundrisse-1952/gallery/images/p257/large/
https://derridas-margins.princeton.edu/library/descartes-discours-de-la-methode-1961/gallery/images/p-116/large/

Did I misunderstand the syntax? We don't have a way to get these links through reverse, correct? My current code is stubbed under 762da6b80584e8c008855e7e99f898d420082a14.
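A status check like the one described above could be sketched roughly as follows. This is not the code from the commit mentioned; `http_status` and `failing_urls` are hypothetical names, and the sketch uses only the standard library. Connection-level failures (`urllib.error.URLError`) are deliberately not caught here.

```python
import urllib.error
import urllib.request


def http_status(url, timeout=30):
    """Return the HTTP status code for a GET request to url."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as response:
            return response.status
    except urllib.error.HTTPError as err:
        # urlopen raises HTTPError for 4xx/5xx; the code is on the exception
        return err.code


def failing_urls(urls, get_status=http_status):
    """Map each URL that did not return 200 to the status it returned."""
    failures = {}
    for url in urls:
        status = get_status(url)
        if status != 200:
            failures[url] = status
    return failures
```

Passing `get_status` as a parameter keeps the loop testable without network access, for example by substituting a dictionary lookup for the real request.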

kmcelwee commented 3 years ago

It appears that the IIIF API has a problem with decimal percent values. If we use str(intervention.iiif_image_selection()) as a starting point to build the URL for the syntax above, we get something like:

https://derridas-margins.princeton.edu/library/levi-strauss-tristes-tropiques-1955/gallery/images/p-264/iiif/pct:19.48,53.45,76.92,30.73/full/0/default.jpg

But that URL returns the following error:

Error: Unhandled transformation error: Expected integer for top but received NaN of type number

The error goes away when we drop the decimals and use integers: https://derridas-margins.princeton.edu/library/levi-strauss-tristes-tropiques-1955/gallery/images/p-264/iiif/pct:19,53,76,30/full/0/default.jpg
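A minimal sketch of that workaround is below. It assumes the region is available as a string of the form `pct:x,y,w,h`; whether `iiif_image_selection()` returns exactly that is an assumption, and `integer_pct_region` is a hypothetical name. Note that the working URL above truncates rather than rounds to nearest (76.92 becomes 76), so the sketch truncates to reproduce it.

```python
def integer_pct_region(region):
    """Convert a decimal pct: region to integer percentages.

    Truncates toward zero, matching the working URL in this thread.
    """
    prefix, _, values = region.partition(":")
    if prefix != "pct" or not values:
        raise ValueError(f"expected a pct: region, got {region!r}")
    parts = [int(float(value)) for value in values.split(",")]
    if len(parts) != 4:
        raise ValueError(f"expected four values, got {len(parts)}")
    return "pct:" + ",".join(str(part) for part in parts)


region = integer_pct_region("pct:19.48,53.45,76.92,30.73")
```

Truncating the width and height trims the selected region slightly, so it is worth spot-checking a few annotation zones visually against the expected region.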

rlskoeser commented 3 years ago

Export looks good. Image urls are great — I tested a handful of both the full page and the annotated region and they resolved and look correct.

Now that I'm browsing this, I see that the "ink" field we added is redundant, since the ink color and pencil are already included in the tags. Do you mind removing it?

kmcelwee commented 3 years ago

@rlskoeser

rlskoeser commented 3 years ago

@kmcelwee let's track revisions for data integrity & publication on the new issue, since we've set out what we intended to do for this story (access to images & annotation zones).

What do you think about removing ink? I'm ok with including that in the cleanup & other revisions you're flagging as part of the publication prep.

Confirming the links are 200s is a great idea; asana task makes sense to me. We will probably need to run it more than once — we want to check the dataset before we publish it, based on the current site; and then when we're working on the archive site we'll want to make sure all the urls referenced in the published datasets are included in the captured version of the site or otherwise handled (e.g. via nginx redirects).

kmcelwee commented 3 years ago

@rlskoeser

> What do you think about removing ink? I'm ok with including that in the cleanup & other revisions you're flagging as part of the publication prep.

I just double-checked, and the ink information is always in the tags. I don't think it's important enough to separate out, and it would be easy for a researcher to get that information from the tags anyway. I agree it's redundant and should be removed.

> Confirming the links are 200s is a great idea; asana task makes sense to me.

I'll make two Asana tasks.
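For the record, the ink/tags double-check mentioned above could be sketched like this. The export format is not shown in this thread, so the row shape is an assumption: each row is treated as a dict with list-valued `ink` and `tags` fields, and the function name and sample values are hypothetical.

```python
def rows_with_ink_missing_from_tags(rows):
    """Return rows whose ink values are not all present in their tags.

    Assumes each row is a dict with list-valued "ink" and "tags" fields.
    """
    missing = []
    for row in rows:
        ink_values = set(row.get("ink", []))
        tag_values = set(row.get("tags", []))
        # ink is redundant only if every ink value also appears in tags
        if not ink_values <= tag_values:
            missing.append(row)
    return missing


# Made-up sample rows for illustration only
sample_rows = [
    {"id": 1, "ink": ["black ink"], "tags": ["black ink", "underlining"]},
    {"id": 2, "ink": ["pencil"], "tags": ["underlining"]},
]
unmatched = rows_with_ink_missing_from_tags(sample_rows)
```

An empty result on the real export would confirm that the ink field can be dropped without losing information.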

rlskoeser commented 3 years ago

I propose we track removing the ink field on the issue for the data package & related cleanup work, so we can close this issue.

kmcelwee commented 3 years ago

@rlskoeser Great, sounds good. It's all there: https://github.com/Princeton-CDH/derrida-django/issues/277