As a researcher, I want to see data about known interventions so I can inspect and analyze Derrida's non-textual annotation practices.

rlskoeser commented 3 years ago

As we worked on the project we expanded from annotations to "interventions" to include Derrida's insertions; we never got to the data work for that in terms of structured intervention records in the database, but the naming conventions used for labeling the digital editions track them. It would be really neat to expose that information in a parallel dataset to accompany the existing intervention/annotation dataset.

All the information is available in our database by way of the canvas labels with the word "insertion"; I think the naming convention will also let us group images of the same item (whether front, back, or more images for a pamphlet). Canvases are all part of manifests, which are connected to digital editions, so we can link them to the book metadata.
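A rough sketch of the label-based lookup described above, in plain Python rather than against the project's actual models. The dictionary keys (`label`, `manifest`), the sample labels, and the manifest ids are all hypothetical stand-ins; the real records and the exact naming convention would need to be checked in the database.

```python
from collections import defaultdict


def find_insertions(canvases):
    """Return the canvases whose label mentions an insertion (case-insensitive)."""
    return [c for c in canvases if "insertion" in c["label"].lower()]


def group_by_manifest(canvases):
    """Collect canvas labels under the manifest each canvas belongs to."""
    grouped = defaultdict(list)
    for canvas in canvases:
        grouped[canvas["manifest"]].append(canvas["label"])
    return dict(grouped)


# Hypothetical sample data: labels and manifest ids are invented for illustration.
sample_canvases = [
    {"label": "pp. 146-147 Insertion A recto", "manifest": "m1"},
    {"label": "p. 12", "manifest": "m1"},
    {"label": "pp. 168-169 insertion B", "manifest": "m2"},
]

insertions = find_insertions(sample_canvases)
by_manifest = group_by_manifest(insertions)
```

Grouping by manifest is only the first step; grouping images of the same physical item would depend on how the labels encode that item, which is not spelled out here.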

An additional benefit of doing this: copyright is simpler for the insertions than the texts, since PUL has the rights to the insertions and the images of them.

I don't think we have enough information about these to include them in the intervention export, which is why I suggest we create a second one.

kmcelwee commented 3 years ago

Writing a script will be sufficient.

kmcelwee commented 3 years ago

@rlskoeser

Page data here is again giving us problems. Here are the three values that are the most iffy:

pp. 148-[pp.148-149 Divider 1 recto]
0000pp. 146-147
[210pp. 168-169

I don't know what the best way to handle this is if we're being strict about not changing existing data 🤷‍♀️ let me know what you think is best.
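One way to surface values like the three above without editing any data is a pattern check over the "pages" column. This is a minimal sketch that assumes well-formed values look like `p. 12` or `pp. 146-147`; that pattern is a guess at the convention, not something confirmed for the whole export, and the function name is hypothetical.

```python
import re

# Assumed shape of a well-formed value: "p. 12" or "pp. 146-147".
PAGE_PATTERN = re.compile(r"pp?\. \d+(?:-\d+)?")


def flag_iffy_pages(values):
    """Return the values that do not fully match the assumed page pattern."""
    return [v for v in values if not PAGE_PATTERN.fullmatch(v)]


values = [
    "pp. 148-[pp.148-149 Divider 1 recto]",
    "0000pp. 146-147",
    "[210pp. 168-169",
    "pp. 146-147",
]
flagged = flag_iffy_pages(values)
```

A flagged value is not necessarily wrong. The check only narrows down which labels need a human look before deciding whether to correct them at the source.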

rlskoeser commented 3 years ago

thanks for flagging @kmcelwee — some of these look like they are just typos, will investigate

rlskoeser commented 3 years ago

(for some reason I feel more comfortable with changing image labels than project data! am I being inconsistent?)

kmcelwee commented 3 years ago

@rlskoeser Whatever gets the datasets across the finish line! Since I would feel uncomfortable publishing that data, I'm certainly in support of changing image labels

rlskoeser commented 3 years ago

@kmcelwee 0000pp. 146-147 and [210pp. 168-169 were data entry errors and I have corrected them in Figgy. I'll add deploy notes about updating the manifests.

The label with the divider seems to be accurate — it looks like that's what they came up with for describing insertions that appear next to an unnumbered divider between pages 148 and 149.

LMK if you find any other brackets or page label errors in this export.

kmcelwee commented 3 years ago

✅ When Figgy was updated on QA, all the values in the "pages" column looked semantically consistent.

Updated versions of insertions.csv and insertions.json were shared to the Google Drive.

Princeton-CDH / derrida-django

As a researcher, I want to see data about known interventions so I can inspect and analyze Derrida's non-textual annotation practices. #264