Closed by rlskoeser 3 years ago
Writing a script will be sufficient.
@rlskoeser Page data here is again giving us problems. Here are the three values that are the most iffy:
pp. 148-[pp.148-149 Divider 1 recto]
0000pp. 146-147
[210pp. 168-169
I don't know what the best way to handle this is if we're being strict about not changing existing data 🤷‍♀️ let me know what you think is best.
thanks for flagging @kmcelwee — some of these look like they are just typos, will investigate
(for some reason I feel more comfortable with changing image labels than project data! am I being inconsistent?)
@rlskoeser Whatever gets the datasets across the finish line! Since I would feel uncomfortable publishing that data, I'm certainly in support of changing image labels
@kmcelwee "0000pp. 146-147" and "[210pp. 168-169" were data entry errors and I have corrected them in Figgy. I'll add deploy notes about updating the manifests.
The label with the divider seems to be accurate — it looks like that's what they came up with for describing insertions that appear next to an unnumbered divider between pages 148 and 149.
LMK if you find any other brackets or page label errors in this export.
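A quick way to scan an export for other odd page labels, as a minimal sketch: the "clean" pattern below is only a guess based on the values quoted in this thread, not a documented convention, and `suspicious_page_labels` is a hypothetical helper name. Anything it flags still needs a human look, since labels like the divider one can be unusual but accurate.

```python
import re

# Assumed "clean" shape: "p. 12" or "pp. 146-147". This is a guess from the
# values quoted in this thread, not a documented labeling convention.
CLEAN_PAGES = re.compile(r"^pp?\. \d+(-\d+)?$")


def suspicious_page_labels(values):
    """Return the values that do not match the assumed clean pattern."""
    return [v for v in values if not CLEAN_PAGES.match(v.strip())]


# Sample values taken from this thread, plus one well-formed label.
sample = [
    "pp. 146-147",
    "0000pp. 146-147",
    "[210pp. 168-169",
    "pp. 148-[pp.148-149 Divider 1 recto]",
]
flagged = suspicious_page_labels(sample)
for value in flagged:
    print(value)
```

This flags all three values listed above, including the divider label, so the output is a review list rather than a list of confirmed errors.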
✅ When Figgy was updated on QA, all the values in the "pages" column looked semantically consistent.
An updated version of insertions.csv and insertions.json was shared to the Google Drive.
As we worked on the project we expanded from annotations to "interventions" to include Derrida's insertions; we never got to the data work for that in terms of structured intervention records in the database, but the naming conventions used for labeling the digital editions track them. It would be really neat to expose that information in a parallel dataset to accompany the existing intervention/annotation dataset.
All the information is available in our database by way of the canvas labels with the word "insertion"; I think the naming convention will also let us group images of the same item (whether front, back, or more images for a pamphlet). Canvases are all part of manifests, which are connected to digital editions, so we can link them to the book metadata.
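Roughly what that filtering and grouping could look like, as a sketch only: the dict keys `manifest` and `label`, the sample labels, and the idea that a trailing "recto"/"verso" marks the side of an item are all assumptions for illustration. The real canvas model and naming convention may differ.

```python
import re
from collections import defaultdict

# Hypothetical trailing side markers; the real labeling convention may differ.
SIDE = re.compile(r"\s+(recto|verso)$", re.IGNORECASE)


def group_insertions(canvases):
    """Group canvases whose label mentions "insertion" by manifest and item.

    `canvases` is an iterable of dicts with the hypothetical keys
    "manifest" and "label".
    """
    groups = defaultdict(list)
    for canvas in canvases:
        label = canvas["label"].strip()
        if "insertion" not in label.lower():
            continue  # not an insertion image
        item = SIDE.sub("", label)  # drop the side marker to get the item name
        groups[(canvas["manifest"], item)].append(label)
    return dict(groups)


# Made-up sample data, only to show the shape of the result.
sample = [
    {"manifest": "m1", "label": "pp. 10-11 Insertion A recto"},
    {"manifest": "m1", "label": "pp. 10-11 Insertion A verso"},
    {"manifest": "m1", "label": "p. 12"},
    {"manifest": "m2", "label": "pp. 10-11 Insertion A recto"},
]
grouped = group_insertions(sample)
for (manifest, item), labels in grouped.items():
    print(manifest, item, len(labels))
```

Keying on the manifest as well as the item name keeps same-named insertions in different books apart, which matters for linking back to the book metadata.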
Additional benefit for doing this: copyright is simpler for the insertions than the texts, since PUL has the rights to the insertions and the images of them.
I don't think we have enough information about these to include them in the intervention export, which is why I suggest we create a second one.