Closed kmcelwee closed 2 years ago
references.csv or de-la-grammatologie_references.csv?

URI | Title | Short Title |
---|---|---|
https://derridas-margins.princeton.edu/titles/155/ | Lettre à M. de Saint-Germain | Lettre à M. de Saint-Germain du 26 février 1770 |
https://derridas-margins.princeton.edu/titles/153/ | Lettre au prince de Würtemberg | Lettre au prince de Würtemberg du 10 novembre 1663 |
https://derridas-margins.princeton.edu/titles/124/ | Oeuvres complètes | Oeuvres complètes de Karl Abraham |
https://derridas-margins.princeton.edu/titles/158/ | Oeuvres Complètes, tome II | Oeuvres complètes de J.-J. Rousseau, vol. II |
https://derridas-margins.princeton.edu/titles/177/ | Œuvres complètes, tome I | Oeuvres complètes de J.-J. Rousseau, vol. I |
https://derridas-margins.princeton.edu/titles/230/ | Œuvres complètes, tome III | Oeuvres complètes de J.-J. Rousseau, vol. III |
https://derridas-margins.princeton.edu/titles/199/ | Œuvres complètes, vol. VI | Oeuvres complètes de Franz Kafka |
@kmcelwee so glad you are flagging all of these!
- Brackets on pages? Should I explain that?
Please remind me what this is! 😆
KM: Here are some examples of values in our "pages" column for annotations.csv 🥴
- p. 45
- [572]
- p.257
- Back flyleaf 1 verso
references.csv or de-la-grammatologie_references.csv?
Let's simplify and just call it references.
KM: ✅ added to todo list
Can we come up with a more meaningful name for the books? I think instances won't make sense outside our system.
KM: library.csv?
- Work URI field has one value https://findingaids.princeton.edu/collections/RBD1/c43 and it’s a finding aid. It contains the same value as that in the catalog_uri field. Drop this?
Yeah, drop it if it's redundant.
KM: ✅ added to todo list
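Before dropping the field, a quick redundancy check might look like the sketch below. The sample rows are made up for illustration (only the finding aid URL comes from this thread), and it assumes the export has columns named work_uri and catalog_uri.

```python
import csv
import io

# Hypothetical sample rows; only the finding aid URL is taken from this thread.
SAMPLE = """title,work_uri,catalog_uri
Book A,,
Book B,https://findingaids.princeton.edu/collections/RBD1/c43,https://findingaids.princeton.edu/collections/RBD1/c43
"""


def column_is_redundant(rows, column, other):
    """True if every non-empty value in `column` matches `other` on the same row."""
    return all(row[column] == row[other] for row in rows if row[column])


rows = list(csv.DictReader(io.StringIO(SAMPLE)))
distinct_work_uris = {row["work_uri"] for row in rows if row["work_uri"]}
redundant = column_is_redundant(rows, "work_uri", "catalog_uri")
print(f"{len(distinct_work_uris)} distinct work_uri value(s); redundant: {redundant}")
```

If the check holds on the real export, dropping work_uri loses no information.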
- Short titles aren't necessarily shorter (?!)
Weird! I wonder if we're generating them wrong. From the examples you gave, I think we either drop the field or relabel it so the field name is more accurate. But if all that information is included in other fields, let's drop it.
KM: I don't think the short_title field adds information. I'm going to add that to our todo list.
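For reference, here is a small sketch of the length comparison, using only the seven title / short title pairs from the table quoted at the top of this thread. The function name is just for illustration.

```python
# (title, short title) pairs copied from the table quoted in this thread.
ROWS = [
    ("Lettre à M. de Saint-Germain", "Lettre à M. de Saint-Germain du 26 février 1770"),
    ("Lettre au prince de Würtemberg", "Lettre au prince de Würtemberg du 10 novembre 1663"),
    ("Oeuvres complètes", "Oeuvres complètes de Karl Abraham"),
    ("Oeuvres Complètes, tome II", "Oeuvres complètes de J.-J. Rousseau, vol. II"),
    ("Œuvres complètes, tome I", "Oeuvres complètes de J.-J. Rousseau, vol. I"),
    ("Œuvres complètes, tome III", "Oeuvres complètes de J.-J. Rousseau, vol. III"),
    ("Œuvres complètes, vol. VI", "Oeuvres complètes de Franz Kafka"),
]


def longer_short_titles(rows):
    """Return the rows whose short title is longer than the full title."""
    return [(title, short) for title, short in rows if len(short) > len(title)]


offenders = longer_short_titles(ROWS)
print(f"{len(offenders)} of {len(ROWS)} short titles are longer than the title")
```

In every sampled row the short title is the longer of the two, which fits the idea of relabeling or dropping the field.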
Reference IDs aren't unique. Should they be?
Wow! This is interesting, and a good thing to catch. I think the team must have entered multiple copies of the same reference when they weren't sure which copy of a book to link it to. I don't think we should try to change that now, but just reflect the research that was done. It seems like the simplest thing would be to just allow it not to be unique and make sure we note this in the documentation. (Combination of reference id & title id should be unique, though, if it matters).
I see that the search results for these links return only a single reference. That probably means the indexing wasn't configured properly to handle multiple versions of the same reference, which is probably one of the reasons this never got caught. I think it's ok to live with that; let's just try to document clearly & briefly what the situation is.
KM: Added documentation to-do above ✅
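A sketch of how the uniqueness claim could be checked before it goes into the documentation. The column names reference_id and title_id and the sample values are placeholders, not the real references.csv schema.

```python
import csv
import io
from collections import Counter

# Hypothetical sample; column names and values are placeholders.
SAMPLE = """reference_id,title_id
r1,155
r1,153
r2,124
"""


def duplicates(rows, key_fields):
    """Return key tuples that occur more than once, mapped to their counts."""
    counts = Counter(tuple(row[field] for field in key_fields) for row in rows)
    return {key: n for key, n in counts.items() if n > 1}


rows = list(csv.DictReader(io.StringIO(SAMPLE)))
dup_ids = duplicates(rows, ["reference_id"])
dup_pairs = duplicates(rows, ["reference_id", "title_id"])
print("repeated reference ids:", dup_ids)
print("repeated (reference id, title id) pairs:", dup_pairs)
```

In this made-up sample the reference id repeats but the pair does not, which is the situation described above.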
Two books are in annotations but not in instances
This is surprising to me, because team members were only supposed to document annotations that related to references in de la Grammatologie. I guess we should revise the queryset filter on the instance export to include them — seems like the simplest solution to me, but I'm open to suggestions.
KM: Added to todo list ✅
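To find which books are affected, a set difference between the two exports is probably enough. This is a sketch with invented rows; the column names instance_uri and uri are assumptions and may not match the real files.

```python
import csv
import io

# Hypothetical samples; real column names in annotations.csv and
# instances.csv may differ.
ANNOTATIONS = """annotation_id,instance_uri
a1,https://example.org/instance/1
a2,https://example.org/instance/2
a3,https://example.org/instance/3
"""
INSTANCES = """uri,title
https://example.org/instance/1,First book
"""


def missing_instances(annotation_rows, instance_rows):
    """Return instance URIs that are annotated but absent from the instance export."""
    annotated = {row["instance_uri"] for row in annotation_rows}
    exported = {row["uri"] for row in instance_rows}
    return sorted(annotated - exported)


missing = missing_instances(
    csv.DictReader(io.StringIO(ANNOTATIONS)),
    csv.DictReader(io.StringIO(INSTANCES)),
)
print(missing)
```

Rerunning the same check after revising the queryset filter should give an empty list.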
@rlskoeser I edited my own responses in your comment, because I thought it would be more clear. Let me know if I should never do this again haha 😄
Wow, that is a pretty non-obvious way to reply! 😆 🤯 (I know threaded replies can get a bit much, but yeah, please don't do that in future 🙂 )
I like library.csv!
I'll have to look into the page numbers to be sure (and if it's not obvious and there isn't any project documentation about it, then maybe we won't be sure!). My guess is that brackets are to indicate the page number is not actually printed. Maybe make me an asana task?
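In the meantime, a rough sketch of how the "pages" values quoted earlier in this thread could be sorted into buckets. The category names are invented here, and the "bracketed" bucket does not assume any particular meaning for the brackets.

```python
import re

# Sample values quoted earlier in this thread.
SAMPLES = ["p. 45", "[572]", "p.257", "Back flyleaf 1 verso"]

BRACKETED = re.compile(r"^\[(\d+)\]$")  # e.g. "[572]"
PAGE = re.compile(r"^p\.\s*(\d+)$")     # e.g. "p. 45" or "p.257"


def classify_page(value):
    """Return a (category, parsed value) pair for one "pages" entry."""
    value = value.strip()
    match = BRACKETED.match(value)
    if match:
        return ("bracketed", int(match.group(1)))
    match = PAGE.match(value)
    if match:
        return ("page", int(match.group(1)))
    return ("other", value)


classified = [classify_page(v) for v in SAMPLES]
for original, result in zip(SAMPLES, classified):
    print(f"{original!r} -> {result}")
```

Counting the buckets across the full column would show how common the bracketed form is before anyone digs through project documentation.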
@rlskoeser I think I've taken care of almost everything. I have some questions inline, and I'm sure you'll have comments. Here are my questions / takeaways from this PR. Looking forward to your feedback when I return!
readme_info.py, which I copied and pasted. It was very useful, thank you! Happy to remove it though if it’s unnecessary.
Further back-and-forth edits will take place with PRDS, but w/r/t development, this is done.
requirements.txt
datapackage.json. For each appropriate field...
Data cleaning
references.csv
instances.csv to library.csv
work_uri field
Notes
cd data; frictionless describe *.csv > datapackage.yaml
(frictionless data observes local paths when running validate