Open wlpotter opened 2 years ago
The first step is to get the list of titles and their structures. (Note that normalization is probably a prerequisite for #89.)
Sheet with titles, their structures, and some data about attribute and child element usage: https://docs.google.com/spreadsheets/d/1e8AWvx-2drh9o7dmYG6eyfw18S95_40--kjiFN71CSA/edit?usp=sharing
@type="supplied" as redundant (it is all assumed to be from Wright)
@resp for ones we've named
@resp already @resp)
persName and placeName elements should remain; if the URI on a persName matches that on an author element (wait on #91), mark as 'attributed author'.
For titles nested within titles (tagged either as tei:title or with a tei:ref, and maybe in other ways), pull these out once the msItem ids are stable (wait on #58). We will have a list with: ms and ms part URI; msItem title; msItem xml:id; XPath to the item; the item title's text node; the text node of the child title; URI of the overall work; URI of the child work; author info and other useful identifying info such as rubric, incipit, etc. Store this in a CSV and reference it when creating and updating work authority records from the ms data.
Also want to normalize the element used to tag these child titles.
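A minimal sketch of the nested-title extraction described above, using only the Python standard library. It covers a subset of the planned columns (no XPath, author, rubric, or incipit), and several things are assumptions rather than facts about our data: that the work URI sits in @ref on tei:title and in @target on tei:ref, the column names, and the example.org URIs in the sample.

```python
import csv
import io
import xml.etree.ElementTree as ET

TEI_NS = "http://www.tei-c.org/ns/1.0"
XML_ID = "{http://www.w3.org/XML/1998/namespace}id"
NS = {"tei": TEI_NS}

# Hypothetical column names; adjust to whatever the real sheet uses.
FIELDS = ["ms_uri", "msItem_id", "item_title", "child_element",
          "child_title", "work_uri", "child_work_uri"]


def text_of(elem):
    """Collapse all text inside an element into a single-spaced string."""
    return " ".join("".join(elem.itertext()).split())


def nested_title_rows(root, ms_uri):
    """Yield one row per tei:title or tei:ref nested inside an msItem title."""
    wanted = (f"{{{TEI_NS}}}title", f"{{{TEI_NS}}}ref")
    for item in root.iter(f"{{{TEI_NS}}}msItem"):
        # Only titles that are direct children of this msItem.
        for title in item.findall("tei:title", NS):
            for child in title.iter():
                if child is title or child.tag not in wanted:
                    continue
                yield {
                    "ms_uri": ms_uri,
                    "msItem_id": item.get(XML_ID, ""),
                    "item_title": text_of(title),
                    "child_element": child.tag.split("}")[1],
                    "child_title": text_of(child),
                    # Assumption: URIs live in @ref (title) or @target (ref).
                    "work_uri": title.get("ref", ""),
                    "child_work_uri": child.get("ref") or child.get("target") or "",
                }


def write_rows(rows, out):
    """Write the rows as CSV with a header line."""
    writer = csv.DictWriter(out, fieldnames=FIELDS)
    writer.writeheader()
    writer.writerows(rows)


# Invented sample record, for illustration only.
SAMPLE = """<TEI xmlns="http://www.tei-c.org/ns/1.0">
  <msItem xml:id="a1">
    <title ref="http://example.org/work/1">Commentary on <title ref="http://example.org/work/2">Genesis</title></title>
  </msItem>
</TEI>"""

if __name__ == "__main__":
    buffer = io.StringIO()
    write_rows(nested_title_rows(ET.fromstring(SAMPLE), "http://example.org/ms/1"), buffer)
    print(buffer.getvalue())
```

The child_element column records whether the child was tagged as title or ref, which may help when deciding which element to normalize to.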
Note to self: leaving this in the backlog as it is low priority right now. But it should be split into several subtasks once we're ready to work on it. For example, #54 is one such subtask.
There are several different formats of titles and a lot of variance regarding which elements and attributes are allowed and required on title elements. To sort that out, we need a sheet listing title elements and their structures.
It would also be worth collecting together the issues related to titles and their data format(s). This will help us formulate and implement rules regarding unspecified contents, unnamed sub-parts of named works, etc.
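One way the survey of title structures could be gathered is sketched below. It tallies, for each tei:title, which attributes and which child elements it carries, so that each distinct combination could become a row in the sheet. The function names and the sample record are hypothetical, and this is not necessarily how the linked sheet was produced.

```python
from collections import Counter
import xml.etree.ElementTree as ET

TEI_NS = "http://www.tei-c.org/ns/1.0"


def local_name(name):
    """Strip a '{namespace}' prefix from an element or attribute name."""
    return name.rsplit("}", 1)[-1]


def title_signature(title):
    """Return (sorted attribute names, sorted distinct child element names)."""
    attrs = tuple(sorted(local_name(a) for a in title.attrib))
    children = tuple(sorted({local_name(c.tag) for c in title}))
    return attrs, children


def survey_titles(root):
    """Count how often each attribute/child combination occurs on tei:title."""
    return Counter(title_signature(t) for t in root.iter(f"{{{TEI_NS}}}title"))


# Invented sample record, for illustration only.
SAMPLE_TITLES = """<TEI xmlns="http://www.tei-c.org/ns/1.0">
  <msItem><title type="supplied" resp="#x">Homily by <persName ref="http://example.org/person/1">Jacob</persName></title></msItem>
  <msItem><title xml:lang="en">Psalms</title></msItem>
  <msItem><title xml:lang="en">Gospels</title></msItem>
</TEI>"""

if __name__ == "__main__":
    for (attrs, children), count in survey_titles(ET.fromstring(SAMPLE_TITLES)).most_common():
        print(count, attrs, children)
```

Running the same tally before and after normalization would give a quick check on whether the variance in title structures has actually gone down.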