open-editions / corpus-joyce-ulysses-tei

James Joyce's novel Ulysses in TEI XML. Work-in-progress.

Adding first cross references. #44

charlesreid1 closed this 7 years ago

charlesreid1 commented 7 years ago

This adds a cross-references.xml document, several tags, and the very first example of a cross reference: "agenbite of inwit".

This follows the idea in #39, which adds <ref> tags in the text and groups them using a <link> tag in cross-references.xml, combined with the suggestion in #42 to enumerate the <ref> tags using the nearest <lb> tag.

Some additional "header"/padding tags may need to go in cross-references.xml, but for now it's just a sequence of <link> tags wrapped in a <TEI> tag.

This also demonstrates the addition of a <note> (with target attribute) to expand on and explain the cross-reference.
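
For readers who want to see how these pieces could fit together, here is a rough Python sketch, not the repository's actual tooling. It assumes each <ref> carries an xml:id derived from the nearest preceding <lb>, and that each <link> lists its members in a space-separated target attribute. The id scheme (`ref-<lb n>-<count>`), the `n` values, the sample sentences, and the function names are all made up for illustration.

```python
import xml.etree.ElementTree as ET

# ElementTree exposes xml:id under the predefined XML namespace.
XML_ID = "{http://www.w3.org/XML/1998/namespace}id"

# Hypothetical episode fragment: the n values and wording are placeholders.
EPISODE = """<div>
  <lb n="0100"/>text before <ref xml:id="ref-0100-1">agenbite of inwit</ref> text after
  <lb n="0200"/>later on, <ref xml:id="ref-0200-1">Agenbite of inwit</ref> again
</div>"""

# Hypothetical cross-references.xml: <link> tags wrapped in a <TEI> tag,
# plus a <note> whose target points at one of the refs.
CROSS_REFERENCES = """<TEI>
  <link target="#ref-0100-1 #ref-0200-1"/>
  <note target="#ref-0100-1">Explanatory note text goes here.</note>
</TEI>"""


def ids_from_nearest_lb(episode_xml):
    """Derive ref ids from the closest preceding <lb n="...">, in document order."""
    last_lb, counts, ids = None, {}, []
    for el in ET.fromstring(episode_xml).iter():
        if el.tag == "lb":
            last_lb = el.get("n")
        elif el.tag == "ref":
            counts[last_lb] = counts.get(last_lb, 0) + 1
            ids.append(f"ref-{last_lb}-{counts[last_lb]}")
    return ids


def refs_by_id(episode_xml):
    """Map each ref's xml:id to its text."""
    return {el.get(XML_ID): el.text for el in ET.fromstring(episode_xml).iter("ref")}


def resolve_links(cross_xml, refs):
    """Turn every <link> into a list of (id, text) pairs for the refs it groups."""
    groups = []
    for link in ET.fromstring(cross_xml).iter("link"):
        ids = [target.lstrip("#") for target in link.get("target", "").split()]
        groups.append([(ref_id, refs.get(ref_id)) for ref_id in ids])
    return groups


refs = refs_by_id(EPISODE)
groups = resolve_links(CROSS_REFERENCES, refs)
```

A check like `ids_from_nearest_lb(EPISODE) == list(refs)` could then flag any <ref> whose id has drifted away from its line.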

yellwork commented 7 years ago

Wonderful! Going to get cracking myself too. I wonder if there’s any way to mass import the findings in William M. Schutte’s Index of Recurrent Elements in James Joyce’s Ulysses (1982)? I’ll track down a copy and take a look.

JonathanReeve commented 7 years ago

Great idea. If you can't find a copy, let me know, and I'll see if I can scan the copy we have in our library.

I've also been thinking about ways of computationally identifying recurring phrases. The best I can come up with is a formula that adds together the sum of the probabilities of each word, the number of words, and the number of times the n-gram repeats.
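
One possible reading of that formula, as a minimal Python sketch. One thing is swapped in on purpose: instead of raw word probability it uses surprisal (negative log probability), on the assumption that rarer words should make a phrase more distinctive. The equal weighting of the three terms, the tokenizer, the n-gram range, and the sample sentence are all illustrative guesses, and nothing here is tuned against the actual text.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into rough word tokens."""
    return re.findall(r"[a-z']+", text.lower())


def ngrams(tokens, n):
    """Return every run of n consecutive tokens as a tuple."""
    return [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]


def score_recurring_ngrams(text, n_min=2, n_max=4):
    """Rank n-grams that occur at least twice.

    score = summed word surprisal + number of words + number of repeats
    """
    tokens = tokenize(text)
    total = len(tokens)
    word_counts = Counter(tokens)
    scores = {}
    for n in range(n_min, n_max + 1):
        for gram, repeats in Counter(ngrams(tokens, n)).items():
            if repeats < 2:
                continue  # only recurring phrases are of interest
            surprisal = sum(-math.log(word_counts[w] / total) for w in gram)
            scores[gram] = surprisal + n + repeats
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)


# Invented sample sentence, not a quotation from the novel.
SAMPLE = "He felt the agenbite of inwit. She spoke and the agenbite of inwit returned."
ranked = score_recurring_ngrams(SAMPLE)
```

On real text the three terms sit on very different scales, so they would probably need weights, and overlapping n-grams (a phrase and its sub-phrases) would need to be collapsed before anything is tagged.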

yellwork commented 7 years ago

I just checked and we have a copy in my library – though thanks for offering.

I can’t remember what edition Schutte cues his elements to (the JJQ episode-by-episode version used the 1934 and 1961 editions), but I bet there’s a way to grab all his citations and computationally transform them into 1984 episode.line numbers. Maybe that could form the basis for more speedily getting his recurrences into our data as <ref> and <link> tags? But let me get the book and take a look at it first; see just what he takes note of.