Project models

haggis78 commented 4 years ago

Today after class, we (Amber, Connor and I) discussed the Emily Dickinson project as an example of how a text-comparison project might look.

I came across a different model this evening at https://www.dhi.ac.uk/partonopeus/usersguide.htm#collate . It keeps each section of text intact, though I don't think it highlights textual differences, as we would want to do.

A third model is the Piers Plowman Electronic Archive, which conveys the different manuscript readings of an important late-medieval English poem (see, for instance, http://piers.chass.ncsu.edu/texts/Hm/1?view=all). This highlights readings that are unique to each manuscript, and even links each word to its entry in the Middle English Dictionary, but it offers no option (that I can find) for side-by-side comparison.

One difference between all of these models and ours is that these are all poetic texts, which makes them easier to compare since they have numbered lines, while our text is prose with no such clear breaks. (There are clauses that we could artificially number, but the nature of this document is that the thing is pretty much one long sentence!)

Anyway, if you come across any other models that you think we should consider, even if we might only want to adopt some of their features, please post links as replies to this issue. I'll add any more I come across, too.

ebeshero commented 4 years ago

@haggis78 Much depends on how significant you see the comparisons to be to your research question. For the Emily Dickinson project, comparison was central to what the team wanted to track: how did these poems change in their publication history, and how was Dickinson herself building options into her drafting?

Another kind of edition might want to document minor differences without making the reading of line-by-line difference so central to the interface. If your documents represent very different stages in a long process, maybe what matters is articulating where they are similar or connected, since that is what you need to help people see.

With medieval work (different from the text scholarship of 19c materials that I'm more used to), you're often studying fragmentary materials that descend from a lost original, and I see from your project proposal that something of this is the case.

Here are some questions to help us take stock of the source materials:

Are they all pretty much complete documents? Or are some of them fragments, with significant chunks missing?
Can you make a sort of preliminary survey--running your eyes over a sampling of the witnesses--to get a sense of what kinds of differences you see?

Your proposal makes it pretty clear you're trying to reconstruct (or is that construct?) as best you can an authoritative text to document this school's origins. I think one of the more helpful discussions of this process comes from Hugh Cayless's work on the Digital Latin Library. Here's a little light reading on the issues of textual comparison in digital format: https://digitallatin.org/library-digital-latin-texts/textual-criticism

The Digital Latin Library editions are exciting (to me anyway), because they basically give the reader the tools to remix the variants--to swap out what appears in the margins or inline. That's basically a dynamic way of making an eclectic edition in which you don't just choose one source to display as the "best" text, but rather allow micro-decisions to guide the process of constructing the best text out of the available witnesses. Take a look at how it works here: https://ldlt.digitallatin.org/library/texts/urn:cts:latinLit:phi0830.phi001.dll_1/poem6

Click on one of the variant bubbles in the margin, and then click on the text of that variant to choose it so it goes inline. Pretty cool!

How does it work? JavaScript is the simple answer, but really it's built on a foundation of TEI critical apparatus markup. We can work on designing an interface that helps show/hide as well as move variants on the web interface for the edition you build out of XML. Hugh (@hcayless) is on the TEI Technical Council and used to be our chair--I think he is coming to Pittsburgh next summer to teach in a Digital Editions as Interfaces institute that @djbpitt is organizing. And he's certainly someone we can reach out to for advice as your project develops--though we probably don't want to pester him too much. (I've just pinged him here.)
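To make "critical apparatus markup" a bit more concrete, here is a minimal sketch in Python that builds one TEI-style apparatus entry: an `<app>` holding a `<lem>` (the reading chosen for the main text) and one `<rdg>` per variant, each pointing at its witnesses through a `wit` attribute. The witness sigla (`#P1719`, `#F1541`) and the readings are invented for illustration and are not taken from your documents, and the helper `make_app` is a hypothetical name. A real TEI file would also put these elements in the TEI namespace, which this sketch leaves out.

```python
import xml.etree.ElementTree as ET

def make_app(lemma, lemma_wits, variants):
    """Build one <app> entry: a <lem> plus one <rdg> per variant reading.

    lemma      -- text of the reading shown inline
    lemma_wits -- list of witness sigla that carry the lemma
    variants   -- list of (reading_text, [witness sigla]) pairs
    """
    app = ET.Element("app")
    lem = ET.SubElement(app, "lem", wit=" ".join(lemma_wits))
    lem.text = lemma
    for reading, wits in variants:
        rdg = ET.SubElement(app, "rdg", wit=" ".join(wits))
        rdg.text = reading
    return app

# Hypothetical sigla and readings, made up for this example only.
app = make_app("college", ["#P1719"], [("colledge", ["#F1541"])])
print(ET.tostring(app, encoding="unicode"))
# prints <app><lem wit="#P1719">college</lem><rdg wit="#F1541">colledge</rdg></app>
```

An interface script can then treat each `<app>` as a unit: show the `<lem>` inline by default and swap in a `<rdg>` when the reader picks it.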

Also, here's a little light reading on stemma work that I found as I was cobbling together this response to your post! https://journal.digitalmedievalist.org/articles/10.16995/dm.51/

ebeshero commented 4 years ago

I realize just now that what I sent you was, of course, another poem. It's so convenient that poems get semantic lines! Probably you are going to need a strategy to break your text into meaningful units. I tend to like clause-by-clause solutions, but you might just want to encode according to lines on the page. The problem with that, of course, is that you have multiple versions of the document, and the comparable parts aren't going to be complete lines. Just thinking out loud here.
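As a rough illustration of the clause-by-clause idea, here is a small Python sketch that splits two versions of a prose passage at punctuation and then compares each pair of units word by word with the standard library's `difflib`. The sample sentences are invented, not drawn from your witnesses, and splitting at punctuation is only a crude stand-in for real clause boundaries. The function names `split_clauses` and `compare_units` are hypothetical. Note that the loop pairs units by position, so it assumes both versions break into the same number of units, which real witnesses often won't.

```python
import re
from difflib import SequenceMatcher

def split_clauses(text):
    """Split prose after commas, semicolons, colons, and full stops."""
    parts = re.split(r"(?<=[,;:.])\s+", text.strip())
    return [p for p in parts if p]

def compare_units(unit_a, unit_b):
    """Return the word-level differences between two comparable units.

    Each difference is (operation, words_from_a, words_from_b), where the
    operation is 'replace', 'delete', or 'insert'. Matching runs are skipped.
    """
    a, b = unit_a.split(), unit_b.split()
    matcher = SequenceMatcher(None, a, b, autojunk=False)
    return [(op, a[i1:i2], b[j1:j2])
            for op, i1, i2, j1, j2 in matcher.get_opcodes()
            if op != "equal"]

# Invented sample text standing in for two witnesses.
witness_a = "We found the school, and we endow it with lands; witness our seal."
witness_b = "We found the schoole, and we endow it with landes; witness our seal."

pairs = zip(split_clauses(witness_a), split_clauses(witness_b))
for n, (unit_a, unit_b) in enumerate(pairs, start=1):
    print(n, compare_units(unit_a, unit_b))
```

Numbering the units this way gives prose the same kind of reference points that poems get from their lines, even if the unit boundaries are somewhat artificial.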

haggis78 commented 4 years ago

Thanks for those resources! We're in good shape in one respect: none of the copies we're dealing with is damaged or fragmentary. In fact, three of them are printed. I'm handling the oldest and most difficult, which is a file copy of the original, dating to 1541. All the other copies, both print and MS, are eighteenth- or nineteenth-century.

You are correct that there is a twofold purpose here: to establish the text of the lost original, and in the process to construct a stemma of the surviving copies. As for the first task, the early file copy isn't necessarily the best, since it was made by a scribe who abbreviated certain 'boilerplate' parts. The second task, the stemma, is itself intended as a sampling of a larger work: all the MS copies (except that early file copy) appear as an appendix to copies of the statutes of St David's Cathedral. My hope is that if we can establish a reliable stemma for the appendix we're studying, then we will at least have a hypothetical stemma for the whole of the Statutes to which it is appended.

Complicating factors include a) for all I know, the manuscripts derive not directly from the lost original letter but either from the first printed copy (1719) or from a transcript that was used in the preparation of that printing; and b) all of the manuscripts except one stayed in the possession either of St David's Cathedral or of individual clergy of the Cathedral, so there was cross-checking not only for subsequent corrections, but possibly also during copying (i.e., one copyist may have had 2 or 3 exemplars open in front of him while he worked).

ebeshero commented 4 years ago

Here's a code block in markdown:

```
start = root
root = element root { stuff+ }
stuff = element stuff { text }
```

haggis78 / BreconChurch

Project models #2