RJP43 / CitySlaveGirls

The Restoration of Nell Nelson
http://nelson.newtfire.org

The tagging saga continues ... Version Control #49

Closed RJP43 closed 8 years ago

RJP43 commented 8 years ago

@ebeshero @spadafour When we mark up the versioning between texts, I am trying to figure out whether it makes more sense to have elements like <placeName>, <persName>, and <orgName> surround the <app><rdg> setup, or to have those elements sit inside each of the different <rdg> elements.

Easier to explain with an example:
Here is the text as it originally appears in the articles:
the labyrinth that is known as H. Schultz &amp; Co.'s paper-box manufactory, 34 to 38 East Randolph street.

Here is the text as it originally appears in the Barkley text:
the labyrinth that is known as H.S. &amp; Co.'s paper-box manufactory, on R--- street.

Option 1: the labyrinth that is known as <orgName ref="#HSC" type="exposedCompany"><app><rdg wit="#CT021">H. Schultz</rdg><rdg wit="#WSGC23">H.S.</rdg></app> &amp; Co.'s paper-box manufactory</orgName>, <placeName ref="#HSC" type="address"><app><rdg wit="#CT021">34 to 38 East Randolph</rdg><rdg wit="#WSGC23">on R---</rdg><!--rjp: em dash???--></app> street</placeName>
I like this way because we can see exactly which parts are kept versus changed, and the elements are not repeated, so we can still count all the companies by <orgName> without the count being distorted.

Option 2: the labyrinth that is known as <app><rdg wit="#CT021"><orgName ref="#HSC" type="exposedCompany">H. Schultz &amp; Co.'s paper-box manufactory</orgName></rdg><rdg wit="#WSGC23"><orgName ref="#HSC" type="exposedCompany">H.S. &amp; Co.'s paper-box manufactory</orgName></rdg></app>, <app><rdg wit="#CT021"><placeName ref="#HSC" type="address">34 to 38 East Randolph street</placeName></rdg><rdg wit="#WSGC23">on <placeName ref="#HSC" type="address">R--- street</placeName></rdg></app>
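To make the counting concern concrete, here is a rough sketch (not project code) that counts <orgName> elements in simplified fragments of both options. It assumes the fragments are wrapped in a <p> so they parse on their own and that no TEI namespace is declared; a real TEI file would need namespace-aware paths. The function names are made up for illustration.

```python
import xml.etree.ElementTree as ET

# Simplified fragments of the two options, wrapped in <p> so each parses alone.
OPTION_1 = (
    '<p>known as <orgName ref="#HSC" type="exposedCompany">'
    '<app><rdg wit="#CT021">H. Schultz</rdg><rdg wit="#WSGC23">H.S.</rdg></app>'
    " &amp; Co.'s paper-box manufactory</orgName></p>"
)
OPTION_2 = (
    "<p>known as <app>"
    '<rdg wit="#CT021"><orgName ref="#HSC" type="exposedCompany">'
    "H. Schultz &amp; Co.'s paper-box manufactory</orgName></rdg>"
    '<rdg wit="#WSGC23"><orgName ref="#HSC" type="exposedCompany">'
    "H.S. &amp; Co.'s paper-box manufactory</orgName></rdg>"
    "</app></p>"
)


def count_org_names(xml_text):
    """Count every <orgName> element, however deeply nested."""
    root = ET.fromstring(xml_text)
    return len(root.findall(".//orgName"))


def count_distinct_refs(xml_text):
    """Count distinct @ref values on <orgName>, ignoring repeats."""
    root = ET.fromstring(xml_text)
    return len({el.get("ref") for el in root.iter("orgName")})


print(count_org_names(OPTION_1), count_org_names(OPTION_2))
print(count_distinct_refs(OPTION_1), count_distinct_refs(OPTION_2))
```

With Option 2 the raw element count doubles for every variant, so a company tally would have to deduplicate on @ref (assuming every <orgName> carries one), whereas Option 1 can be counted directly.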

I am leaning towards Option 1, but I wanted opinions?

I am tagging the Dickinson team here as well because they are used to this versioning setup. @nlottig94 @brookestewart

RJP43 commented 8 years ago

Also, we had briefly talked about whether we want to use versioning to mark variations in the "appearance" of the text. For example, the original articles may have a comma somewhere that doesn't appear in the Barkley text. Or, when the title of the newspaper is referenced, the original articles frequently set it in all caps, whereas in the Barkley text just Times is italicized. I imagine the markup for these two situations would look like this:

Example 1: <app><rdg wit="#CT021">,</rdg><rdg wit="#WSGC23"></rdg></app>

Example 2: <app><rdg wit="#CT021"><name rend="case(allcaps)" ref="#CT">THE TIMES</name></rdg><rdg wit="#WSGC23">the <name rend="italic" ref="#CT">Times</name></rdg></app>

However, we (@spadafour and I) couldn't really see the purpose behind doing this. It is more obvious why these kinds of things should be kept for poetry, but we couldn't come up with any strong reason to keep them for newspaper articles. I think we are going to keep just the representation found in the original articles, but I wanted to see if anyone had a reason we should consider marking those types of differences between versions. If we do decide to mark up this kind of versioning, I would argue that the <name> elements that identify references to the newspaper would have to sit inside the <app><rdg> setup, because the @rend attributes that determine how the text is rendered vary between versions.

@ebeshero @ghbondar @nlottig94 @brookestewart @djbpitt

ebeshero commented 8 years ago

@RJP43 These are good questions. You want to establish a clear methodology for what you are going to mark and what you are going to ignore silently as not relevant. You can create an editorial statement that will sit in your TEI header, in the <encodingDesc> that will explain openly what you are ignoring and why you and your team are ignoring it. You can say that the use of blockcaps doesn't matter semantically and that these are not marked in your comparison of versions. Here's Chapter 2 of the TEI Guidelines on the TEI header, which you'll want to be thinking about developing more.
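One way to test a policy like this before writing it into the header is a small comparison helper. The sketch below is only an illustration under an assumed policy (ignore letter case, ASCII punctuation, and extra whitespace); the function names are hypothetical, and the team's actual rules may differ.

```python
import string

# Assumed policy: case, ASCII punctuation, and spacing are "appearance" only.
_STRIP_PUNCT = str.maketrans("", "", string.punctuation)


def normalize(reading):
    """Reduce a reading to the form used for comparison under the assumed policy."""
    without_punct = reading.translate(_STRIP_PUNCT)
    return " ".join(without_punct.casefold().split())


def is_substantive_variant(reading_a, reading_b):
    """True when two readings still differ after appearance is ignored."""
    return normalize(reading_a) != normalize(reading_b)


print(is_substantive_variant("THE TIMES", "the Times"))
print(is_substantive_variant("H. Schultz", "H.S."))
```

Under this assumed policy the all-caps title would not need an <app>, while the shortened company name still would. Note that stripping punctuation also hides differences such as an added or dropped comma, which is exactly the kind of choice the editorial statement should spell out.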

I think it's fine to wrap the <app> elements in your persName, placeName, orgName (etc.) elements. Keep the coding as simple as you can, and keep in view the purpose of your versioning.
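As a quick check that this wrapping still lets you pull each version back out, here is a minimal sketch that rebuilds one witness's text from Option-1-style markup. It assumes no TEI namespace, a <p> wrapper added so the fragment parses, and that every <rdg> has a @wit; text_for_witness is a made-up helper name, not part of any TEI tool.

```python
import xml.etree.ElementTree as ET

# Option-1-style fragment: name elements wrap the <app>/<rdg> pairs.
SAMPLE = (
    '<p>known as <orgName ref="#HSC" type="exposedCompany">'
    '<app><rdg wit="#CT021">H. Schultz</rdg><rdg wit="#WSGC23">H.S.</rdg></app>'
    " &amp; Co.'s paper-box manufactory</orgName>, "
    '<placeName ref="#HSC" type="address">'
    '<app><rdg wit="#CT021">34 to 38 East Randolph</rdg>'
    '<rdg wit="#WSGC23">on R---</rdg></app> street</placeName>.</p>'
)


def text_for_witness(elem, wit):
    """Return the text of elem, keeping only the readings attested by wit."""
    parts = [elem.text or ""]
    for child in elem:
        if child.tag == "app":
            # Inside <app>, keep only readings whose @wit lists this witness.
            for rdg in child.findall("rdg"):
                if wit in (rdg.get("wit") or "").split():
                    parts.append(text_for_witness(rdg, wit))
        else:
            parts.append(text_for_witness(child, wit))
        # Text following the child element belongs to every witness.
        parts.append(child.tail or "")
    return "".join(parts)


root = ET.fromstring(SAMPLE)
print(text_for_witness(root, "#CT021"))
print(text_for_witness(root, "#WSGC23"))
```

The same walk works if the name elements sit inside the <rdg> elements instead, so the choice between the two options can rest on counting and readability rather than on whether the versions can be recovered.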

RJP43 commented 8 years ago

We have decided on the markup for version control. Please refer to our codebook and/or our first version-controlled article, 8-19.