Closed RJP43 closed 8 years ago
Hey, Nelson team! I've been working with Becca on learning some new TEI coding, to help convert the contents of a Really Badly Formatted Table from the 19th-century McEnnis text into Feature Structure encoding that is SO much easier to read and process. This coding will be useful for other tables or charts you run across in the McEnnis and/or other texts. I've just plotted out an extremely simple TEI header that your team should build on. Have a look at the new TEI file here. When you sync your CitySlaveGirls GitHub, you'll find this as WSGATableCh1.xml inside the McEnnisWhiteSlaveGirlsOfAmerica_XML directory. Go take a look!
@RJP43 @KariWomack @CodyKarch @rCarls @spadafour @ghbondar
@CodyKarch will be working with the structuring elements/attributes
@spadafour will be working with the conversation and dialogue elements/attributes
@rCarls Could you work on finding TEI tags for referencing specific information?
For this week's 10 pts, each of you will need to comment on this issue with the TEI elements you feel best replace the specific category of elements you have been assigned. Please include the elements' names and attribute pairing(s), as well as the chapter of the TEI Guidelines where you found these elements. If you encounter problems, comment in this issue so the instructors can take note of your efforts in completing this task.
Thanks @CodyKarch , @spadafour , @rCarls !
I think the best way to handle conversion of conversation would be to wrap the element in <sp> elements, with the speaker as an attribute within.
Ex:
<sp who="mascVoice">Manly words being derogatory toward women.</sp>
<sp> is a speech element. I think it'd work well for mascVoice and femVoice, especially since it holds the who attribute.
http://www.tei-c.org/release/doc/tei-p5-doc/en/html/DR.html
However, the nellVoice element currently in there will have to be reworked; sometimes it represents dialogue, and other times it represents connotations (noted so with an attribute).
Connotation only:
Sometimes I can be <nellVoice connotation="sarcasmWit"> so punny.</nellVoice>
Quote and connotation:
<nellVoice connotation="sarcasmWit">Yes, mascVoice, I am being snippy</nellVoice>
This is made up, but you get the idea. Because this one tag is being used in two separate situations, there is no way to tell the difference through XPath or Regex; it's entirely mixed up with actual dialogue and needs to be fixed manually.
Speaking of connotations...
Many of the dialogues (at least nellVoice) contain connotations as attributes. We should separate those out into their own elements (wrapped inside the <sp> element if being pulled out of a quote). We can use the <interp> element and type attribute, which is just a way of inserting an annotator's interpretation of a given bit of text.
http://www.tei-c.org/release/doc/tei-p5-doc/en/html/ref-interp.html
So a quote would NOW read:
<sp who="femVoice"><interp type="sarcasmWit">I'm being sarcastic!</interp></sp>
@spadafour Fantastic! To indicate a conversation and grab all of the <sp> elements of one conversation together, perhaps we can use <spGrp>, found here in the TEI.
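To make the grouping idea concrete, here is a minimal sketch of one conversation, assuming the <sp> encoding proposed earlier in this thread. The `#` prefixes on the who values assume those names are declared as xml:ids somewhere else in the file, and the <p> wrappers are there because <sp> does not take bare text directly; both are illustrative choices, not something we have settled on.

```xml
<spGrp>
  <!-- who values are assumed to point at xml:ids declared elsewhere -->
  <sp who="#mascVoice">
    <p>Manly words being derogatory toward women.</p>
  </sp>
  <sp who="#femVoice">
    <p>I'm being sarcastic!</p>
  </sp>
</spGrp>
```

With something like this in place, one XPath step to the <spGrp> would collect every speech in that conversation.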
I like the <interp> idea too. We can add the @respons attribute so that each editor has an @xml:id and each interpretation can be linked to an interpreter, found here in the TEI.
@spadafour and @ebeshero what do you think of using <purpose>, found here in the TEI, for the snippets of Nelson text that are outside of conversation? Alternatively, we could just use <interp> around that text too, only not inside of <sp>, and identify it as Nelson using an @ref with an xml:id associated with Nelson.
@ref can be found here in the TEI.
So, using @spadafour's example, it would turn into this:
Sometimes I can be <interp respons="#(id for editor)" ref="#nNelson" type="sarcasmWit"> so punny.</interp>
I mention <purpose> for two reasons:
@type, with the values they list: persuade (didactic, advertising, propaganda, etc.), express (self expression, confessional, etc.), inform (convey information, educate, etc.), entertain (amuse, entertain, etc.)
@degree, with which perhaps we can specify the extent to which this purpose predominates
<purpose> can still hold @ref and @respons
Well done, Rob! @spadafour
@rCarls and @CodyKarch this is the kind of back and forth that it is critical we have for each element or element set we are replacing with TEI.
Some good news, it appears that all of the elements surrounding names in the currently coded articles fit the TEI: the tags are
Same deal with places,
Everything wrapped in a
On to numbers,
Here is the bundle of what I am thinking for structure:
In the articles, we can switch these things:
<newspaperTitle> becomes ( I'm not sure what )
<date> becomes <dateline>
<num> becomes ( I'm not sure what )
<seriesTitle> becomes <byline>
<headLine> becomes mostly <argument> or rarely <quote>
<articleBody> becomes <body>
Here is my source for these thoughts: http://www.tei-c.org/release/doc/tei-p5-doc/en/html/DS.html#DSDTB
In the books, we can switch these things:
Wrap the whole chapter in <seg>,
place the chapter number under <num>,
place the chapter title under <byline>,
and any following paragraphs under <p>
As seen above, I'm not 100% sure about this, but I think this could form a meaningful structure.
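For comparison, here is a rough sketch of what one book chapter could look like as markup. Note that it does not follow the list above exactly: it swaps in <div type="chapter"> for <seg> and <head> for <byline>, since those are the elements the default text structure chapter of the Guidelines describes for text divisions and their headings. That swap is an assumption, not something agreed in this thread, and the chapter number, title, and paragraph text are placeholders.

```xml
<div type="chapter" n="1">
  <!-- placeholder heading: the chapter number sits inside the head -->
  <head><num>Chapter I.</num> Title of the chapter</head>
  <p>First paragraph of the chapter.</p>
  <p>Second paragraph of the chapter.</p>
</div>
```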
Ok, all of our source files are officially moved into TEI, meaning all of the XML files have TEI headers and the TEI superstructure, and some of the most important contextual elements have been transferred to TEI as well. We still need to go back through all of the files and do additional markup, including versioning and applying the new TEI tags for dialogue and voice.
@RJP43 I've made some corrections and revisions to simplify your TEI Header a bit. Make sure these make sense.
@ebeshero thank you. I have reviewed your corrections and they all make sense. I have edited the document accordingly so as to remove the extra comments and simplify the header even further so students can easily read through it and understand more clearly where they are expected to edit it. I will keep you posted as I get the other headers revised for the McEnnis Text and the Barkley Text
<said @who @ana>
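A minimal sketch of how <said> with @who and @ana might replace the old nellVoice tagging, reusing the made-up example from earlier in this thread. The ids nNelson and sarcasmWit, the type value on <interpGrp>, and where the <interpGrp> is placed are all assumptions for illustration, not decisions recorded here.

```xml
<!-- Declared once somewhere in the document (exact placement is an assumption) -->
<interpGrp type="connotation">
  <interp xml:id="sarcasmWit">sarcasm or wit</interp>
</interpGrp>

<!-- In the running text: who and ana both point at xml:ids -->
<p>Sometimes I can be <said who="#nNelson" ana="#sarcasmWit">so punny.</said></p>
```

Keeping the interpretation in @ana leaves @who free to identify the speaker, so dialogue and connotation no longer compete for the same tag.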
So, as we have discussed, the Nell Nelson project currently uses a customized RelaxNG schema of approx. 40 different elements and attributes, and we need to move forward with converting these tags into TEI tags.
Three Steps to Convert to TEI:
These are the main components of our files that we need to make sure we represent with our new TEI tags:
To begin:
After we have decided on our tags, I will write up a codebook, and we can begin converting the few documents we have marked up into TEI and possibly begin the markup of some of the other transcribed and yet-to-be-transcribed files.
@KariWomack @spadafour @CodyKarch @rCarls