RJP43 / CitySlaveGirls

The Restoration of Nell Nelson
http://nelson.newtfire.org

Moving into TEI #6

Closed RJP43 closed 8 years ago

RJP43 commented 8 years ago

As we have discussed, the Nell Nelson project currently uses a customized RelaxNG schema of approximately 40 different elements and attributes, and we need to move forward with converting these tags into TEI tags.

Three Steps to Convert to TEI:

  1. Choosing TEI elements/attributes that correspond well with the tags we already have in place
  2. Choosing other aspects of the project we might be interested in coding and choosing TEI tags that will best represent this new research
  3. Using regular expressions to find the old tags and then replacing them with the new TEI tags.
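As a rough illustration of step 3, here is a minimal Python sketch of a regex-based tag rename. The function name `rename_tag` and the sample string are made up for illustration, and the articleBody-to-body pairing is only one candidate mapping, not a decision the team has made.

```python
import re

def rename_tag(xml_text: str, old: str, new: str) -> str:
    """Rename <old ...>, </old>, and <old/> tags to <new>, keeping any attributes."""
    # Match "<" or "</", then the old name, then a character that ends a tag name,
    # so that renaming "num" does not also touch a longer name such as "number".
    pattern = re.compile(r"(</?)" + re.escape(old) + r"(?=[\s>/])")
    return pattern.sub(lambda m: m.group(1) + new, xml_text)

# Hypothetical sample, not taken from the project files.
sample = "<articleBody><p>Text</p></articleBody>"
converted = rename_tag(sample, "articleBody", "body")
print(converted)
```

A plain find-and-replace like this only works for one-to-one renames; tags that are used in two different ways would still need a manual pass.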

    These are the main components of our files that we need to make sure we represent with our new TEI tags:

    • structure --- newspaper versus book (i.e., newspapers: headlines, subtitles, date information; books: chapters, chapter titles, paragraphs)
    • versioning --- we want to find a way to link all three sources together by pointing out what is different and this is called versioning (we need to figure out the best way to do this so that if something is mentioned in the original articles and not mentioned in the book sources we can notice the variation to see if there are trends in what is excluded -- trust me there is!)
    • conversation or dialogue --- an interesting aspect that previous editor Shane Daube pursued was the varying connotations between people speaking with one another. We need to find a way to convert these voice tags.
    • content --- descriptions of working conditions, mentions of unionization/labor reform, descriptions of living conditions, etc.
    • references --- to people, places (locations), companies

      To begin:

  4. Each of us needs to read through the couple of articles and chapters already marked up and take note of the elements in place and the aspects of the text that need to be developed by adding new elements
  5. Go into the TEI guidelines and find sensible elements and attributes to convert to
  6. Begin discussion in this issue with TEI elements/attributes that make sense that we can agree on

After we have decided on our tags I will write up a codebook and we can begin converting the few documents we have marked up into TEI and possibly begin the markup of some of the other transcribed and need-to-be transcribed files.

@KariWomack @spadafour @CodyKarch @rCarls

ebeshero commented 8 years ago

Hey, Nelson team! I've been working with Becca on learning some new TEI coding, to help convert the contents of a Really Badly Formatted Table from the 19th-century McEnnis text into Feature Structure encoding that is SO much easier to read and process. This coding will be useful for other tables or charts you run across in the McEnnis and/or other texts. I've just plotted out an extremely simple TEI header that your team should build on. Have a look at the new TEI file here. When you sync your CitySlaveGirls GitHub, you'll find this as WSGATableCh1.xml inside the McEnnisWhiteSlaveGirlsOfAmerica_XML directory. Go take a look!

@RJP43 @KariWomack @CodyKarch @rCarls @spadafour @ghbondar

RJP43 commented 8 years ago

@CodyKarch will be working with the structuring elements/attributes

@spadafour will be working with the conversation and dialogue elements/attributes

RJP43 commented 8 years ago

@rCarls Could you work on finding TEI tags for referencing specific information?

RJP43 commented 8 years ago

For this week's 10 pts each of you will need to comment on this issue with TEI elements you feel best replace the specific category of elements you have been assigned. Please include the elements' names and attribute pairing(s) as well as the chapter from the TEI guidelines where you found these elements. If you encounter issues comment in this issue so the instructors can take note of your efforts in completing this task.

Thanks @CodyKarch , @spadafour , @rCarls !

spadafour commented 8 years ago

I think the best way to handle conversion of conversation would be to wrap the element in <sp> elements, with the speaker as an attribute within.

Ex: <sp who="mascVoice">Manly words being derogatory toward women.</sp>

<sp> is a speech element. I think it'd work well for mascVoice and femVoice, especially since it holds the who attribute.

http://www.tei-c.org/release/doc/tei-p5-doc/en/html/DR.html
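A rough sketch of how that conversion might be scripted in Python, assuming the old markup uses attribute-free <mascVoice> and <femVoice> elements (the function name `voice_to_sp` and the sample strings in the usage note are made up). TEI treats @who as a pointer, so a real conversion would probably write something like who="#mascVoice"; the sketch keeps the plain value from the example above.

```python
import re

# Assumed old element names; only these two are handled.
VOICE_TAGS = ("mascVoice", "femVoice")

def voice_to_sp(xml_text: str) -> str:
    """Turn <mascVoice>...</mascVoice> into <sp who="mascVoice">...</sp>."""
    names = "|".join(VOICE_TAGS)
    # Opening tags without attributes: <mascVoice> -> <sp who="mascVoice">
    xml_text = re.sub(rf"<({names})>", r'<sp who="\1">', xml_text)
    # Closing tags: </mascVoice> -> </sp>
    return re.sub(rf"</({names})>", "</sp>", xml_text)
```

For example, `voice_to_sp("<mascVoice>Hello.</mascVoice>")` gives `<sp who="mascVoice">Hello.</sp>`. Opening tags that carry attributes are deliberately left untouched so they can be reviewed by hand.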

However, the nellVoice element currently in there will have to be reworked; sometimes it represents dialogue, and other times it represents connotations (noted so with an attribute).

Connotation only: Sometimes I can be <nellVoice connotation="sarcasmWit"> so punny.</nellVoice>

Quote and connotation: <nellVoice connotation="sarcasmWit">Yes, mascVoice, I am being snippy</nellVoice>

This is made up, but you get the idea. Because this one tag is being used in two separate situations, there is no way to tell the difference through XPath or Regex; it's entirely mixed up with actual dialogue and needs to be fixed manually.

Speaking of connotations...

Many of the dialogues (at least nellVoice) contain connotations as attributes. We should separate those out into their own elements (wrapped inside the <sp> element if being pulled out of a quote). We can use the <interp> element and type attribute, which is just a way of inserting an annotator's interpretation of a given bit of text.

http://www.tei-c.org/release/doc/tei-p5-doc/en/html/ref-interp.html

So a quote would NOW read: <sp who="femVoice"><interp type="sarcasmWit">I'm being sarcastic!</interp></sp>

RJP43 commented 8 years ago

@spadafour Fantastic! To indicate a conversation and grab all of the <sp> elements of one conversation together, perhaps we can use <spGrp>, found here in the TEI.

I like the <interp> idea too. We can add the @respons attribute so that each editor has an @xml:id and each interpretation can be linked to its interpreter; @respons is found here in the TEI.

@spadafour and @ebeshero what do you think of using <purpose>, found here in the TEI, for the snippets of Nelson text that are outside of conversation? Alternatively, we could use <interp> around that text too, just not inside of <sp>, and identify it as Nelson using an @ref with an xml:id associated with Nelson. @ref can be found here in the TEI.

So using @spadafour's example, it would turn this: Sometimes I can be <nellVoice connotation="sarcasmWit"> so punny.</nellVoice> into this: Sometimes I can be <interp respons="#(id for editor)" ref="#nNelson" type="sarcasmWit"> so punny.</interp>

I mention <purpose> for two reasons:

  1. I would like to consider the attribute values suggested -- @type with the values they list:
    • persuade: didactic, advertising, propaganda, etc.
    • express: self expression, confessional, etc.
    • inform: convey information, educate, etc.
    • entertain: amuse, entertain, etc.
    and @degree, with which perhaps we can specify the extent to which this purpose predominates
  2. It might be more aligned with what Nelson was doing with her little snippets of witty enlightenment.
    One thing to realize is that a lot of the responses to the articles, and even the few references to Nelson's work in more modern times, place value on the way Nelson was able to communicate how she felt about what she was exposing. Her comments frequently expressed very specific emotions along with vivid descriptions; each comment showed purpose, and it is frequently these purposeful snippets that drew responses then and now.

<purpose> can still hold @ref and @respons

Well done, Rob! @spadafour

RJP43 commented 8 years ago

@rCarls and @CodyKarch this is the kind of back-and-forth that it is critical we have for each element or element set we are replacing with TEI.

rCarls commented 8 years ago

Some good news: it appears that all of the elements surrounding names in the currently coded articles fit the TEI: the tags are . That's not to say we can't narrow it down some more, however. We could add @ref with something specific to each person, like the "#designation" that the Doctors Bondar use in the Mitford letter that is linked as an example in part 2 of the TEI lookup exercise we did earlier in the semester. For example, we could change the current Mr. Goss to something along the lines of Mr. Goss.

Same deal with places, was used. We could use @ref again to specify a little more, couldn't we? So if we're looking for Chicago, wrap it in Chicago.

Everything wrapped in a tag is already TEI-friendly.

On to numbers, and @type are ok, but @amount doesn't fit the bill. We could change @amount to @value and keep the subject matter the same, could we not?
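A small sketch of that attribute rename in Python, assuming the numbers are wrapped in a <num> element (the element name, the function name `amount_to_value`, and the sample values in the usage note are assumptions for illustration, not taken from the project files):

```python
import re

def amount_to_value(xml_text: str, element: str = "num") -> str:
    """Rename the amount attribute to value inside opening tags of one element."""
    # Group 1: the opening tag up to the whitespace before "amount".
    # [^>]*? cannot cross a ">", so only attributes inside the tag are touched.
    pattern = re.compile(
        r"(<" + re.escape(element) + r"\b[^>]*?\s)amount(\s*=)"
    )
    return pattern.sub(r"\1value\2", xml_text)
```

With this, `<num type="wage" amount="3.50">$3.50</num>` would become `<num type="wage" value="3.50">$3.50</num>`, while the word "amount" in running text is left alone.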

CodyKarch commented 8 years ago

Here is the bundle of what I am thinking for structure.

In the articles, we can switch these things:

    • <newspaperTitle> becomes (I'm not sure what)
    • <date> becomes <dateline>
    • <num> becomes (I'm not sure what)
    • <seriesTitle> becomes <byline>
    • <headLine> becomes mostly <argument> or rarely <quote>
    • <articleBody> becomes <body>

Here are my sources for these thoughts: http://www.tei-c.org/release/doc/tei-p5-doc/en/html/DS.html#DSDTB

In the books, we can switch these things: wrap the whole chapter in <seg>, place the chapter number under <num>, place the chapter title under <byline>, and place any following paragraphs under <p>.

As seen above, I'm not 100% sure about this, but I think this could form a meaningful structure.
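For what it's worth, the article pairings that were proposed without a question mark could be tried out with a short Python script. This is only a sketch: it uses an XML parser instead of regular expressions, the <article> wrapper and sample text are invented, the <headLine> entry ignores the rarer <quote> case, and it does not handle the TEI namespace.

```python
import xml.etree.ElementTree as ET

# Only the pairings proposed with a definite target in the comment above.
TAG_MAP = {
    "date": "dateline",
    "seriesTitle": "byline",
    "headLine": "argument",   # the rarer <quote> cases need a manual decision
    "articleBody": "body",
}

def apply_tag_map(xml_text: str) -> str:
    """Parse the XML and rename every element listed in TAG_MAP."""
    root = ET.fromstring(xml_text)
    for el in root.iter():
        el.tag = TAG_MAP.get(el.tag, el.tag)
    return ET.tostring(root, encoding="unicode")

# Hypothetical sample document, not taken from the project files.
sample = (
    "<article><headLine>Sample headline</headLine>"
    "<date>Sample date</date>"
    "<articleBody><p>Text.</p></articleBody></article>"
)
print(apply_tag_map(sample))
```

Elements that are not in the map, such as <newspaperTitle> and <num>, pass through unchanged until we agree on what they should become.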

RJP43 commented 8 years ago

Ok, all of our source files are officially moved into TEI... meaning all of the XML files have TEI headers and the TEI super-structure, and some of the most important contextual elements have been transferred to TEI as well. We still need to go back through all of the files and do additional markup, including versioning and applying the new TEI tags for dialogue and voice tags.

ebeshero commented 8 years ago

@RJP43 I've made some corrections and revisions to simplify your TEI Header a bit. Make sure these make sense.

RJP43 commented 8 years ago

@ebeshero thank you. I have reviewed your corrections and they all make sense. I have edited the document accordingly to remove the extra comments and simplify the header even further, so students can easily read through it and understand more clearly where they are expected to edit it. I will keep you posted as I get the other headers revised for the McEnnis text and the Barkley text.

RJP43 commented 8 years ago

<said @who @ana>