ODD Updates - Githubissues

@ebeshero Please forgive the radio silence; I had to shelve this project for a couple of weeks while I worked on other pressing schoolwork. I spent this past weekend reflecting on my tagging conventions quite a bit, generally unhappy with how repetitive and uninformative they seemed. I decided to implement some changes to my custom attributes (a step towards hopefully assembling this feature library, assuming I can further refine the schema to the point that I can preemptively quantify linguistic trends in the volume) that I think describe my texts more effectively.

TEI Elements

w
- Nothing has changed in this department. I'm still tagging loanwords at the word level with this tag.
phr
- This new tag is used to wrap anglicized phrases and is designed to contain at least two w elements.

I will first finish the volume by tagging only phrases and words. If time permits (which I hope it does) I will try to accomplish as much work as possible on any evidence of dialectal language in the volume. As of now, though, the outright anglicisms are more important to my project.

TEI Conformant Attributes

@type (for phr)
- 'noun', 'prep'
- The @type attribute will be used with the phr element to record which type of phrase appears in the volume. As of late 1921, the only two types of anglicized phrase are noun phrases and prepositional phrases.
@pos
- 'noun', 'verb', 'adj', 'prep', 'itArt'
- The very last attribute value is for Italian articles, used in instances where the loanwords in the volume are gendered as if they were proper Italian terms -- an indication of linguistic adaptation.

Custom TEI Attributes

@add
- 'itSuffix', 'engInflection', 'vbStem'
- This is used any time that a morpheme is added to a word. The endings I've seen thus far (and probably the only ones I'll find) are Italian suffixes (vocalic endings), English inflections (which is very interesting), and verb stems (for English verbs that are italianized).
@gender
- 'm', 'f', 'ambig'
- The role this attribute plays has not changed.
@modification
- 'y', 'n'
- This new attribute, I think, more accurately informs me of a word's condition regarding adaptation. If there are any modifications made to the English word to make it seem more Italian, then it receives a @y value. If the word is used exactly as it would be in English, then that value is @n.
@char
- 'j', 'k', 'x', 'w', 'y'
- This value also has not changed. Any time an un-Italian character appears in a loanword, I mark it.
@orth
- 'eng', 'it'
- This other new attribute is used to define the prevailing grammar of a word's construction. If there are any un-Italian components of a word that are not adapted, then the prevailing grammar is @eng. This attribute could use more refinement, since in some cases there is evidence of an unobstructed loan and italianization. I plan to re-address this once I have more data in my corpus tagged.
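
To make the attribute descriptions above concrete, here is a minimal sketch of how two tagged words might look. Both words are invented for illustration and are not drawn from the volume; the particular combination of values on each is an assumption about how the attributes would be applied together.

```xml
<!-- Invented examples for illustration; neither word is taken from the volume. -->

<!-- English "boss" given an Italian vocalic ending and treated as masculine. -->
<w pos="noun" add="itSuffix" gender="m" modification="y" orth="it">bosso</w>

<!-- English word used unchanged; it contains the un-Italian letter "j". -->
<w pos="noun" gender="ambig" modification="n" char="j" orth="eng">jazz</w>
```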

So, this is where I am now! I think that this tagging system works more effectively for my work, and I am excited to continue updating and focusing its parameters. Let me know what comments you have when you have some time to kill (I must admit it's a bit strange not talking to you with as regular a cadence as we have had for over a year, haha).

@zme1 Glad you're back to work on it--I figured you were caught up in other semester demands, which is totally understandable. It sounds like you're cutting down on unnecessary duplication in your markup, which should make your life a little easier. (I also take it that your attribute values listed above are meant to be surrounded in quotation marks and not prefixed by the @ symbol.) I think I understand what's likely to happen with @orth a little better now (I remember I had questions about that before)--since sometimes you expect both "eng" and "it" values. You could simply mark it that way, permitting both values to be simultaneously present as in `.
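
A minimal sketch of how an ODD could permit both @orth values at once, assuming the attribute is declared with a closed value list. This is a hypothetical definition, not copied from the project's actual ODD; the `desc` wording is illustrative only. It relies on TEI ODD's `minOccurs`/`maxOccurs` on `datatype`, which let an attribute hold a whitespace-separated list of values.

```xml
<!-- Hypothetical attribute definition; not taken from the project's ODD. -->
<attDef ident="orth" mode="add">
  <desc>Prevailing grammar of the word's construction.</desc>
  <!-- maxOccurs="2" allows one or two whitespace-separated values. -->
  <datatype minOccurs="1" maxOccurs="2">
    <dataRef key="teidata.enumerated"/>
  </datatype>
  <valList type="closed">
    <valItem ident="eng"/>
    <valItem ident="it"/>
  </valList>
</attDef>
```

Under that definition, a word showing both an unobstructed loan and italianization could carry `orth="eng it"`, while single values remain valid.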

Here's some advice moving forward on your markup: Write up a Schematron to help prevent yourself from making stumbly errors--even if you believe you're going to get it right b/c you trust yourself(!) I was in this state a week ago on the Frankenstein project, as I'm amending the output from CollateX (which compares five versions of the Frankenstein novel). I was using XPath to guide me to find the right sorts of corrections to make, and there were many, many of them...and I noticed myself getting tired. I'd been lazy about writing up Schematron for this because, well, we always think we're going to get it right--but I suddenly found myself making errors--so I saw the light and wrote a Schematron, and found many more(!) Things are easier with Schematron to guide you...You can write a rule, for example, that catches if you put in mismatched values for @char. (Any of those values would be valid with your ODD, but if your word doesn't actually contain a letter "k", and you mark <w char="k"/>, Schematron could catch that even if your eyes are too tired to get it!) Always write up some Schematron. We can patch it into your ODD, or you can just write it separately and associate it on top of the ODD RNG schema.
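
The @char check described above could be sketched roughly as follows. This is a minimal standalone Schematron under a few assumptions: the w elements are in the TEI namespace, @char is in no namespace and holds one or more whitespace-separated letters, and the processor supports the XSLT 2.0 query binding. The message text is illustrative.

```xml
<sch:schema xmlns:sch="http://purl.oclc.org/dsdl/schematron" queryBinding="xslt2">
  <sch:ns prefix="tei" uri="http://www.tei-c.org/ns/1.0"/>
  <sch:pattern>
    <!-- Fires on every w element that carries a @char attribute. -->
    <sch:rule context="tei:w[@char]">
      <!-- Each letter listed in @char must occur in the word's text.
           lower-case() lets capitalized words match the lowercase values. -->
      <sch:assert test="every $c in tokenize(normalize-space(@char), '\s+')
                        satisfies contains(lower-case(.), $c)">
        The value of @char (<sch:value-of select="@char"/>) lists a letter
        that does not appear in this word.
      </sch:assert>
    </sch:rule>
  </sch:pattern>
</sch:schema>
```

With a rule like this, `<w char="k">jazz</w>` would be reported, while `<w char="j">jazz</w>` would pass. If kept as a separate file, it can be associated with the XML alongside the RNG through an additional `xml-model` processing instruction whose `schematypens` is `http://purl.oclc.org/dsdl/schematron`.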

zme1 / toscana

ODD Updates #51
