ebeshero / Amadis-in-Translation

a project to apply TEI markup to investigate early modern Spanish editions of Amadis de Gaula and their translations into English and French from the 1500s to the early nineteenth century.
http://amadis.newtfire.org
GNU Affero General Public License v3.0

FS Schematron and Translation descriptors #57

Open HelenaSabel opened 8 years ago

HelenaSabel commented 8 years ago

I’ve updated a visualization of the translation matches with the descriptors we discussed last week. Take a look and please check that everything looks sensible: http://htmlpreview.github.io/?https://github.com/HelenaSabel/Amadis-in-Translation/blob/dev/html/analysis-1.html

I’ll try to get chapter two ready for my visit so we can go through it together again. You can find an updated version here (with the segmentations, I mean): http://htmlpreview.github.io/?https://github.com/HelenaSabel/Amadis-in-Translation/blob/master/html/Chapter2.html

See you in a few hours!

HelenaSabel commented 8 years ago

I created a Schematron to keep track of the descriptors we are using and their implementation in the feature structures files. My plan for tomorrow is to confirm the categories, subcategories and labels we are using, and correct the Schematron accordingly. We also need to agree on a workflow: how Stacey and I will work on the feature structures files, how we will proofread and give feedback to each other, and how we will ask for Elisa’s input (I might need to do that often: for example, I find it tricky to tell “literal” from "altered syntax" because you have to be sure whether a particular construction would be valid in early 19th century English).

ebeshero commented 8 years ago

@HelenaSabel See you shortly! Stacey is here, but won't be able to stay long after you arrive because of the snow! But she'll be back tomorrow to have a long meeting about Chapter 2. Two things to think about:

HelenaSabel commented 8 years ago

The Schematron is in a separate file because it only needs to be associated with the FS files. The rules themselves are not complicated; what needs to be confirmed is the categories and labels (so I’ll now change "liberal" to "aesthetic"). So we have 5 main categories: literal, approximate, addition, omission, and mistranslation. Literal allows the subcategory "close". Approximate allows: cultural, aesthetic, compressed, compressed by omission (as a particular type of compression), aesthetic with added contents (as a particular type of aesthetic adaptation), and altered syntax, with antecedent clarification, reported speech, and direct speech as particular types of 'altered syntax'. Does this sound sensible? @setriplette
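For anyone reading along, the category/subcategory constraint described above can be sketched outside Schematron as a simple lookup. This is only an illustration in Python, not the project's actual Schematron rules: the names `ALLOWED` and `check_descriptor` are hypothetical, and the nested relationships (for example, "compressed by omission" as a kind of "compressed") are flattened into a single list under "approximate".

```python
# Hypothetical encoding of the hierarchy described in this comment.
# Nested subtypes are flattened under their main category for simplicity.
ALLOWED = {
    "literal": {"close"},
    "approximate": {
        "cultural",
        "aesthetic",
        "compressed",
        "compressed by omission",
        "aesthetic with added contents",
        "altered syntax",
        "antecedent clarification",
        "reported speech",
        "direct speech",
    },
    "addition": set(),
    "omission": set(),
    "mistranslation": set(),
}


def check_descriptor(category, subcategory=None):
    """Return an error message, or None if the category/subcategory pair is allowed."""
    if category not in ALLOWED:
        return f"unknown category: {category!r}"
    if subcategory is not None and subcategory not in ALLOWED[category]:
        return f"{subcategory!r} is not a subcategory of {category!r}"
    return None
```

A check like `check_descriptor("approximate", "aesthetic")` passes, while the old label `"liberal"` would be reported as not belonging to "approximate".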

ebeshero commented 8 years ago

It looks good to me, but let's have Stacey look at it and be sure! --E (I just added a tag to ping her, and probably we'll look at it together when you get here!) @HelenaSabel

ebeshero commented 8 years ago

@HelenaSabel Want to push the new files via "pull request" to our main project repo? Then we can pull it into the working forks and branches...

HelenaSabel commented 8 years ago

Since it is a working version and not a stable one, I’ll send the pull request to the Elisa branch.

ebeshero commented 7 years ago

@HelenaSabel I know this issue was from a year ago, so if you don't remember, no worries. I've just finished, at last, incorporating the original old Amadis.sch Schematron into an ODD and after a lot of tinkering, the new ODD outputs a good schema (with lots of embedded Schematron). I've just associated it with the Montalvo and Southey TEI files.

So, my question is: shall I incorporate your Schematron file for the feature structures markup into the new ODD?

HelenaSabel commented 7 years ago

If you are up to it, it would be much appreciated. The descriptors still need some work, and including them in the documentation may help us with the reviewing process.

ebeshero commented 7 years ago

So I've now got the ODD working and associated with each of the feature structures files, and it looks like it's catching the same expected errors. I left your schema in a comment so you can compare. There's going to be one difference--as I mentioned over in #66: now that we're working in a TEI environment, I found I couldn't easily muck with the content model of the <f> element to allow more than one child. We've been outputting two children sometimes to set a note next to main text (when processing Southey). There's a simple solution in TEI: just wrap the two strings in <vColl> (which, appropriately, means "collection of values"). I'll leave a note on this over on the Tables issue.

ebeshero commented 7 years ago

I'm distilling comments here on classifications and descriptors that we left in @HelenaSabel 's separate Schematron file. In our meeting on May 27, 2017, @setriplette and I reviewed the classification system, and she now thinks we can simplify it to the following basic types:

          omission, addition, compression, mistranslation

As a basic classification principle, we don't need to be marking word-for-word ("literal") translation, b/c this is kind of a "default" setting. Instead, we should concentrate on marking what's WARPED by translation, and Stacey's thinking the above four values are all that's necessary.

With the above four type values, we may no longer need our long list of subtypes. For example, after all, every translation is "cultural" so that doesn't tell us much. "compressed" should be a main type, not a subtype, because this is quantifiable and easy to mark.

Change in clause order is MORE significant to us than small rearrangements of subject-object-verb to subject-verb-object in English. These micro-shifts are so common they aren't worth commenting on. Transpositions of clauses don't need to be specially ("hand") marked on the fs files b/c these show up in the clause numbers. It would be helpful to flag these with XSLT where they occur, and perhaps output a small symbol on the HTML tables to denote a transposed clause unit.
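The flagging logic for transposed clauses could look roughly like the sketch below. It is written in Python purely to show the idea (the thread proposes doing this in XSLT), and it assumes, hypothetically, that each translated unit carries the number of the source clause it matches and that those numbers are read in translation order. The function name `flag_transposed` is made up for illustration.

```python
def flag_transposed(source_numbers):
    """Return indexes of translation units that appear out of source order.

    source_numbers: source clause numbers, listed in the order their
    matches occur in the translation (an assumed input format).
    A unit is flagged when its clause number is lower than the highest
    clause number already seen.
    """
    flagged = []
    highest = None
    for index, number in enumerate(source_numbers):
        if highest is not None and number < highest:
            flagged.append(index)
        else:
            highest = number
    return flagged
```

With this rule, only the unit that arrives "late" is flagged, so an output table could show a small symbol on that one row rather than on every row around the transposition.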

Shifts in voice, similarly, are marked elsewhere via milestone elements (so we don't need to repeat that markup here).

@setriplette and @HelenaSabel : you may want to discuss this before I change the ODD, or let me know if I've understood this right before I go editing the ODD and changing the schema!

ebeshero commented 7 years ago

Reflecting on this, the first three types: omission, addition, and compression, we're already marking and quantifying with XSLT in producing our SVGs, so we could pretty much apply them automatically to the fs.xml files as well. The only one we probably couldn't catch without a human reviewing would be "mistranslation", right?

@setriplette : Would you want to classify mistranslations with subtypes still, or leave them be, and just collect and count them?

setriplette commented 7 years ago

Well, what is the simplest thing to do? I read the whole Southey translation when I was working on my book and though I haven't compared it closely with the Montalvo yet, my guess is that we will find only a few actual translating mistakes. What would "leaving them be" look like in the coding process? I suppose we could tag them in the Southey files somehow.

HelenaSabel commented 7 years ago

As @ebeshero points out, most of the types can be generated automatically. Regarding the mistranslations, I agree with @setriplette : they would become apparent during the stitchery so they could be encoded by adding a @type="mistranslation" to the <anchor/> element.

ebeshero commented 7 years ago

@HelenaSabel That's a great idea, to encode @type="mistranslation" in the Southey files as they're being stitched to Montalvo. (I am going to make a big deal of that stitching/sewing metaphor in our talk, by the way...) ;-) If it's there in the XML code, we can automate that output into the fs.xml files, too. And we can start making SVGs from it as well. @setriplette
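As a rough sketch of how that automation might start, the snippet below collects every <anchor/> carrying @type="mistranslation" from a TEI fragment. It uses Python's standard library rather than the project's XSLT, and the sample markup, the `xml:id` values, and the function name `mistranslation_ids` are all hypothetical; the real Southey files may identify anchors differently.

```python
import xml.etree.ElementTree as ET

TEI_NS = {"tei": "http://www.tei-c.org/ns/1.0"}
XML_ID = "{http://www.w3.org/XML/1998/namespace}id"

# Hypothetical fragment; not taken from the project's files.
SAMPLE = """<p xmlns="http://www.tei-c.org/ns/1.0">
  <anchor xml:id="c1"/>first clause
  <anchor xml:id="c2" type="mistranslation"/>second clause
  <anchor xml:id="c3"/>third clause
</p>"""


def mistranslation_ids(xml_text):
    """Return the xml:id of each anchor marked type='mistranslation'."""
    root = ET.fromstring(xml_text)
    anchors = root.iterfind(".//tei:anchor[@type='mistranslation']", TEI_NS)
    return [anchor.get(XML_ID) for anchor in anchors]
```

The resulting list of identifiers could then be counted or passed along to whatever step writes the fs.xml files and the SVGs.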