ebeshero / Amadis-in-Translation

a project to apply TEI markup to investigate early modern Spanish editions of Amadis de Gaula and their translations into English and French from the 1500s to the early nineteenth century.
http://amadis.newtfire.org
GNU Affero General Public License v3.0

XSLT and Visualization #18

Closed ebeshero closed 8 years ago

ebeshero commented 8 years ago

Issue board for stuff while we're working!

Text copied from comments in another Issue Board:

[ebeshero: ] Okay--so this is the next new thing we need to brainstorm! What do we want to do with these matched-up units? I imagine we will care about any place where there's a gap and Southey skips ahead or backward. We want to count the words between each start and end anchor, and compare the word counts to the matched unit in Montalvo. I was going to do this by counting white spaces. Probably we could start with a great big XML table of the units by chapter, side by side with word counts.

We could make another, smaller chart that only highlights points of extreme alteration so we can zoom to those.
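As a rough sketch of the word-count idea (not the project's actual code): assuming the text between each pair of start and end anchors has already been pulled out into plain strings, the comparison could look something like this in Python. The unit ids, the placeholder strings, and the 0.5 / 1.5 thresholds are all invented for illustration.

```python
def word_count(text):
    """Count words by splitting on runs of whitespace."""
    return len(text.split())


def compare_units(units):
    """Build one side-by-side row per matched unit.

    `units` maps a unit id to a (Montalvo text, Southey text) pair.
    The ratio is Southey words / Montalvo words, or None when the
    Montalvo side is empty (i.e. Southey added something).
    """
    rows = []
    for unit_id, (montalvo, southey) in units.items():
        m = word_count(montalvo)
        s = word_count(southey)
        ratio = s / m if m else None
        rows.append({"unit": unit_id, "montalvo": m, "southey": s, "ratio": ratio})
    return rows


def extreme_alterations(rows, low=0.5, high=1.5):
    """Keep only rows where one side is missing or the ratio is far from 1."""
    return [
        r for r in rows
        if r["ratio"] is None or r["ratio"] < low or r["ratio"] > high
    ]


# Placeholder data, not real text from either edition.
units = {
    "u1": ("uno dos tres cuatro cinco seis siete ocho", "one two three"),
    "u2": ("uno dos tres cuatro", "one two three four five"),
    "u3": ("uno dos tres", ""),
}

rows = compare_units(units)
flagged = extreme_alterations(rows)

for r in rows:
    print(r["unit"], r["montalvo"], r["southey"], r["ratio"])
print("flagged:", [r["unit"] for r in flagged])
```

The full `rows` list corresponds to the big side-by-side table, and `flagged` to the smaller chart of extreme alterations. Whitespace splitting is crude (punctuation sticks to words, and Spanish and English differ in wordiness), so the thresholds would need tuning against real chapters.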

Would we want to apply classifications to the kinds of changes Southey is making? We could do that in the Southey XML files by adding some attributes, or we could do that as you're surveying the charts: We'd make the charts in XML so we could add some analytical markup there.
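To make the "charts in XML with analytical markup" option a bit more concrete, here is a minimal sketch using Python's standard `xml.etree.ElementTree`. The element name `unit`, the attribute name `changeType`, the value `compression`, and the sample counts are hypothetical; they are not taken from the project's schema.

```python
import xml.etree.ElementTree as ET


def build_chart(rows):
    """Turn word-count rows into a small XML chart, one <unit> per row."""
    chart = ET.Element("chart")
    for row in rows:
        ET.SubElement(
            chart,
            "unit",
            {
                "id": row["unit"],
                "montalvo": str(row["montalvo"]),
                "southey": str(row["southey"]),
            },
        )
    return chart


def classify(chart, unit_id, change_type):
    """Add a (hypothetical) changeType attribute to the unit with this id."""
    for unit in chart.iter("unit"):
        if unit.get("id") == unit_id:
            unit.set("changeType", change_type)
            return unit
    raise KeyError(unit_id)


# Invented sample rows; real ones would come from the word-count step.
sample_rows = [
    {"unit": "u1", "montalvo": 8, "southey": 3},
    {"unit": "u2", "montalvo": 4, "southey": 5},
]

chart = build_chart(sample_rows)
classify(chart, "u1", "compression")
print(ET.tostring(chart, encoding="unicode"))
```

Keeping the classification on the chart rather than in the Southey files means the source XML stays untouched while the categories are still being worked out. The attributes could be moved into the edition files later if they prove stable.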

Ahhh, here's where the hard work begins! :-)

We're thinking of two kinds of visualization: 1) Montalvo side-by-side with Southey: made with alignment on-click, so that when you click on a clause unit in a Montalvo chapter, you pull up the matching passage in Southey. And we highlight anything in Southey that doesn't correspond to anything in Montalvo (and vice versa).

2) Charts (and graphs) to help us study the kinds of changes Southey is making to Montalvo. (I say we start by just extracting side-by-side information, giving it to @setriplette to study, and then figuring out what new markup to add if we want to classify kinds of alteration, or deciding that we don't.) Maybe all that matters for the moment is noting compression points--where Southey compressed Montalvo significantly--and highlighting what's missing.

[HelenaSabel: ] I can picture the first type of visualization with color-coded highlighting, as in Juxta and other collation tools. Graphs showing the different reasons behind Southey's gaps would be a great analytical visualization. I think attributes that describe the modifications would be easier to add after retrieving and matching the text and making the charts.

@setriplette @HelenaSabel

HelenaSabel commented 8 years ago

Dear @ebeshero and @setriplette:

I made some HTML visualizations, but they still need some work. You can find them in the 'html' folder and open them in any browser. If you hover over the Montalvo text, you'll see the equivalent passage in Southey highlighted (should it work both ways too?). If you click on any character (bolded), you'll get the description it has in the SI file. Omissions and additions are in red and green; reports in yellow. There is a very important bug in Southey's text: notes and person names get duplicated. I'll open an issue for it. Let me know what you think!