FrankensteinVariorum / fv-collation

First-stage collation processing in the Frankenstein Variorum Project. For post-processing and Variorum development, see our GitHub organization: https://github.com/FrankensteinVariorum
https://frankensteinvariorum.github.io/fv-collation/
GNU Affero General Public License v3.0

SGA margin insertions out of sequence with flow of text #29

Closed ebeshero closed 6 years ago

ebeshero commented 6 years ago

@Rikkm @raffazizzi Here's a problem we need to solve for collation: Sometimes there are words inserted in one of the ruled margins on the SGA notebook pages. When words or phrases are inserted there, the SGA TEI codes them out of order with the text they're associated with. They're coded as separate <zone> elements from the main text on the page. The main text is all coded first, and then anything in the margins is coded afterwards, regardless of the meaningful flow of the text.

(Raff, I remember you warning me about this, but I wasn't quite prepared for it until I was caught up in the throes of checking where words cross endings of lines and inside and outside zones.)

Example of the problem:

See this page from C57 (0005) on the SGA website

Reading the notebook page from the top left, we see half a word ("guished"), which is connected to the last word fragment on the previous page ("distin"). So we'd logically read

[distin]guished the <del rend="strikethrough">insect</del> insect from the herb,

But if you go hunting for the string "guished the", it shows up way down at the end of the code for this page in a new zone like so:

  <zone type="left_margin" corresp="#c57-0005.01">
    <line><add>guished the </add></line>
  </zone>

totally out of sequence, after the rest of the main text on that page appears. I'm trying to work out a good way to resolve this, and I'll propose a tentative solution I'm trying in the next message block on this issue.
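To make the ordering problem concrete, here is a minimal Python sketch (not part of the project's own tooling). It assumes a heavily simplified, namespace-free stand-in for the SGA page, and the helper names `reading_text` and `margin_zones` are made up for illustration. It reads the text in document order, the way a one-directional collation stream would, and lists where each margin zone says it belongs:

```python
import xml.etree.ElementTree as ET

# Hypothetical, simplified stand-in for an SGA page: no TEI namespace,
# one main zone and one margin zone coded after it.
PAGE = """<surface>
  <zone type="main">
    <line><anchor xml:id="c57-0005.01"/><del rend="strikethrough">insect</del> insect from the herb,</line>
  </zone>
  <zone type="left_margin" corresp="#c57-0005.01">
    <line><add>guished the </add></line>
  </zone>
</surface>"""


def reading_text(root, skip=("del",)):
    """Return the text in document order, leaving out deleted passages."""
    parts = []

    def walk(el):
        if el.tag not in skip:
            if el.text:
                parts.append(el.text)
            for child in el:
                walk(child)
        if el.tail:  # text following the element is kept even after a <del>
            parts.append(el.tail)

    walk(root)
    return " ".join("".join(parts).split())


def margin_zones(root):
    """List (type, corresp) for every zone that points somewhere else."""
    return [(z.get("type"), z.get("corresp"))
            for z in root.iter("zone") if z.get("corresp")]


root = ET.fromstring(PAGE)
print(reading_text(root))   # insect from the herb, guished the
print(margin_zones(root))   # [('left_margin', '#c57-0005.01')]
```

The margin words come out last even though the `corresp` pointer says they belong at the top of the page.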

ebeshero commented 6 years ago

Proposed solution

First of all, the SGA code does include attribute values that point to where these margin insertions belong in relation to the main text. But CollateX reads around XML markup as blobs of angle-bracketed text that we train it to ignore, so I don't think it will be easy to tell CollateX to jump between attribute values: it wants a forward-moving, one-directional stream of text, as much as we can possibly give it one. For our intermediary edition, prepped for collation, I think we need to set these margin zones in sequential order, which means some light editing whenever this comes up. I think we can handle it by moving the margin zones into the main text where they fit into the flow. On this page the margin insertions all go at the start or end of a line, and apparently these <zone> elements can legally be siblings of <line> in this code.

<zone type="main">
    <line>......</line>
    <line>.....<metamark xml:id="Here">X</metamark></line>
    <zone corresp="#Here" type="left_margin"><line>stuff inserted in margin 
      now in sequence</line></zone>
    <line>with what comes next in the main text.</line>
</zone>

This strategy works for our example code from C57 page 0005. I've flagged the zones I've moved with a couple of asterisks below, so you can spot them more quickly. (Sorry, I can't get the code blocks to highlight here.)

<!--2017-09-24 ebb: PROBLEM: the marginal additions on the following page (and presumably elsewhere) are coded at the end of the page. 
PROPOSED SOLUTION, which I'm enacting here, is to move these into their sequential location.
-->
       <!--THE USUAL TOP OF PAGE CODE-->
  <graphic url="http://shelleygodwinarchive.org/images/ox/ms_abinger_c57/ms_abinger_c57-0005.jp2"/>
  <zone rend="bordered" type="pagination"><line>4</line></zone>
  <zone type="library"><line>3</line></zone>
<!--PAGE'S TEXT CONTENTS HERE: -->
  <zone type="main"> **<zone corresp="#c57-0005.01" type="left_margin"><line><add>guished the</add></line></zone>**
    <line><anchor xml:id="c57-0005.01"/><del rend="strikethrough">insect</del> insect from the herb,
      and by de</line>
    <line>grees one herb from <del rend="strikethrough">a</del> another – I found</line>
    <line>that the sparrow uttered none but harsh</line>
    <line>notes <mod>
        <del rend="strikethrough">and</del>
        <del rend="strikethrough"><add place="superlinear">but</add></del>
        <add hand="#pbs" place="superlinear">whilst</add>
      </mod>
      <mod>
        <del rend="strikethrough">that</del>
        <add place="sublinear"><metamark function="insert">^</metamark></add>
        <add place="superlinear">those of</add>
      </mod> the blackbird were sweet</line>
    <line>and enticing. <del rend="strikethrough">I had <unclear>le</unclear></del> one day when I
      was</line>
    <line>oppressed by cold I found a fire that</line>
    <line>had been left by some wandering beg</line>
    <line>gars and was over come with delight<mod>
        <del rend="overwritten">.</del>
        <add place="intralinear"><metamark function="displacement" xml:id="c57-0005.02">X</metamark></add>
      </mod></line>

    **<zone corresp="#c57-0005.02" type="left_margin">**
      <addSpan hand="#pbs" spanTo="#c57-0005.03"/>
      <line><metamark function="displacement">X</metamark></line>
      <line> at the warmth</line>
      <line>which I ex-</line>
      <line>perienced from</line>
      <line>it.</line>
      <anchor xml:id="c57-0005.03"/>
    **</zone>**

    <line>In my joy I thrust my hand into the</line>
    <line>live embers but quickly drew it away <!--mod  spanTo="#c57-0005.04"--><add place="superlinear">with a cry</add>
      <del rend="strikethrough">in</del></line>
    <zone corresp="#c57-0005.05" type="left_margin">
      <line><add>of </add></line>
    </zone>
    <line><anchor xml:id="c57-0005.05"/><anchor xml:id="c57-0005.04"/>pain– How strange, I
      thought, that the</line>
    <line>same cause should at once produce</line>
    <line>such <del rend="strikethrough">delicious and such</del>
      <del rend="strikethrough"><unclear>u</unclear></del>
      <del rend="strikethrough">opp</del> opposite</line>
    <line>effects –I examined the materials of the</line>
    <line>fire, &amp; to my joy found it to be wood –</line>
    <line>I quickly collected some branches but</line>
    <line>they were wet and would not burn.</line>
    <line>I was pained at this and sat still</line>
    <line>watching the o<del rend="strikethrough">p</del>peration of the fire.</line>
    <line>The wet wood I had placed near the heat</line>
    <line><mod>
        <del rend="strikethrough">became dy dry</del>
        <add place="superlinear">dried,</add>
      </mod> and itself <del rend="strikethrough">became</del> became</line>
    <zone corresp="#c57-0005.06" type="left_margin">
      <line><add>inflamed.</add></line>
    </zone>
    <line><mod>
        <del rend="strikethrough">hot</del>
      <anchor xml:id="c57-0005.06"/>
      </mod> I reflected on this, and by touching</line>
    <line>the various branches I discovered the cause</line>
    <line>and busied myself in collecti<mod>
        <del rend="overwritten">on</del>
        <add place="intralinear">ng</add>
      </mod> a great</line>
    <line><mod>
        <del rend="strikethrough">deal</del>
        <add hand="#pbs" place="superlinear">quantity</add>
      </mod> of wood that I might dry it and</line>
    <line>have a <del rend="strikethrough">pel</del> plentiful supply <add hand="#pbs" place="superlinear">o</add>
      <add hand="#pbs" place="intralinear">of fire.</add> When night</line>
    <line>came on and <mod>
        <add place="sublinear"><metamark function="insert">^</metamark></add>
        <add hand="#pbs" place="superlinear">brought with it</add>
      </mod> sleep, I was in the greatest</line>
    <line>fear lest my fire should be extinguished</line>
    <line>I covered it carefully with <del rend="strikethrough">try</del> dry wood</line>
    <line>and leaves &amp; then placed upon that wet</line>
    <line>branches and then spreading my cloak I <del rend="strikethrough">la</del></line>
  </zone>
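
Since zones on a page like this are moved by hand, one possible sanity check is to confirm that every `corresp` pointer on a margin zone still resolves to an `xml:id` somewhere on the page. This is only a sketch under assumptions: the sample markup is namespace-free, the ids `a1` and `b2` are invented, and `unresolved_pointers` is a hypothetical helper, not something in the repository.

```python
import xml.etree.ElementTree as ET

# ElementTree exposes xml:id under the predefined XML namespace.
XML_ID = "{http://www.w3.org/XML/1998/namespace}id"

# Hypothetical sample: the first zone points at an existing anchor,
# the second points at an id that does not exist on the page.
SAMPLE = """<surface>
  <zone type="main">
    <zone corresp="#a1" type="left_margin"><line><add>of </add></line></zone>
    <line><anchor xml:id="a1"/>pain</line>
    <zone corresp="#b2" type="left_margin"><line><add>inflamed.</add></line></zone>
    <line>I reflected on this</line>
  </zone>
</surface>"""


def unresolved_pointers(root):
    """Return corresp values on <zone> elements that match no xml:id."""
    ids = {el.get(XML_ID) for el in root.iter() if el.get(XML_ID)}
    return [z.get("corresp") for z in root.iter("zone")
            if z.get("corresp") and z.get("corresp").lstrip("#") not in ids]


print(unresolved_pointers(ET.fromstring(SAMPLE)))  # ['#b2']
```

An empty list would mean every margin zone still has a target to sit next to.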

ebeshero commented 6 years ago

Okay--that works if the margin insertions are coming at the start or end of a line. But what happens if there's a margin insertion indicated for the middle of a line in the main text?

Well, fortunately, we can do this by simply moving the insertion, carefully, to a place where <zone> can sit inside the <line> element from the main text. (It can't go inside a <mod> element in that line, marking a modification, so we need to set it next-door to a related <mod>.)

Here's an example: I made one change to put the marginal insertion of the word "inflamed" in what seems a more logical place in the sequence of editing events on this page, and it seems to work just fine without causing a ruckus with the schema:

 <line>The wet wood I had placed near the heat</line>
    <line><mod>
        <del rend="strikethrough">became dy dry</del>
        <add place="superlinear">dried,</add>
      </mod> and itself <del rend="strikethrough">became</del> became</line>

    <line><mod>
        <del rend="strikethrough">hot</del>
      <anchor xml:id="c57-0005.06"/>
    </mod><zone corresp="#c57-0005.06" type="left_margin">
      <line><add>inflamed.</add></line>
    </zone> I reflected on this, and by touching</line>
    <line>the various branches I discovered the cause</line>

Here the <zone type="left_margin"> gets inserted after a deletion at the start of the line in the main text, and it's not a problem to add it so the text flows in sequence.

These aren't going to be easy to maneuver, but I think we can do it as long as it's clear how we're moving the code around. @Rikkm let me know if you'd like a quick briefing (say on Hangouts or Skype) on how to do this.

ebeshero commented 6 years ago

@Rikkm Just a quick update: I got a rhythm down for moving the margin notes into position and setting the word-fragment markup. The SGA pages on the web make it pretty easy to follow--it's not as bad as I thought to work with this. My edits are all pushed so you can take a look at the opening pages of c57 as a guide for c56.

ebeshero commented 6 years ago

@raffazizzi @Rikkm This morning I'm wondering whether I can reliably fit these margin annotations in place with XSLT, setting them after the anchors they point to. That might save us some time, and if any turn up slightly out of alignment, we can then adjust them. I'll try it this evening.
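
For readers who want to see the idea of "setting them after the anchors they point to" in runnable form, here is a rough Python sketch. It is not the XSLT used in the project. It assumes namespace-free markup, an invented id `m2`, and a hypothetical function `relocate_margin_zones`, and it does not handle the schema constraint that a `<zone>` cannot sit inside `<mod>`.

```python
import xml.etree.ElementTree as ET

XML_ID = "{http://www.w3.org/XML/1998/namespace}id"

# Hypothetical sample: the margin zone is coded after the main zone.
SAMPLE = ('<surface><zone type="main">'
          '<line>with delight<metamark xml:id="m2">X</metamark></line>'
          '<line>In my joy</line></zone>'
          '<zone type="left_margin" corresp="#m2">'
          '<line> at the warmth</line></zone></surface>')


def relocate_margin_zones(root):
    """Move each <zone corresp="#id"> to directly follow the element
    carrying that xml:id. Returns the number of zones moved."""
    moved = 0
    for zone in [z for z in root.iter("zone") if z.get("corresp")]:
        # Rebuild the child-to-parent map, since earlier moves change the tree.
        parent_of = {c: p for p in root.iter() for c in p}
        wanted = zone.get("corresp").lstrip("#")
        target = next((e for e in root.iter() if e.get(XML_ID) == wanted), None)
        if target is None or zone not in parent_of or target not in parent_of:
            continue  # dangling pointer, or a root element: leave untouched
        if any(e is target for e in zone.iter()):
            continue  # never move a zone inside itself
        # Detach the zone, handing any trailing text to what precedes it.
        old_parent = parent_of[zone]
        pos = list(old_parent).index(zone)
        if zone.tail:
            if pos == 0:
                old_parent.text = (old_parent.text or "") + zone.tail
            else:
                prev = old_parent[pos - 1]
                prev.tail = (prev.tail or "") + zone.tail
        old_parent.remove(zone)
        # Re-insert right after the target; text that followed the target
        # now follows the zone, so the reading order stays intact.
        new_parent = parent_of[target]
        zone.tail, target.tail = target.tail, None
        new_parent.insert(list(new_parent).index(target) + 1, zone)
        moved += 1
    return moved


root = ET.fromstring(SAMPLE)
relocate_margin_zones(root)
print("".join(root.itertext()))  # with delightX at the warmthIn my joy
```

Any zone that lands slightly out of alignment (for example, one whose anchor sits inside a `<mod>`) would still need a manual adjustment afterwards.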

ebeshero commented 6 years ago

This issue was resolved by preparing XSLT files to relocate the zones. This is a "pipeline" of files that we run in sequence:

1) https://github.com/ebeshero/Pittsburgh_Frankenstein/blob/master/collateXPrep/sga_Notebooks/Id_Trans-sga-MarginZonesP1.xsl
2) https://github.com/ebeshero/Pittsburgh_Frankenstein/blob/master/collateXPrep/sga_Notebooks/Id_Trans-sga-MarginZonesP2.xsl
3) https://github.com/ebeshero/Pittsburgh_Frankenstein/blob/master/collateXPrep/sga_Notebooks/Id_Trans-sga-MarginZonesP3.xsl