nlottig94 / Dickinson

This repository houses the code that is used for the production of the Emily Dickinson Project.
http://dickinson16.newtfire.org

Issue with Schematron #7

Open nlottig94 opened 6 years ago

nlottig94 commented 6 years ago

@RJP43 @brookestewart @ebeshero I just found an issue with the following rule in the Dickinson Schematron.

<pattern>
    <rule context="tei:app">
        <let name="wit" value="tokenize(replace(string-join(.//@wit, ' '), '#df1', ''), '\s+')"/>
        <assert test="count($wit) eq count(distinct-values($wit))">
            There should not be any repeating witnesses in a single app element other than the possibility of #df1.
        </assert>
    </rule>
</pattern>

This rule says that no witness may repeat within a single app element, with the possible exception of Dickinson's original manuscript (#df1). However, in Fascicle 1, Poem 4, the letter to Susan Dickinson has a spot where Emily wrote a word, crossed it out, and then wrote another word that matches the original manuscript. The code would then look like this:

<l n="9">And the <app>
        <rdg wit="#df1 #lSD">Earth</rdg>
        <rdg wit="#lSD"><del rend="strikethrough">Water</del></rdg>
    </app>&#8212;they tell me</l>

I have attached the images. If anyone has ideas or can help with revisions, I would appreciate it!!

The first image is Dickinson's original manuscript and the second is her letter to Susan Dickinson.

Dickinson's original manuscript--fs104
Dickinson's letter to Susan Dickinson--104lSD

ebeshero commented 6 years ago

@nlottig94 I think I remember the context for your Schematron rule, and it had something to do with the way you want to process and output your witnesses, didn't it? If you're holding to your rule of validating and processing each witness separately, maybe this isn't a bug after all!

I'd recommend the following in your markup:

And the Earth WaterEarth —they tell me

This would be more accurate, too, with what we can see of the writing events on the page: the word Water was written, then scratched out, and the word Earth was written above the line. This way, you'd hold those events together in one element.

ebeshero commented 6 years ago

@nlottig94 On closer inspection, isn't that word that's deleted "world" (lower case) and not "Water"?

In which case, I'd recommend:

And the Earth worldEarth —they tell me


ebeshero commented 6 years ago

@nlottig94 Oops--I see I didn't fully correct the tagging. One more time! HERE is really what I recommend, still keeping each witness separate:


And the Earth worldEarth —they tell me


nlottig94 commented 6 years ago

@ebeshero I like your thinking!!! I didn't know I could place the earth above it!!! That's awesome!! Thank you!

ebeshero commented 6 years ago

@nlottig94 As you're digging back into Dickinson manuscripts, you'll want to spend some time with the TEI Guidelines! This will help: http://www.tei-c.org/release/doc/tei-p5-doc/en/html/CO.html#COEDADD
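For illustration, here is a sketch of how the deletion and the addition above the line discussed earlier might be encoded with the `<del>` and `<add>` elements that section of the Guidelines describes. The `place="above"` value and the way the witnesses are split across the two `<rdg>` elements are assumptions, not the markup actually committed to the project.

```xml
<l n="9">And the <app>
        <!-- Manuscript reading: assumed here to carry only the original witness -->
        <rdg wit="#df1">Earth</rdg>
        <!-- Letter reading: the struck-out word and the word added above it, kept in one rdg -->
        <rdg wit="#lSD"><del rend="strikethrough">world</del><add place="above">Earth</add></rdg>
    </app>&#8212;they tell me</l>
```

With each witness listed in exactly one `<rdg>`, an encoding like this would also pass the no-repeating-witnesses rule.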

nlottig94 commented 6 years ago

@ebeshero Thank you!!

nlottig94 commented 6 years ago

Okay @ebeshero, I need to figure out how to make the same Schematron rule work for all dfs, not just df1... Do you get that?

ebeshero commented 6 years ago

@nlottig94 Yes, I get it! Sorry for the delay. Look at your variable definition for $wit. Just a reminder to everyone of what it’s doing: you are collecting all the witnesses with a string-join() first, and then you are using replace() to find the “df1” and replace it with nothing. Then you are tokenizing the resulting list of all the witnesses except df1. Then you test with the rule to see that you have no repeats.
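To make those steps easier to follow, here is a sketch that splits the single `$wit` definition into one `<let>` per step. The intermediate variable names `joined` and `withoutDf1` are made up for this illustration, and the values in the comments are for the line 9 example posted at the top of this issue.

```xml
<rule context="tei:app">
    <!-- Step 1: gather every @wit value under this app, separated by spaces.
         For the line 9 example: "#df1 #lSD #lSD" -->
    <let name="joined" value="string-join(.//@wit, ' ')"/>
    <!-- Step 2: delete the literal string #df1.
         For the line 9 example: " #lSD #lSD" (note the leading space) -->
    <let name="withoutDf1" value="replace($joined, '#df1', '')"/>
    <!-- Step 3: split on whitespace.
         For the line 9 example: ("", "#lSD", "#lSD"), since a leading separator yields an empty string -->
    <let name="wit" value="tokenize($withoutDf1, '\s+')"/>
    <!-- Step 4: 3 items but only 2 distinct values, so the assertion fires -->
    <assert test="count($wit) eq count(distinct-values($wit))">
        There should not be any repeating witnesses in a single app element other than the possibility of #df1.
    </assert>
</rule>
```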

To expand this rule to permit any df (presumably followed by a digit, right? As in df2, df3, df4), you want a regex pattern, and replace() should be able to take it!
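A minimal sketch of what that could look like, assuming every manuscript witness is named `#df` followed by one or more digits, and that the schema already uses an XPath 2.0 query binding with the `tei` prefix declared (the original rule needs both as well). The `normalize-space()` wrapper is an extra not mentioned above: it trims the whitespace left behind by the removals so that `tokenize()` does not return empty strings, which could otherwise be counted as repeats.

```xml
<pattern>
    <rule context="tei:app">
        <!-- Remove every #df witness followed by digits (#df1, #df2, #df10, and so on),
             then trim leftover whitespace so tokenize() yields no empty strings. -->
        <let name="wit" value="tokenize(normalize-space(replace(string-join(.//@wit, ' '), '#df\d+', '')), '\s+')"/>
        <assert test="count($wit) eq count(distinct-values($wit))">
            There should not be any repeating witnesses in a single app element other than the possibility of a #df witness.
        </assert>
    </rule>
</pattern>
```

One caveat with this pattern: it would also strip the start of any other identifier that happens to begin with `#df` plus a digit, so it relies on the project's witness IDs not colliding in that way.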