zme1 / toscana

A repository to house research and web development for the Lega Toscana project, led by professor Lina Insana (Spring 2018) and professor Lorraine Denman (Fall 2018), and with consultation from members of the DH Advanced Praxis group at the University of Pittsburgh at Greensburg.
http://toscana.newtfire.org

Schematron Rules #7

Closed zme1 closed 6 years ago

zme1 commented 6 years ago

I am beginning to integrate Schematron into my ODD, but with no success. I have looked up examples of Schematron in ODD files, and I followed the steps correctly (to the best of my knowledge). I declared the Schematron namespace in my ODD file, and I re-declared the Schematron namespace in the @xmlns:sch attribute on my <sch:rule> element. I have tried several different ways of phrasing the syntax of the rule itself, but none of my rules seem to be functioning. My most recent attempt is pasted below:

      <elementSpec ident="persName" mode="change" module="namesdates">
        <constraintSpec ident="persName_surname" scheme="isoschematron">
          <constraint>
            <sch:rule xmlns:sch="http://purl.oclc.org/dsdl/schematron" context="tei:persName">
              <sch:assert test="./child::tei:surname">Every persName element must have a child surname element</sch:assert>
            </sch:rule>
          </constraint>
        </constraintSpec>
      </elementSpec>

I am not 100% confident on the syntax of this rule, but I've tried to apply the same rule in at least 5 or 6 different ways. Even though it's possible, I doubt my problem is in the syntax. I have a feeling that my issue is somewhere else in my code.

ebeshero commented 6 years ago

@zme1 Integrating Schematron into an ODD is tricky work, and the problem is usually establishing the context for the rule. My first thought here is that you don't need to set a new context for this schematron rule--that it's possible this is looking for a child tei:persName of the element you're defining in the elementSpec. Let me look at a couple of my working ODDs with Schematron to see how I handled this...

ebeshero commented 6 years ago

@zme1 Ah! I see the Schematron rule isn't in the ODD here, and that's probably because you want the ODD on your master branch always to be stable and working and available to your team. (That's a good practice.) As you're experimenting with new code, though, you may want to work in a branch on your GitHub, a space apart from your team. (You can read more about this in Becca's tutorial and on the git-scm site, which has a good visualization of the workflow.) This is basically what I do when I pull in your code and tinker with it--and then I send you a pull request for you to review my work and see if you want to merge it with the stable code on your master branch. Branching would be a good topic for a Tues. evening praxis group meeting one of these days.

zme1 commented 6 years ago

@ebeshero I'll look into branching today (I must admit, I've never done it before but this seems like a good time to learn!). So you were unable to see the code in my ODD file? I am sure that I pushed it yesterday afternoon...

ebeshero commented 6 years ago

@zme1 Oops! My mistake--I missed the lines as I was skimming the code and noted your comment about needing to figure out the Schematron instead. I do see them now that I look specifically for persName...I'm cross-checking against one of my ODDs, and I'm nearly positive that context is the problem...

ebeshero commented 6 years ago

@zme1 Yes, I'm pretty sure the context is the problem. Try changing the value of @context on your sch:rule element, so that it's trained on tei:surname instead of tei:persName, because you're already standing on persName in the context of your elementSpec.

Here's an example of a schematron rule defined inside an ODD elementSpec from one of my projects, where I'm setting a rule for a sequence of attribute values: I don't want to see the same attribute value appear twice in a row. Notice how I set the schematron context on the attribute attached to the current element:

<elementSpec module="linking" ident="anchor" mode="change">
  <classes mode="change">
    <memberOf key="att.global" mode="delete"/>
    <memberOf key="att.global.linking" mode="add"/>
    <memberOf key="att.global.rendition" mode="delete"/>
  </classes>
  <constraintSpec ident="anchor-distinct-anas" scheme="isoschematron">
    <constraint>
      <!-- No sch:rule wrapper: the enclosing elementSpec supplies the context (anchor) -->
      <sch:report test="@ana = preceding-sibling::tei:anchor[1]/@ana">
        The @ana must NOT be the same on two subsequent anchors!
      </sch:report>
    </constraint>
  </constraintSpec>
  <attList>
    <attDef ident="ana" mode="replace" usage="req">
      <datatype><rng:text/></datatype>
      <valList type="closed">
        <valItem ident="start"/>
        <valItem ident="end"/>
        <valItem ident="intStart"/>
        <valItem ident="intEnd"/>
      </valList>
    </attDef>
  </attList>
</elementSpec>

zme1 commented 6 years ago

@ebeshero Ah, a classic XSLT-esque coding bug... So, when writing Schematron rules, is it correct to have the rule nested where it is? That is, since I'm zeroing in on the <surname> element, is it best to do so in its parent element?

ebeshero commented 6 years ago

@zme1 Well, it's specifically in this contextualized environment of ODD that you need to be a little extra careful: the issue is that the elementSpec sets a context, and if you write schematron rules there, they fit inside that context. Another solution, though, is to pull the Schematron rules outside of the elementSpec elements (which is what I do most of the time). If you do that, the context you set could then be tei:persName[tei:surname], for example.

Let me point you to the gigantic ODD for the Amadis project that I wrote a year ago in which I have examples of both settings for Schematron rules in ODD. If you search the file for the comment tag: <!-- SCHEMATRON OUTSIDE ELEMENT SPECS--> you'll see some examples.

zme1 commented 6 years ago

@ebeshero I see the comment at the bottom of the file on inserting Schematron ... how convenient! I'll write up some rules according to these examples and let you know if issues persist. Thank you, as always.

zme1 commented 6 years ago

@ebeshero I am still unable to generate any rules. The one I'm trying to draft is straightforward: every <persName> element must have a <surname> child element. The code is below:

      <constraintSpec ident="persName_surname" scheme="isoschematron">
        <constraint>
          <sch:rule context="tei:persName">
            <sch:assert test="child::tei:surname">Every persName element must have a child surname element!</sch:assert>
          </sch:rule>
        </constraint>
      </constraintSpec>

The machine still is not picking up Schematron rules, and writing this outside the <elementSpec> element does not seem to have helped. I declared the Schematron namespace, declared TEI elements, and I'm still stumped...

ebeshero commented 6 years ago

@zme1 Ohhh! I think this is a different problem: If you do not see any schematron rules at all in your output RNG file, you need to make sure when you associate your schema that it is setting that extra line to deal with Schematron rules...Take a look at that, and I'll look here at the code you've pushed (I hadn't tested it yet).

zme1 commented 6 years ago

@ebeshero Ah! That may very well be it; there is no Schematron namespace declaration in my TEI file, if that's what you mean..

ebeshero commented 6 years ago

@zme1 Aha! You are indeed lacking the requisite namespace lines in your ODD file. If you open a new ODD in <oXygen/>, check out the purple schema lines at the top and just paste those into your file...

zme1 commented 6 years ago

Mystery solved!!!!!!!!!

ebeshero commented 6 years ago

@zme1 Did you get it to work? I'm tinkering with your code here, and haven't seen the Schematron constraints yet--just looking at the output HTML documentation: When this works, you should see a list of "Constraints" after the alphabetical listing of element Specs...and the Constraints should be all the Schematron rules you've entered.

zme1 commented 6 years ago

I spoke too soon. I excitedly thought I'd easily see the problem now, but it doesn't seem like I am effectively working out the problem... Ugh.

ebeshero commented 6 years ago

We'll figure it out...this is tricky work. It helps that I've got some running examples to check. Here's something else regarding context: I see that when I set a Schematron rule inside an <elementSpec>, I do not actually set an <sch:rule> element, because the context is already established. If memory serves, I think this was crucial for my getting things to work.
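
A minimal sketch of what that might look like for the persName rule, assuming (as described above) that the context is taken from the enclosing <elementSpec>. The ident values are reused from earlier in this thread; the namespace declarations are added here only so the fragment stands on its own, and in a real ODD they would normally sit on the root element:

```xml
<elementSpec xmlns="http://www.tei-c.org/ns/1.0"
             xmlns:sch="http://purl.oclc.org/dsdl/schematron"
             ident="persName" mode="change" module="namesdates">
  <constraintSpec ident="persName_surname" scheme="isoschematron">
    <constraint>
      <!-- No sch:rule here: the context (persName) is assumed to come from the elementSpec -->
      <sch:assert test="tei:surname">Every persName element must have a child surname element</sch:assert>
    </constraint>
  </constraintSpec>
</elementSpec>
```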

ebeshero commented 6 years ago

@zme1 All is well! I needed to follow my own advice and stop looking at outdated outputs from ODD. The TEI has updated its formatting for ODD-generated HTML documentation, and your Schematron rule processes and validates your XML just fine.

The key here is to open up your XML file to be validated, and re-associate its schema line, being sure to mark the checkbox next to the option for embedded Schematron rules. I'm going to push the successful code to the branch. There are some more tricks with embedding Schematron rules in elementSpecs to discuss...
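
For reference, a schema association that includes embedded Schematron usually appears as two xml-model processing instructions at the top of the XML file: one for the Relax NG grammar and one for the Schematron rules embedded in that same file. A sketch, assuming the schema generated from the ODD is saved as toscana.rng next to the XML file (a hypothetical filename and location):

```xml
<?xml version="1.0" encoding="UTF-8"?>
<!-- Validates against the Relax NG grammar generated from the ODD -->
<?xml-model href="toscana.rng" type="application/xml" schematypens="http://relaxng.org/ns/structure/1.0"?>
<!-- Validates against the Schematron rules embedded in that same RNG file -->
<?xml-model href="toscana.rng" type="application/xml" schematypens="http://purl.oclc.org/dsdl/schematron"?>
<TEI xmlns="http://www.tei-c.org/ns/1.0">
  <!-- teiHeader and text go here -->
</TEI>
```

If only the first line is present, the file validates against the grammar alone and the Schematron assertions never fire.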

zme1 commented 6 years ago

@ebeshero Hopefully I'll be able to spend some time with them today, then! This whole Schematron episode, in addition to drafting of the wiki, did set me back in regard to personography. I may be able to spend a bit of time developing it today, but I'm not certain I'll have a product by Tuesday's meeting. I'll keep you updated, though.

ebeshero commented 6 years ago

@zme1 No worries--even if you just have a start on your personography, that should be enough for our Tuesday meeting. Meanwhile here's a pull request to look at: https://github.com/zme1/toscana/pull/9/commits/496932e16aea9da6ce09bfe997b090d67ba2af42 This ODD works to generate a schema with good Schematron rules. My problem with examining the output HTML is that I assumed all the Schematron rules (wherever defined) came out at the end of the document the way they used to a year ago. TEI has updated its Stylesheets that output ODD documentation so that project-defined schematron rules for specific elements come out within the element specs (I didn't look in the right place). When I associate this schema, I do need to change the current schema association lines on your XML file so that they also validate with the embedded Schematron rules.

Unfortunately, I think my pull request may not easily be merged since you've been making other kinds of changes to your ODD. Feel free to just read and apply my code and decline my pull request if that's easiest! Or you can try following GitHub's instructions for resolving the merge by hand--a good learning experience in its own right.

zme1 commented 6 years ago

@ebeshero Schematron is up and running! The merge conflicts were not anything too hairy. It's always good to keep those faculties operational...

ebeshero commented 6 years ago

@zme1 Huzzah! Glad you've got this working!

zme1 commented 6 years ago

@ebeshero Alas, we only half solved the problem. The Schematron runs fine from within the <elementSpec> element, but when it is relocated as a sibling element, even if I follow the instructions you gave, the rule does not enforce... I just pushed the ODD reflecting that. If it were to police correctly, it would flag one of the <surname> elements that is actually a descendant of the <persName> (which I corrected in the Schematron rule that I commented out in the <elementSpec> that contains <persName> ).

ebeshero commented 6 years ago

@zme1 Interesting...It sounds like you had to specify the child:: axis (which is fine, but not what we're accustomed to doing in Schematron or XSLT)... As for the constraintSpec rule not working...I don't have a ready explanation, so let's take a closer look...

zme1 commented 6 years ago

@ebeshero The rule actually applied fine; the issue was that in a few instances the <surname> element is not directly contained by the <persName> element. In this case, I just needed to say that there is to be a <surname> somewhere, whether it be a child or deeper down in the hierarchy. I am only persisting in trying to use the sibling-element format for Schematron rules because it seems like I can more easily isolate specific subsets of elements than I could when working within the <elementSpec>. E.g., I want to write Schematron that expects a <persName> element with a @role="proposer" and another with a @role="supporter" for every <seg> of @type="proposal", and also do the same with all other interactions we label so that group members know exactly which attributes align with which across axes...
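
A rough sketch of how the proposal rule might be written in the sibling-element format, using the attribute values named above. The ident value and the message wording are hypothetical, and the descendant:: axis is an assumption that the <persName> elements may sit at any depth inside the <seg>:

```xml
<constraintSpec xmlns="http://www.tei-c.org/ns/1.0"
                xmlns:sch="http://purl.oclc.org/dsdl/schematron"
                ident="proposal_roles" scheme="isoschematron">
  <constraint>
    <!-- Outside an elementSpec, the rule has to state its own context -->
    <sch:rule context="tei:seg[@type='proposal']">
      <sch:assert test="descendant::tei:persName[@role='proposer']">A proposal seg must contain a persName with role="proposer"</sch:assert>
      <sch:assert test="descendant::tei:persName[@role='supporter']">A proposal seg must contain a persName with role="supporter"</sch:assert>
    </sch:rule>
  </constraint>
</constraintSpec>
```

The predicate in the context is what isolates the subset of <seg> elements, which is harder to express from inside a single <elementSpec>.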

ebeshero commented 6 years ago

@zme1 It's a good idea, and as you've seen, I do that myself. (I've got some rules written for any element that holds a particular attribute, so I need an XPath-based Schematron rule context such as *[@ana].) I'm running a little test on the Amadis ODD to see whether, when I run it in the current version of oXygen (with the current TEI stylesheets loaded), it renders the same Schematron "Constraints" section that it used to for my non-elementSpec Schematron rules...

ebeshero commented 6 years ago

@zme1 ...and the answer is, yes, yes it does render all the Schematron rules as I'd expect, so I think there's nothing wrong with the transformations we're running...

ebeshero commented 6 years ago

@zme1 Let's see...the one thing I have in my non-elementSpec Schematron rules is an <sch:pattern> element inside the <constraint> element. I wonder if that's necessary to get the rule to be properly configured and embedded in the output RNG XML syntax...Let's give it a try...

ebeshero commented 6 years ago

@zme1 ...Nope, that's not it, and I only use that pattern syntax on some of my rules, not all...

ebeshero commented 6 years ago

@zme1 AHA! Look at where your <schemaSpec> element closes...It should be wrapping the entire set of rules, and close just before the <body> closes. Why don't you try just changing that on your end, run it, and let me know if that solves it.

ebeshero commented 6 years ago

@zme1 So, the basic structure here is:

<schemaSpec>
     <moduleRef>....</moduleRef>
     ....
     <elementSpec>....</elementSpec>
     <constraintSpec>....</constraintSpec>
</schemaSpec>

ebeshero commented 6 years ago

@zme1 Confirming that this solves the problem on my end--at least I now finally see the rule listed in a "Constraints" section in your HTML documentation. I'd rather you fixed this on your master branch b/c I made a bunch of bumbling alterations on your file in testing this.
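
Pulling the pieces of this thread together, a fuller skeleton of the ODD might look like the sketch below. The schemaSpec ident and the list of modules are assumptions for illustration, and the header is left empty for brevity (a real ODD needs a complete teiHeader). The points taken from the discussion are the namespace declarations on the root element and the <constraintSpec> sitting inside <schemaSpec> as a sibling of <elementSpec>:

```xml
<TEI xmlns="http://www.tei-c.org/ns/1.0"
     xmlns:sch="http://purl.oclc.org/dsdl/schematron"
     xmlns:rng="http://relaxng.org/ns/structure/1.0">
  <teiHeader>
    <!-- fileDesc and the rest of the header omitted in this sketch -->
  </teiHeader>
  <text>
    <body>
      <!-- ident is a hypothetical name for the generated schema -->
      <schemaSpec ident="toscana" start="TEI">
        <moduleRef key="tei"/>
        <moduleRef key="core"/>
        <moduleRef key="header"/>
        <moduleRef key="textstructure"/>
        <moduleRef key="namesdates"/>
        <elementSpec ident="persName" mode="change" module="namesdates">
          <!-- element-level customizations go here -->
        </elementSpec>
        <!-- Sibling of elementSpec, but still INSIDE schemaSpec -->
        <constraintSpec ident="persName_surname" scheme="isoschematron">
          <constraint>
            <sch:rule context="tei:persName">
              <sch:assert test="descendant::tei:surname">Every persName element must contain a surname element</sch:assert>
            </sch:rule>
          </constraint>
        </constraintSpec>
      </schemaSpec>
      <!-- schemaSpec closes just before body closes -->
    </body>
  </text>
</TEI>
```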

zme1 commented 6 years ago

@ebeshero I just got back from dinner; stay tuned....

zme1 commented 6 years ago

Huzzah!!!!!!!

zme1 commented 6 years ago

What a journey that was... @ebeshero

ebeshero commented 6 years ago

@zme1 Now you know, from personal experience, why ODDs are so bloody difficult to debug!

zme1 commented 6 years ago

@ebeshero It's now Schematron city! I'll push what I have so far and keep chugging away. Schematron is exactly what we need for novice coders helping with the mark-up...

zme1 commented 6 years ago

I forgot to switch back to the master branch from yours... oops..

zme1 commented 6 years ago

Ok, the code is up! Thanks a million, @ebeshero.