Closed sydb closed 8 years ago
Syd, you're an angel. I'm going to give you some feedback from the ISO/implementation point of view once I move to this area in my project, which is going to be very soon. Thanks for scouting the way and for the fixes above!
Thanks for spotting this. I think there once was (briefly) a <fVal> element, with children vColl, vAlt etc., but when it was removed on the grounds of redundancy, clearly this schematron rule wasn't looked at closely enough. The @fVal attribute, as you note, has nothing to do with it: it is used to specify an additional value to be unified with that specified by the content, whatever the content may be. The confusion arose when it was decided to permit a feature containing just a text string with no indication of its type. I believe that the original intention was to say that <f> may contain: any one of the typed value elements, or a combination of them wrapped in e.g. <vColl>, or a non-empty string of characters. However, it is also possible (as 18.9 demonstrates) for a <f> element to be empty, so in fact these schematron rules are just plain wrong. Piotr may wish to correct me!
Fixed the constraintSpec 2016-03-10 in commit #0a529e0a373bd9e23187fefe223c15382a5fe1ca.
Have not dealt with the issue of whether or not an empty <f> has to have an @fVal.
@sydb should add a schematron rule mandating that empty <f> have an @fVal.
Council mtg: SB to create Schematron rule to enforce “empty <f> has @fVal”
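For readers less familiar with Schematron, the rule Council asked for here can be sketched in Python. This is only an illustration of the check “empty <f> has @fVal”, not the rule that was committed; the helper names and the sample document are hypothetical, and “empty” is assumed to mean no child elements and no non-whitespace text.

```python
import xml.etree.ElementTree as ET

TEI_NS = "http://www.tei-c.org/ns/1.0"
XML_WS = " \t\r\n"  # the characters XPath normalize-space() treats as whitespace


def is_empty(elem):
    """True if elem has no child elements and no non-whitespace text."""
    has_children = len(elem) > 0
    has_text = bool((elem.text or "").strip(XML_WS))
    return not has_children and not has_text


def empty_f_without_fval(root):
    """Return the @name of every empty <f> that lacks an @fVal attribute."""
    return [
        f.get("name")
        for f in root.iter(f"{{{TEI_NS}}}f")
        if is_empty(f) and f.get("fVal") is None
    ]


# Hypothetical sample: one <f> with content, one with @fVal, one with neither.
sample = ET.fromstring(
    f'<fs xmlns="{TEI_NS}">'
    '<f name="ok-content"><symbol value="noun"/></f>'
    '<f name="ok-ref" fVal="#v1"/>'
    '<f name="bad-empty">  </f>'
    "</fs>"
)
print(empty_f_without_fval(sample))  # ['bad-empty']
```

Note that the whitespace-only <f> counts as empty here, which is the behaviour the original report asks for when it recommends testing normalized text rather than any text node.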
Rule added in 195c97b0c3299d0b08d689f0e93813a8a945cbf5. I also needed to alter some prose from FS to match.
And then I looked at the Schematron rules that were already in place for <f> and asked myself (<f> can be textual, and if it’s textual may have characters outside of Unicode, so multiple <g> elements should be allowed.) Why not just create a content model in PureODD that enforces them correctly? So I did:
<alternate minOccurs="1" maxOccurs="1">
<macroRef key="macro.xtext"/>
<classRef key="model.featureVal"/>
</alternate>
Note that I did not add the rule for “not both content and @fVal”, as Council did not address that issue at face-to-face. Given that the Guidelines state that it is permissible to have both (and that the value referenced by @fVal is to be unified with that contained as content of <f>), I am presuming we do not want such a rule.
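As a rough illustration of what the content model above is meant to enforce (either text, or a single feature-value element), here is a small Python sketch. It is a simplification under stated assumptions: it ignores <g> inside text, the FEATURE_VAL set is an illustrative subset rather than the real membership of model.featureVal, and the function names are hypothetical.

```python
import xml.etree.ElementTree as ET

TEI_NS = "http://www.tei-c.org/ns/1.0"
XML_WS = " \t\r\n"

# Illustrative subset only; the real class membership is defined in the TEI ODD.
FEATURE_VAL = {"binary", "symbol", "numeric", "string", "fs", "vColl", "vAlt"}


def local_name(elem):
    """Strip the '{namespace}' prefix that ElementTree puts on tags."""
    return elem.tag.rsplit("}", 1)[-1]


def classify_f(f):
    """Classify the content of one <f>: 'empty', 'text', 'value', or 'invalid'."""
    children = list(f)
    # Text can sit before the first child (f.text) or after any child (tail).
    texts = [f.text or ""] + [c.tail or "" for c in children]
    has_text = any(t.strip(XML_WS) for t in texts)
    if not children:
        return "text" if has_text else "empty"
    if len(children) == 1 and not has_text and local_name(children[0]) in FEATURE_VAL:
        return "value"
    return "invalid"  # mixed text and elements, or more than one value element


def f_elem(inner):
    """Build a namespaced <f> element for the demo."""
    return ET.fromstring(f'<f xmlns="{TEI_NS}" name="x">{inner}</f>')


for inner in ["", "noun", '<symbol value="noun"/>',
              '<symbol value="a"/><symbol value="b"/>']:
    print(repr(inner), "->", classify_f(f_elem(inner)))
```

The sketch deliberately reports 'empty' as its own category instead of calling it valid or invalid, since that is exactly the point the thread is still debating.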
Looking at this again, and in particular at the passage which Syd has removed from the text, I am less confident that this is correct. This unification grammar is tricky stuff. I think the intention was (as originally stated) that an empty <f/> should be legal, and have a particular meaning. The reasoning is probably that the same feature may be used more than once in a structure and you don't want to have to specify its possible values every time it does. I know this is counterintuitive (and looks weird in an XML context) but if you go and read the original ISO spec, I think you'll find that this is what was intended. The subsequent (minority) decision to allow textual content muddies the waters somewhat, as Syd rightly points out.
Agree with Lou. I would avoid touching a model too hastily when it is based on a potentially elaborate underlying theory.
Just to make sure I’m clear on this …
The scenario in the released Guidelines is contradictory with respect to whether an <f> element that does not have an @fVal attribute may also be empty.
- The prose explains what an empty <f> that does not have an @fVal means, implying that it is allowed.
- The <remarks> in the tagdoc for <f> say an empty <f> must have an @fVal, stating it is not allowed.
In Providence (IIRC) we (the TEI Council) decided the tagdoc was correct: an empty and @fVal-less <f> makes no sense. Am I correct, @lb42 and @laurentromary, that you are suggesting that we figuratively flipped the coin the wrong way? (In which case the change should have been removing, or better still re-writing, the sentence “If the element is empty then a value must be supplied for the @fVal attribute.” in the tagdoc of <f>, yes?)
For the record, I’m OK with either solution. My only argument is with the blatant discrepancy.
The whole issue has to do with re-entrancy. Looking at http://www.tei-c.org/release/doc/tei-p5-doc/fr/html/FS.html#FSVAR more quietly, I see that the implementation does not imply an empty
Re-entrancy is part of the issue, but I think the crucial point is that the possible range of values for a feature is not specified in the XML schema for features, as would be the typical XML case. Instead it is specified in the feature system declaration. In that situation an empty <f> has the specific interpretation given in the text that Syd was proposing to delete. So if anything needs changing, it's not the Guidelines prose but the remark in the tagdoc, which misrepresents the intended behaviour.
Agree!
OK, so:
1. The f-has-content-or-fVal constraint (which enforces “either content or @fVal”) should be removed.
2. […]
3. The tagdoc sentence “If the element is empty then a value must be supplied for the @fVal attribute.” should be removed. (But what, if anything, should it say instead?)
4. The sentence ending “<fs> element may be empty, but the <f> element must have (or reference) some content.” should be deleted from prose section FSBI.
I also plan to change the @minOccurs of the content model to "0". It will make absolutely no difference as to which documents are valid and which are not (because no text matches a <textNode>, about which see #1459), but the meaning, that empty content is allowed, is clearer.
This all sound reasonable?
Pending some thoughts by those who know more than me (@lb42, @bansp, and @laurentromary jump to mind), I have not implemented my summary of 4 days ago yet. I have, however, implemented @lb42’s suggestion that <g> not be allowed inside <f>, as if this were a corrigible error, in commit 46d4b34, push 62839e1.
Briefly: 1: yes 2: yes 3: yes (it need say nothing) 4: no. please leave this sentence alone.
Ummm ... @lb42, I’m confused. You are suggesting that we want to
- […] @fVal-less <f>, thus implying it is OK,
- […] @fVal-less <f> is an error, thus implying it is OK,
- […] <f> must have an @fVal, thus implying it is OK,
This seems like the kind of contradiction we were trying to clean up in the first place.
The parenthetical phrase "or reference" is what does it for me.
But @fVal is how an empty <f> references content.[1] That is, “the <f> element must have (or reference) some content” means “the <f> element must have content (or an @fVal)”. So I’m still suggesting the sentence be removed from FSBI as per (4), above.
[1] Thus the definition of @fVal: “references any element which can be used to represent the value of a feature”.
I guess I am just too elliptical. The point is that an empty <f> can "reference" its intended values by means of declarations in the feature system. Perhaps recasting the sentence as "must specify its value either directly as content or by means of the @fVal attribute, or implicitly by reference to a feature system declaration" would help. Or just delete the sentence as you suggest.
Finally resolved (I hope) in a97af87.
The <constraintSpec> with an @ident of "fValConstraints" (which can be found in f.xml) is FUBAR. Here it is:
[listing not preserved]
Problems I see:
1. The context is "tei:fVal". The TEI does not have an <fVal> element. I don't think the @fVal attribute is what is intended; I think the <f> element is the intended context.
2. The @test (on the <sch:assert>) checks for the presence of PCDATA (“text content”) using text, i.e. a child <text>. Certainly this should be testing for text() instead. But note that just text() is insufficient, because that would consider as content a child text node that had nothing but whitespace. So what is needed is something like text()[normalize-space(.) ne ''].
3. The second <sch:rule> may never fire, depending on whether the code that extracts the Schematron from the ODD puts the two <sch:rule>s into a single <sch:pattern> (as Stylesheets/odds/extract-isosch.xsl does) or into two of them (as Stylesheets/odds/teiodds.xsl does, I think). This is because, within a pattern, only the 1st <sch:rule> whose @context matches the current construct is fired.[1]
4. <sch:assert test="not( X )"> seems odd. Why not <sch:report test=" X ">?
5. count(Y) &gt; 1 is better expressed as count(Y) gt 1. (The > operator, here written &gt;, tests sequences; the gt operator tests items. Yes, you’ll get the same answer comparing two sequences of 1 item each as comparing the items directly, but using an item comparator indicates to the reader that they are singular items, which might be helpful in debugging.[2])
So I propose instead we use
[listing not preserved]
Beyond all that, the <remarks> for <f> say “If the element is empty then a value must be supplied for the @fVal attribute.”
I’m not convinced this is true, because although the Guidelines say “but the <f> element must have (or reference) some content.”,[3] they also say “The value of an empty <f> element which also lacks a @fVal attribute is understood to be …”[4].
But if it is true (that an <f> must have or reference something) we should be testing for it. Something like the following should do. (And in either case, the Guidelines need to be made consistent.)
[listing not preserved]
Furthermore, if it is the case that an <f> should not have both content and a @fVal, then we should add the following.
[listing not preserved]
[1] See ISO 19757-3:2006, 3.20 “a rule-context is said to match an information item when that information item has not been matched by any lexically-previous rule context expressions in the same pattern and the information item is one of the information items that the query would specify”. Or better yet, just test it out yourself. For most of us that will be easier than reading the spec.
[2] Since in this case the items being counted are child elements, the other way to express this is just tei:*[2]. That is a lot shorter and somewhat sweeter, but I think the intent is less clear. Thoughts?
[3] In 18.2 Elementary Feature Structures and the Binary Feature Value
[4] In 18.9 Default Values
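The rule-selection behaviour described in footnote [1] (within one pattern, only the first rule whose context matches a node fires) can be simulated with a few lines of Python. This is a toy model of the semantics only, not of what either extraction stylesheet does; the rule names "content-check" and "count-check" and the function fired_rules are hypothetical.

```python
import xml.etree.ElementTree as ET


def fired_rules(patterns, root):
    """Simulate Schematron rule selection.

    patterns: a list of patterns; each pattern is an ordered list of
    (rule_name, context_predicate) pairs.
    Within one pattern, only the first rule whose context matches a node
    fires for that node. Returns a list of (node_tag, rule_name) pairs.
    """
    fired = []
    for pattern in patterns:
        for node in root.iter():
            for name, matches in pattern:
                if matches(node):
                    fired.append((node.tag, name))
                    break  # later rules in this pattern are skipped for this node
    return fired


def is_f(node):
    """Context predicate shared by both hypothetical rules."""
    return node.tag == "f"


rule_a = ("content-check", is_f)
rule_b = ("count-check", is_f)

doc = ET.fromstring('<fs><f name="x"/></fs>')

# Both rules in one pattern: the second never fires for <f>.
print(fired_rules([[rule_a, rule_b]], doc))
# [('f', 'content-check')]

# One rule per pattern: both fire.
print(fired_rules([[rule_a], [rule_b]], doc))
# [('f', 'content-check'), ('f', 'count-check')]
```

This is why the number of patterns the extraction step produces matters: two rules sharing a context only both run when each sits in its own pattern.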