Closed TEITechnicalCouncil closed 9 years ago
This issue was originally assigned to SF user: louburnard Current user is: lb42
Logged In: YES user_id=686243
The value= attribute of <interp> and <span> is one of the thornier ones to deal with in our quest to rid TEI of "content textual attributes". I see two possible ways of thinking about the value of value=. First, we might think of it as being used to provide, directly, the interpretation of the indicated element(s). Second, it might be thought of as a key which is used to look up the actual interpretation, possibly in the human reader's mind.
If the first is the case, it obviously makes sense to make the value= attribute a child of <interp> or <span> instead. Possibilities for the child include:
* Zero or more characters, without markup. Does not satisfy Andreas's request for allowing markup inside the value.
* Paragraph-type content (characters plus phrase level elements, i.e. the P5 equivalent of %paraContent;).
* A new element, e.g. <value>, created for this purpose. Besides very clear semantics, this has the advantage that if <value> is allowed to repeat, the elements <interpGrp> and <spanGrp> could be dropped, in favor of permitting <interp> and <span> to contain themselves. The content of <value> would be permitted to contain phrase-level markup, giving it more expressive power than an attribute, and satisfying Andreas's request.
* A <p> (or perhaps <seg> or <ab>), or perhaps one or more of them. This has the advantage of using an already existing element.
In all of these cases the intent is that the interpretation is written out longhand, as it were. So one can imagine, rather than just "foreshadowing", an interpretation with more detail: "foreshadowing <name key="LS">Luke</name>'s return to <place key="YP">Dagobah</place> in <rs key="SWE6RotJ">episode 6</rs>."
One can easily imagine, of course, that folks would use this for other things, e.g. for commentary on the interpretation itself, or for explanations of why other interpretations were dismissed, etc. It is not at all clear to me whether such use would be a good thing or a bad thing, an argument in favor or against this permissiveness.
If the second possibility (that the value of value= should be thought of as a key used to look up the interpretation) is the case, then the value of value= (e.g., "aftermath") should not be thought of as a complete interpretation itself, but rather as a key with which the user can determine the interpretation, e.g. via a computer table look-up or by knowing what the word means in a natural language (sort of a table look-up in the brain, as it were). The actual interpretation could be spelled out somewhere ("the section of the narrative following the climax which describes the negative results of the protagonists' actions" -- OK, I'm not a literature scholar, you get the idea) or left open to interpretation. In the former case, the value itself is just a string; it may be purposefully designed to resemble a word in some natural language, but as far as computer systems are concerned could just as well be a random sequence. In the latter case some mnemonic string is used to jog a reader's memory, as it were. In neither of these cases does using internal markup seem to be necessary or even make sense.
On the other hand, if value= is just a key into a (computer or mental) table lookup, why not call it key=? For that matter, why not permit it to actually point directly to the full interpretation (by being a URI)? In which case the full interpretation could contain any kind of markup you like, i.e. might even be in a markup language other than TEI.
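A minimal sketch of the two readings side by side may help. It is an illustration only: the <value> child is the hypothetical new element floated in the list above, key= is the proposed renaming rather than an existing attribute of <interp>, and the ids and the example.org URI are made up.

```xml
<!-- Reading 1: the interpretation itself is supplied directly,
     here inside a hypothetical <value> child that permits phrase-level markup. -->
<interp id="i1" resp="SB">
  <value>foreshadowing <name key="LS">Luke</name>'s return to
    <place key="YP">Dagobah</place></value>
</interp>

<!-- Reading 2: the attribute is only a key; the full interpretation is
     looked up elsewhere (in a table, or in the reader's head). -->
<interp id="i2" resp="SB" key="aftermath"/>

<!-- Reading 2, URI variant: the key points straight at the full
     interpretation, which need not even be TEI. -->
<interp id="i3" resp="SB" key="http://example.org/interpretations#aftermath"/>
```

Under the first reading internal markup is the whole point; under the second it has nowhere to go, which is why the two readings pull the content model in different directions.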
Original comment by: @sydb
Logged In: YES user_id=1021146
I suggest that the content of <interp> should be the macro.glossSeq, or some subset thereof.
Also that <interp> probably wants to join <note> in my proposed tei.pervasive class
Original comment by: @lb42
Logged In: YES user_id=950793
Personally, I am in favour of Syd Bauman's first, direct way of providing a value. The second, indirect approach would create another indirection: say, from an "ana" reference to an <interp> element and from the latter's "value" to a look-up table.
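The extra hop can be sketched as follows. This is a fragment under stated assumptions, not a proposal: the ids, the code "FS", and the use of a gloss list as the look-up table are all invented for illustration, and ana= is written P4-style as a bare id reference.

```xml
<!-- Direct approach: passage -> <interp>, and the interpretation is right there. -->
<p ana="fs1">(passage being interpreted)</p>
<interp id="fs1" resp="SB">foreshadowing</interp>

<!-- Indirect approach: passage -> <interp> -> look-up table.
     The value is only a code; its meaning is spelled out somewhere else. -->
<p ana="fs2">(another passage)</p>
<interp id="fs2" resp="SB" value="FS"/>
<list type="gloss">
  <label>FS</label>
  <item>foreshadowing of a later event in the narrative</item>
</list>
```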
Original comment by: @nolda
Original comment by: @sydb
Logged In: YES user_id=612078
Syd suggests there is a dichotomy here in the meaning of value between its use as a key and its use for containing the interpretation as a whole.
I'd like to suggest that these two aren't necessarily mutually incompatible, and that what we should do is cater for both of them. Retain the 'value' attribute, perhaps renaming it 'key', and have the content of <interp> contain the tei.pervasive note-like interpretation. To use Syd's example:
<interp id="LukeDago1" resp="SB" key="foreshadow"> This foreshadows <name key="LS">Luke</name>'s return to <place key="YP">Dagobah</place> in <rs key="SWE6RotJ">episode 6</rs>. </interp>
This allows markup in the interpretation, and a key by which all instances of foreshadowing (or whatever) can easily be pulled out. I certainly wouldn't want to do away with a key-like attribute. (However, it could be argued that @n/@type could be misappropriated for this.) The example in P4 15.3 uses @id/@ana for this because, I believe, it wants to highlight that it "does not itself indicate which passage of text is being interpreted; the same interpretive structures can thus be associated with many passages of the text". This does not mean that someone should not use it to provide more specific interpretations rather than a general "there is some foreshadowing here".
The element does seem to be part of those which are "descriptive or identifying elements which characterize and object[ify]", and so unless I'm misunderstanding it, Lou's suggestion of macro.glossSeq makes sense.
Of course, if @value was renamed @key, then that might create some possible confusion with the way @key is used elsewhere, esp. in tei.entries.
This @key-like attribute could be provided on <interpGrp>, but then would this necessitate that the <interp>s all be grouped together, rather than allowing for the possibility of dynamically grouping interpretations together in some way?
My two pence,
-James
Original comment by: @jamescummings
Logged In: YES user_id=929066
I am in general agreement that <interp> should not be empty and should contain markup. This would allow the interp element to be used for extended interpretations, rather than just simple ones like @value="introduction", @value="conflict", and @value="climax".
If the purpose of making <interp> non-empty is to allow for more detailed, extended interpretations, then the content should allow multiple paragraphs. An interpretation, even of a short span of text, could easily require more than one paragraph.
With <interp> non-empty, we can lose @value altogether. The content of <interp> will become the "value" of the interpretation, and so there is no longer a need for @value.
James proposes a possible @key attribute, "by which all instances of foreshadowing (or whatever) can easily be pulled out," but also suggests that "@n/@type could be misappropriated for this." I think @n and @type can be useful here without being misappropriated. Interpretations don't seem to me to be uniquely identifiable things, like people, places, etc., for which @key is more commonly used (<name>, <persName>, <placeName>, etc.). And neither is the sort-key use of @key in dictionary elements relevant here. The @value examples used in the P4 Guidelines ("introduction," "conflict," "climax," "revenge," etc.) seem to be generic names or types of interpretations, so @n and @type seem very suitable.
Original comment by: @johnwalsh
Logged In: YES user_id=686243
[Note: I am presuming that everything we say here about the content of <interp> applies equally well to <span>. Correct me (quickly) if I'm wrong or you disagree.]
So, it seems that Lou Burnard, James Cummings, and John Walsh all agree with Andreas Nolda (the OP) that <interp> should have content. (Although some want to keep a "short name of the interpretation" attribute as well.) But the question (which I asked the Council to consider this week) is what should the content model of <interp> be?
Lou, can you explain why you think one or more of the members of macro.glossSeq (altIdent?, equiv*, gloss?, desc?) would be appropriate? Given the desire that the content of <interp> be a direct description, I'm not sure that either <altIdent> or <equiv> make sense whatsoever. <gloss> and <desc> make some sense, although <gloss> is a bit of a stretch, and has only phrase-level content. That leaves <desc>, which has the mild disadvantage that we'd need to realign its semantics a bit. And, as has been pointed out here, it is reasonable to believe that more than one paragraph level thing would be needed, so if we were to use <desc> instead of <p>, it (the content model of <interp>) should probably be ( desc+ ), no?
Also, with respect to where it goes, I don't know the details of your proposed tei.pervasive class, but I'm guessing it would be where things that can go in lots of places, but not just anywhere (like tei.Incl) would go. In which case, why would <interp> need to be in it? It doesn't need to be permissible in lots of places. Since the ana= attribute of elements in the text point to <interp>, it could be anywhere (including in a different file); since the from=, to=, (and interp=) attributes of <span> point to an element or elements in the text, it could be anywhere. Seems to me the Guidelines should simply say where they're supposed to be, period.
James suggests a key= for <interp>. John suggests that n= and type= are suitable for this purpose. I like the idea of retaining an attribute value for this purpose. I think key= is probably not quite right, and that n= is simply a bad idea -- this is not a label. But type= makes a lot of sense. My only concern is that if we appropriate type= to be used for a general classification of an interpretation (where the content is used for a detailed description of it), are we robbing people of the ability to use type= of <interp> for some other useful purpose? I.e., is there reason to believe that users will want both type= and that-which-used-to-be-value= as separate attributes? I'm inclined to think not, but there may be counter examples ...
Original comment by: @sydb
Logged In: YES user_id=1124399
I still need more time to think about some issues, but thought that it might be better to break the silence and share what I have thought through so far. I will not take your time repeating already expressed positions that I share, but will simply try to recap my opinion:
- <interp> should not be an empty element and should have content.
- I would be more comfortable with key= being used in cases already mentioned by John W., not in the <interp> content model (type= or value= are much more appropriate for that).
- value= vs. type=, and whether to allow both of them in the content model? I couldn't think of any examples where one would need both attributes, and suggest keeping type= for this purpose (if it is a question of choosing one).
- id= and resp= remain in the content model.
- I didn't completely understand Lou's suggestion why the <interp> content model should be macro.glossSeq, though it sounds quite interesting. According to the Guidelines, [macro.glossSeq] "defines a sequence of descriptive or identifying elements which characterize and object", while <interp> is described to be one of the "simplest mechanisms for attaching analytic notes [...] to particular passages of text and associating simple analyses and interpretations with text elements". Close match, ah? On the other hand, since the element <interp> is a description in its own right, I would consider ( p+ ) more appropriate for its content, instead of ( desc+ ), where <desc> "contains a brief description of the purpose and application for an element, attribute, or attribute value". Isn't it too close of a match?
Original comment by: natashasmith
Logged In: YES user_id=1021146
My suggestion of macro.glossSeq as content for <interp> was just based on the observation that <interp> is a metadata kind of an element, which one might want to gloss, equivalence, or describe in the same way as (e.g.) a TEI class or an element.
On the question of using TYPE or VALUE or KEY, I think consistency is desirable. TYPE is usually used in the TEI to indicate a broad classification of some sort (as for example on divisions or lists); VALUE or KEY to indicate some unique or semi-unique value. So I would expect to see things like <interp type="narrative" value="resolution"/> or <interp type="morphosyntactic-class" value="nounSingular"/>
I am agnostic about James's question as to whether we should go on supporting use of value (or key) to supply a code for the intended contents. To slightly muddy the water, I have just remembered that P4 has examples of hierarchically nested <interpGrp>s -- for example,
<interpGrp type="morphosyntactic-class" value="nominal">
  <interpGrp type="number">
    <interp value="singular"/>
    <interp value="plural"/>
  </interpGrp>
  <interpGrp type="properNess">
    <interp value="common"/>
    <interp value="proper"/>
  </interpGrp>
</interpGrp>
Here the advantages of using the attribute value/single token approach seem evident.
Original comment by: @lb42
Logged In: YES user_id=686243
I've just re-read this artifact, and am thinking that the following probably makes the most sense:
element interp { tei.global.attributes, attribute value { datatype.Code }?, p+ }
That is, an optional value= attribute and content of zero or more <p>s. It should be a syntactic error to specify neither value= nor any <p> content, but ODD can't enforce this. It should be syntactically valid to specify both, but it would be a semantic error if value= were, say, "conclusion", but the <p> content discussed something entirely different.
The "datatype.Code" means that the value= attribute is intended to be restricted by the user to a set of discrete codes, but TEI does not dictate what those codes should be.
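For what it is worth, plain RELAX NG can state the "at least one of value= or <p>" condition as a choice between two patterns. The sketch below uses the RELAX NG XML syntax and is an illustration only: the global attributes are omitted, <p> is reduced to plain text, and the built-in token datatype stands in for datatype.Code. Whether the ODD tooling could carry such a pattern through is not something this sketch addresses.

```xml
<grammar xmlns="http://relaxng.org/ns/structure/1.0">
  <start>
    <ref name="interp"/>
  </start>

  <define name="interp">
    <element name="interp">
      <choice>
        <!-- Branch 1: value= is present, so any number of <p> (including none) is fine. -->
        <group>
          <attribute name="value">
            <data type="token"/>
          </attribute>
          <zeroOrMore>
            <ref name="p"/>
          </zeroOrMore>
        </group>
        <!-- Branch 2: no value=, so at least one <p> is required. -->
        <oneOrMore>
          <ref name="p"/>
        </oneOrMore>
      </choice>
    </element>
  </define>

  <!-- Stand-in for the real <p>; simplified to text for this sketch. -->
  <define name="p">
    <element name="p">
      <text/>
    </element>
  </define>
</grammar>
```

With this pattern an empty <interp/> that has no value= matches neither branch and is rejected, while <interp value="conclusion"/> and <interp><p>...</p></interp> are both accepted. The semantic mismatch case (a value= that disagrees with the prose) remains out of reach of any schema.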
Original comment by: @sydb
Logged In: YES user_id=222320
I completely agree with Syd's suggestion. p+ seems to be the most natural content model to me, and the attributes also make sense as discussed. If somebody needs "key" for specific purposes, that c/should be added as a local extension.
Christian
Original comment by: @cwittern
Logged In: YES user_id=1021146
Sorry, but I disagree with Syd's proposed content model, which would require at least one para within an <interp>. Even if amended to p* (which is probably what is intended), I don't think it's appropriate for a structured element like interp to contain just prose (which can, of course, have all sorts of other things within it -- such as <interp>s!). It should be treated in the same way as all other formerly empty elements to which we are now giving content: we should be trying to identify elements with semantics as close as possible to those of the original attributes. The old <interp> element did not have a "write a short essay here" attribute, but specifically proposed attributes for responsibility, type of interpretation, and "value" of interpretation. Andreas' desire to include markup in the "value" attribute is reasonable enough; extending the meaning of this element to make it support general purpose descriptive prose, i.e. to function as a <note>, seems to me to be a diversion from what this element is designed for: i.e. the definition of fairly precise interpretive categories that can be linked to from a text and/or grouped together hierarchically.
My proposal is that the content model should be (desc, equiv*)
This gives us consistency with several other things (e.g. value lists, element specs) which combine a short prose description with zero or more pointers to equivalent components in other ontologies.
As I said before, it's possible that we might want other things from the macro.glossSeq, but that's the minimum I see as necessary.
Original comment by: @lb42
Logged In: YES user_id=686243
Lou is correct, I meant "p*", not "p+", as my description confirms. In any case, I agree with Lou that it would be better to have something somewhat more specific than a bag of prose. <desc> is a good candidate. The problem with <desc> is that it is part of the tagset for tagset documentation & extension, not the tagset for simple analysis. Do we want to put <desc> in the core? I think I'd like <equiv> if anyone could give me a real explanation of what it is. :-) In haste
Original comment by: @sydb
Logged In: NO
<desc> is in the core, so no problem using that.
Original comment by: nobody
Logged In: YES user_id=686243
Good point, thank you, whoever that was; I stand corrected, <desc> is in the core. We'd have to tweak its semantics a bit, but I'm in favor. I like Lou's desc, equiv better than my p even though it doesn't really solve Lou's complaint that it's "just prose", as <desc> can contain the same set of stuff as <p>. But at least there's only one of 'em, and the semantics are (or at least will be :-) more precise.
Original comment by: @sydb
Logged In: YES user_id=1148190
Lou's point that <interp> has to be clearly more than just an opportunity for annotation is a good one. <interp> should be a way of applying some sort of interpretive scheme, i.e. an analysis in which the group of <interp> elements in themselves might tell you something about the nature of the analysis. It should be structured and should not suggest that this is a place for a prose essay. <desc> and <equiv> seem different from what Andreas originally had in mind, namely a set of values (not semantically different from the value attribute). I think we still need a <value> element to carry the actual interpretive item (that is, the bit of the overall interpretive scheme which is being applied at a given point in the text). This would give us:
<interp>
  <value>The value of the interpretation (would have been value= before)</value>
  <desc>An optional, more verbose explanation of what this value means or how it fits into the interpretive scheme</desc>
  <equiv>Optionally, the equivalent of this interpretive chunk in some other scheme(s)</equiv>
</interp>
Original comment by: @juliaflanders
Logged In: YES user_id=686243
Julia's post and the opportunity to see some of OP's actual use cases have convinced me that we are slipping down a slope we should have avoided in the first place. We are not trying to make the tagset for simple analysis complicated; we should be trying to keep it simple, but to remove the restriction that the "value" of the interpretation be expressed in an attribute value, in case, e.g., it contains non-Unicode characters (not likely) or is expressed in a foreign language or contains a person's name or some such. I am now thinking the content of <interp> (and <span>) should be macro.paraContent. Simple, straightforward; does the job; easy to implement, easy to understand. Projects that know in advance they are not going to use encoding in the value of <interp> could change its content from "macro.paraContent" to "text", or better yet to a closed value list, to get the benefits of tighter validation. This does, however, permit lots of pointless elements inside <interp>. It also permits <interp>s and <span>s (and <interpGrp>s and <spanGrp>s) inside <interp>, which some might consider a good thing. Perhaps the content should really be what is called for: ( text | g | foreign )*
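To make the trade-off concrete, here is a rough sketch of what the narrow model ( text | g | foreign )* would and would not admit. The ids and wording are invented for illustration, and the second instance is a shortened form of James's example from earlier in the thread.

```xml
<!-- Fits ( text | g | foreign )*: plain text plus a <foreign> phrase. -->
<interp id="i1" resp="SB">resolution (<foreign>dénouement</foreign>)</interp>

<!-- Does not fit the narrow model: <name> is not one of text, g or foreign.
     A phrase-level content model such as macro.paraContent is what lets it in. -->
<interp id="LukeDago1" resp="SB">This foreshadows
  <name key="LS">Luke</name>'s return.</interp>
```

So the narrow model covers the foreign-language and non-Unicode cases, but a person's name marked up inside the interpretation needs the broader content.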
Original comment by: @sydb
Logged In: YES user_id=1021146
I have changed the content to macro.paraContent for the moment, as being the least worst of the various alternatives proposed below, and in step with EDW90. I've removed the value attribute too.
Next stop should be to do much the same to span, but maybe that needs more thought...
Original comment by: @lb42
Please make <interp> non-empty, thereby allowing for specifying 'values' containing additional markup.
Original comment by: @nolda