TEIC / TEI

The Text Encoding Initiative Guidelines
https://www.tei-c.org

teidata.pointer equivalent to move/@where #1769

Closed joeytakeda closed 4 years ago

joeytakeda commented 6 years ago

The Map of Early Modern London is continuing to encode mayoral pageants, which took place across a number of places throughout London. These documents detail both the performances at particular sites as well as how the entire show moved from one place to another. We would like to be able to encode these movements using the <move> element and link these to our database of places, but currently move/@where is defined as teidata.word. Is there an equivalent to denote a particular place defined by an entity? For example:

<move where="locations.xml#place1"/>

Can there be a way to denote a place via @where that is a pointer (note that event/@where is a teidata.pointer and requires pointing to a <place> element)? I understand the utility of move/@where being teidata.word, so could there be a new attribute--maybe something like @wherePtr --that allowed a <move> to point to a defined place?

lb42 commented 6 years ago

Is it the case that your @where attributes will always take a URL value? If so, you could just redefine @where in your ODD with a teidata.pointer value. This would mean your texts are still unquestionably TEI conformant -- since teidata.pointer values are a subset of teidata.word values -- and it would be obvious to any user of your data what you were doing. Adding a new attribute would also work, but seems like more work/potential confusion.
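A minimal sketch of the redefinition suggested here, assuming a project ODD that already includes the drama module. It changes only the datatype of move/@where; the single-pointer cardinality (no maxOccurs) is an assumption made for illustration, not something stated in this thread.

```xml
<!-- Sketch only: project-level ODD fragment narrowing move/@where to a pointer. -->
<elementSpec xmlns="http://www.tei-c.org/ns/1.0"
             ident="move" module="drama" mode="change">
  <attList>
    <!-- mode="change" keeps the existing attribute and overrides only what is given here -->
    <attDef ident="where" mode="change">
      <datatype>
        <!-- Assumption: exactly one pointer per @where; add maxOccurs="unbounded" to allow several -->
        <dataRef key="teidata.pointer"/>
      </datatype>
    </attDef>
  </attList>
</elementSpec>
```

With a customization along these lines, a value such as where="locations.xml#place1" would be validated as a pointer rather than as an arbitrary word.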

joeytakeda commented 6 years ago

Defining @where as teidata.pointer does make sense, but there seems to be a precedent for having attribute pairs where one is either word/text/name and the other is explicitly a pointer: @ed and @edRef; @lemma and @lemmaRef; @scribe and @scribeRef; @script and @scriptRef.

I would like to request formally for an @whereRef alongside @where. Adding @whereRef isn't out of line with the rest of the Guidelines and it adds better functionality to the <move> element; for example, @whereRef could point to an encoded segment of a blocking diagram or to a <place> element.

martindholmes commented 6 years ago

@lb42 using teidata.word to store pointers is OK within the project itself, but it isn't great for interchange; it would be difficult for a downstream user to determine whether something is a word or a pointer, unless an obvious protocol prefix is there (which it wouldn't be in this case -- just a private URI scheme).

lb42 commented 6 years ago

@martindholmes I think you are misunderstanding my point: a "downstream user" (whatever that is) should look at the ODD for the project to determine how an attribute's value should be interpreted. My suggestion is to make explicit there that your @where values are pointers.

martindholmes commented 6 years ago

@lb42 You have more confidence in the willingness of future users of a file to investigate its background than me. :-) And the "downstream user" may not be human anyway.

lb42 commented 6 years ago

Well, if we are not allowed to assume that future users of a file might look at its associated schema, why are we bothering to provide schemas at all?

ebeshero commented 6 years ago

This seems to be parsing things rather finely. I've got a question about the nature of subsets: @lb42 mentions that "teidata.pointer values are a subset of teidata.word values". What issues are there with interchange when we refine datatypes of attribute values to be a subset classification? I'm not sure I'm following why the refining of datatype in the schema is problematic for interchange. If the downstream user is a machine, I suppose it's a problem of interoperability rather than interchange, but is it really a problem to define this in an ODD spec?

lb42 commented 6 years ago

I am not sure that I understand what Elisa is saying. All I am saying is that a value (such as "#wibble" or "http://readthebleedingdocumentation.com") which would be accepted by a schema as valid for an attribute declared as having teidata.pointer content would ALSO be accepted as valid if that attribute were declared as teidata.word. The reverse is not the case, of course. Hence, redefining an attribute currently defined as taking the latter so that it requires the former does not result in a non-conformant schema, but does allow for more precise validation, which is presumably the goal of making the suggestion. Of course, I may be the only person left on the planet who cares about conformance, but that's another story.

joeytakeda commented 6 years ago

Why, then, is there a need for @ed and @edRef if a pointer is a subset of a word? My understanding here is that if we want to be as truthful as possible, we could say something like so:

<move where="UpL" whereRef="stage.xml#upperLeft"/>

Where @where is a code defined by the text and @whereRef is a pointer to a representation of what the editor believes it corresponds to.

ebeshero commented 6 years ago

@lb42 I raised the question because you pointed out that teidata.pointer is a subset of teidata.word, and also because something about this came up in our discussions of Roma JS a little while ago--there's something tricky about having to remove a parent class to make a subclass available in the ODD. (I might be misremembering this: @raffazizzi and @jamescummings will remember.) Anyway, this ticket seems potentially relevant to a larger discussion, and I'd like a better understanding of the relationships of subclasses to classes. My sense of it follows yours, @lb42, that working with a subclass is a good way to uphold TEI conformance, but I'm also aware of some complexities in ODDs for changing an attribute datatype from a class to a subclass. Anyway, why do we have these attributes with "Ref" appendages on them?

lb42 commented 6 years ago

You may recall that I started my intervention in this thread by asking whether you wanted your @where attributes ALWAYS to have a URL value. The case you cite (having both an inexplicable coded value for something and a pointer value for it) is a typically TEI piece of fence straddling (or maybe a good compromise, depending on how you feel about having to support more than one way of doing something).

lb42 commented 6 years ago

@ebeshero : I don't think this is anything to do with class. Not something you'd expect a brit to say, but life's full of surprises. And we have these twinned attributes (foo and fooRef) because a room full of scholars will always have some who want to say foo="myMeaninglessCode" and some who want to say fooRef="http://mywebpage.com#meaninglesscode" and we like a quiet life.

ebeshero commented 6 years ago

@lb42 Sorry! Of course--it's not to do with classes but datatype classifications...How do we understand subclassifications of datatypes, then? Perhaps there's no trickiness there for ODD definitions.

ebeshero commented 6 years ago

@joeytakeda Pairing attributes makes sense, as you say, but we do/should try to keep an eye on reducing duplications of attributes that mean the same thing--and maybe, as @lb42 says, there's not really a good reason to have those other attributes, when a simple @ref might do just as well?

ebeshero commented 6 years ago

@lb42 @joeytakeda @martindholmes I think I'm confused by the relationship between datatypes on attributes and attribute classes. (I don't usually worry about this in my local ODDs, but if we want to uphold TEI conformance, does redefining a datatype on an attribute shake up the attribute class definition in a way we shouldn't be doing?) It seems to me that the attribute class memberships of the element <move> could be relevant here for defining its elementSpec, and how its attributes are expected to work together: http://www.tei-c.org/release/doc/tei-p5-doc/en/html/ref-move.html

I notice that @perf is the attribute here expected to take the teidata.pointer type. Isn't that the better attribute to be using anyway, since you want an attribute to point to specific performances by location?

martindholmes commented 6 years ago

@ebeshero @perf has a completely different purpose; it identifies a specific performance in which this movement occurred. @where specifies where the performer[s] moved to.

ebeshero commented 6 years ago

@martindholmes For that matter, it seems like MOEML has adapted <move> to describe movements of productions rather than its specified use in stage directions. I'm okay with it to the extent that "All the world's a stage", and the MOEML project's definition of movement of productions might be kind of like defining stage directions in a script, but...if @where is specifically about directional movement across the stage, @perf seems to be designating which performance is associated with that directional movement. Perhaps, though, I'm not understanding the nature of the adaptation to the project.

ebeshero commented 6 years ago

To be clear, what prompted my suggestion of @perf was the element spec itself and @joeytakeda's explanation: "These documents detail both the performances at particular sites as well as how the entire show moved from one place to another." So, if a mayoral pageant takes place in a specific way in a specific place, the location might seem to define the distinctiveness of a performance.

martindholmes commented 6 years ago

@ebeshero The mayoral pageant book is essentially a script for a performance, and the movement of actors is described in the script in a way that serves precisely the same function as a stage direction in any other play. Whether the stage is the Globe Theatre stage or the city of London isn't the point; this is a <move> like any other <move>. In any case, in a conventional play script, which describes movement on a stage, it's perfectly conceivable that a taxonomy of stage locations might be created, which would be pointed at from @where, so there's nothing inherent to the mayoral pageant that makes this request special.

ebeshero commented 6 years ago

@martindholmes Thanks for explaining the context. Is there a difference between specifying the performance location and a specific production of a mayoral pageant, then? I also see that @perf is supposed to point to a <performance> element, which isn't quite the same as a taxonomy of places. I think even if this isn't a special request, there's something interesting and at least unusual about conceiving of distinct locations in London as stages for mayoral pageantry, and defining a taxonomy of places for each pageant makes sense. Originally I think you wanted @where to point to a location, but then the introduction of a new attribute, @whereRef, came up. It seems to me that customizing @where to use it as a data pointer is the most reasonable approach here, rather than introducing a new attribute. But I'm not sure why it can't just be customized to use teidata.pointer for this use, which isn't really the usual stage left, right, center application.

ebeshero commented 6 years ago

I think I now understand that your book of mayoral pageantry is conceived as ONE performance, with a single stage of London, and locations within London are basically movements around that stage. I'd been thinking in this thread that each instance of pageantry was its own discrete performance.

joeytakeda commented 6 years ago

@ebeshero Apologies for being less than explicit about what exactly a mayoral pageant entails! Your last comment is completely right; London is the stage for a mayoral show, with particular pageants (we could even call them scenes) physically occurring at particular places in London.

To return to my original request here: for MoEML, @where will always be a pointer. But I don't think that solves the issue that @where is teidata.word and not a pointer. I understand that teidata.pointer is a subset of teidata.word, but that doesn't resolve the issue that @martindholmes raises that something like

<move where="south"/>

could refer either to a file called "south" or refer to some code that may or may not be defined in the ODD or elsewhere. The TEI is replete with pairs of attributes that more-or-less have the same function, but one's a word and one's a pointer (another example: @rend and @rendition). Having a dedicated attribute for a pointer allows projects to encode their texts more explicitly, rather than having meaningless codes--wouldn't it be better to provide a method to make that code meaningful?

(NB: <event> also has an @where, but it is a teidata.pointer that is meant to point to a <place>.)

ebeshero commented 6 years ago

@joeytakeda It's an interesting problem to try to puzzle out! Well, I suppose @where was customized on <event> because we could only conceive of this as pointing to a specific location defined elsewhere. It's tougher to make that same call universally for @where on <move>. So I see your point about @whereRef, though I'll also return to wondering if adding <move> to att.canonical (so it could take @ref) would make sense?

joeytakeda commented 6 years ago

@ebeshero I don't think @ref is a bad solution, but is that a canonical name for the place where the character moves or a canonical name for the movement itself? That is, would @ref be conceived as a canonical name for the <move> itself and not the place at which that move happens? I'm thinking here of a giant "eventography" of Shakespeare, with "Exit pursued by a bear" as an event called "event12345": wouldn't you want to put that URL in the @ref, and not the place to which the <move> refers?

Of course, this might be a sort of silly hypothetical, but I do think @whereRef is the most unambiguous way of saying "This is a reference to the location that is being referred to by this <move>"

joeytakeda commented 6 years ago

Where are we on this ticket? I am still of the opinion that there isn't a good existing attribute that unambiguously denotes the specific place to which a <move> refers and that @whereRef would remedy that. I am happy to make a proposal in ODD if that would help clarify some use-cases for the @whereRef attribute.

ebeshero commented 6 years ago

@joeytakeda I'm reviewing tickets now and thinking about this one. You've provided some helpful examples and explanation. I need to make sure Council considers and discusses it--and we haven't had a good opportunity for that yet: this is the sort of ticket that we need to talk about in a face-to-face session. I think we'll have a good opportunity to do that at the next face-to-face meeting in September if not sooner!

ebeshero commented 6 years ago

@joeytakeda For what it's worth, you've persuaded me that the addition of @whereRef to pair with @where would be helpful, and that we have precedent for it with @ed and @edRef. I'm going to assign someone else on Council to review the ticket and see if others agree.

ebeshero commented 6 years ago

@martinascholger I'm in favor of adding @whereRef as @joeytakeda suggests following the discussion on this ticket so far, and I understand his use-case: I can represent it to Council. @joeytakeda, if you want to provide some example code and/or an ODD as you suggest, this is probably a good time for that.

sydb commented 6 years ago

I think I am also in favor of adding @whereRef as @joeytakeda suggests, but a) I could probably be talked out of it, and b) I find it amusing that my take on the values of @where and @whereRef is the opposite of his:

JT> Having a dedicated attribute for a pointer allows projects to encode their texts more explicitly, rather than having meaningless codes--wouldn't it be better to provide a method to make that code meaningful?

I think it is OK to allow encoders to use soon-to-be meaningless URLs, rather than force them to use codes that are well defined in their ODDs, and thus meaningful for the long term. :-)

joeytakeda commented 6 years ago

An example of this, taken from Webster's Monuments of Honour (which is in draft on MoEML: http://mapoflondon.uvic.ca/MONU1.htm) and encoded as an exemplum:


<egXML xmlns="http://www.tei-c.org/ns/Examples">
                     <listPlace>
                        <place xml:id="STPA3">
                           <placeName>Paul's Churchyard</placeName>
                        </place>
                     </listPlace>
                     <!--...-->
                     <p>
                        This Shew hauing tendred this ſeruice to my
                        Lord vppon the Water, is after to be conueyed a
                        Shore, and in conuenient place employd for adorning the reſt of the Triumph. After my Lord Maiors
                        landing, and comming paſt Paules Chaine
                        firſt attends for his Honor in
                        <ref target="#STPA3">Pauls Church–yarde</ref>
                        a beautifull Spectacle, called the
                        Temple of Honor
                        <!--More prose describing the scene-->
                     </p>

                     <div>
                        <head>The ſpeech of Troynouant.</head>
                        <move whereRef="#STPA3"/>
                        <sp>
                           <lg>
                              <l>HIſtory, Truth, and Vertue ſeeke by name,</l>
                              <!--More stuff here-->
                           </lg>
                        </sp>
                     </div>

                  </egXML> 

The prose above (which is arguably distinct from the dramatic action of the performance itself) describes that there was a movement sometime between the last speech and the following speech from the Lord's Barge to Paul's Chain and then to Paul's Churchyard. The <move/> element is used within the dramatic action to denote that there was a movement that took place; the @whereRef denotes the particular place to which the movement refers.

(NB: Under different circumstances (i.e. if we didn't care about backwards compatibility) I think I would argue that move/@where and event/@where should just be harmonized (as att.locatable) so that <event> and <move> shared the attribute; I even think a case could be made that a <move> is a very specialized type of <event>, but I digress...)

ebeshero commented 6 years ago

@joeytakeda I see your point about how nice it would be if @where meant the same thing on these elements, but the use of <move> on dramas gives its @where a special directional quality. I have a couple of questions/suggestions about your example:

1) Would it help people understand what's going on here if you also marked the Lord's Barge and Paul's Chaine, so that it's a little more quickly clear to us that there are three locations "in play"?

2) Would you want to use @where together with @whereRef in your <move> element to help distinguish the usage of the two?

sydb commented 6 years ago

  1. I don’t know what @joeytakeda thinks, but I think it would be more understandable if all the names of places were so marked.
  2. I don’t think it makes a lot of sense to exemplify a practice (simultaneous use of both @where and @whereRef) that is at least questionable, if not a bad idea.
  3. Why is the <move> a child of the division “The ſpeech of Troynouant”, rather than preceding it?

joeytakeda commented 6 years ago

Thanks for the suggestions, @ebeshero and @sydb. Suggestions taken and put below is an ODD customization for @whereRef:

            <elementSpec ident="move" module="drama" mode="change">
               <attList>
                  <attDef mode="add" ident="whereRef" usage="opt">
                     <desc versionDate="2018-07-17" xml:lang="en">points to one or more locations that describe the direction of the movement.</desc>
                     <datatype maxOccurs="unbounded">
                        <dataRef key="teidata.pointer"/>
                     </datatype>
                  </attDef>
               </attList>
               <exemplum xml:lang="en">
                  <egXML xmlns="http://www.tei-c.org/ns/Examples">
                     <!--In the teiHeader-->
                     <listPlace>
                        <place xml:id="PAUL1">
                           <placeName>Paul's Chain</placeName>
                        </place>
                        <place xml:id="STPA3">
                           <placeName>Paul's Churchyard</placeName>
                        </place>
                     </listPlace>
                     <!--Elsewhere, in the body of the document-->
                     <p>This Shew hauing tendred this ſeruice to my
                        Lord vppon the Water, is after to be conueyed a
                        Shore, and in conuenient place employd for adorning the reſt of the Triumph.
                        After my Lord Maiors landing, and comming paſt <ref target="#PAUL1">Paules Chaine</ref>
                        firſt attends for his Honor in <ref target="#STPA3">Pauls Church–yarde</ref>
                        a beautifull Spectacle, called the
                        Temple of Honor
                        <!--More prose describing the scene-->
                     </p>
                     <move whereRef="#PAUL1 #STPA3"/>
                     <div>
                        <head>The ſpeech of Troynouant.</head>
                        <sp>
                           <lg>
                              <l>HIſtory, Truth, and Vertue ſeeke by name,</l>
                              <!--...-->
                           </lg>
                        </sp>
                     </div>
                  </egXML> 
               </exemplum>
               <remarks>
                  <p>Though <att>where</att> and <att>whereRef</att> are not mutually exclusive, it is recommended to use one of <att>where</att> or <att>whereRef</att>.</p>
               </remarks>
            </elementSpec>

sydb commented 6 years ago

Looks good. Issues (probably to discuss in Tokyo):

martindholmes commented 6 years ago

@sydb I see no reason to go for exclusivity; I can see a lot of scenarios in which (for instance) a project migrating away from @where in favour of more precision might need to maintain both; or @where might be used for a generic textual description that's traditional, while @whereRef provides a greater degree of precision.

On your second point: where is the <ldb> I seem to have been waiting half my life for?

joeytakeda commented 6 years ago

I agree with @martindholmes's point about exclusivity; I think for some projects it might make sense for them to be exclusive, but I don't think it's necessary to make them mutually exclusive. So, I think the language @sydb used above ("Usually either") is the right way to go.

I've put <listPlace> in <sourceDesc>, but that's purely because there's no better place to put it in the header :-). If @whereRef happens to be added to the spec at the same time as <standoff> or <ldb>, then I think the <listPlace> for this example ought to go in there.

sydb commented 5 years ago

@martindholmes :

1) I'm not convinced that “a project might want both while it switches from A to B” is really a good reason to allow both in the schema. The schema is about interchange, not about intermediate states.

2) That said, I'm not convinced that exclusivity is important, either. While I think in general it is probably bad practice to have two attributes to express the same thing, I can imagine a project that uses @whereRef to specify a generic directionality on the stage (e.g., just left, right, upstage, downstage, using things like "http://www.example.org/stagemovement#left") and @where to mark a particular production-specific location on stage ("2 feet to stage L of lamp post"). So @joeytakeda probably has it right, to stick with my “Usually either” language.

3) Why on earth do you presume an @whereRef is more precise than an @where? (My example in (2) shows the reverse — but I also can picture @where being generic left, right, up, down, and @whereRef pointing to a precise category in a <taxonomy>. :-)

I expect we will discuss <ldb> at the Council conference call in an hour or two.

ebeshero commented 5 years ago

At F2F we spent a long time discussing this and decided on the following course of action, which is not exactly what the proposers intended: We want to standardize the use of @where in a single class so it is used the same way on <event> and <move>, and it will take an unbounded number of teidata.pointer values. This shouldn't disrupt existing uses and should fit the use described here.
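One way such a shared class might be declared in ODD is sketched below. The class name att.locatable is the one used elsewhere in this thread; the module value, the description wording, and the wrapping specGrp are assumptions for illustration and are not taken from the change that was eventually committed.

```xml
<!-- Sketch only: a shared attribute class for @where, plus membership for <move>. -->
<specGrp xmlns="http://www.tei-c.org/ns/1.0" xml:id="locatable-sketch">
  <classSpec ident="att.locatable" type="atts" module="tei" mode="add">
    <!-- Description wording is assumed, not quoted from the Guidelines -->
    <desc xml:lang="en">provides an attribute for pointing to the place where something happens.</desc>
    <attList>
      <attDef ident="where" usage="opt">
        <desc xml:lang="en">points to one or more locations defined elsewhere.</desc>
        <!-- One or more whitespace-separated pointers -->
        <datatype maxOccurs="unbounded">
          <dataRef key="teidata.pointer"/>
        </datatype>
      </attDef>
    </attList>
  </classSpec>
  <elementSpec ident="move" module="drama" mode="change">
    <classes mode="change">
      <memberOf key="att.locatable"/>
    </classes>
    <attList>
      <!-- Drop the locally defined @where so the class definition applies -->
      <attDef ident="where" mode="delete"/>
    </attList>
  </elementSpec>
</specGrp>
```

An <event> element would subscribe to the same class in the same way, which is what makes the two @where definitions converge.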

ebeshero commented 4 years ago

@joeytakeda I'm working on implementing Council's decision here, creating a new attribute class for @where and giving it the unbounded teidata.pointer value. I'm working on this in a branch, and am going to try adapting your example for St. Paul's churchyard on this ticket...More when I've figured it out and pushed in the branch. :-)

ebeshero commented 4 years ago

These changes in https://github.com/TEIC/TEI/commit/2b75afa9d2af07a357fff320d06ce2acfd158d82 are passing local build tests. But there's some new prose explanation of @where in its new class of att.locatable that we should take a look at. I'll do a pull request and ask for help reviewing this.

ebeshero commented 4 years ago

Well, they shouldn't have passed local build tests because I forgot to add specGrp includes, etc., which I didn't notice until I made a pull request...Working on it.

ebeshero commented 4 years ago

Okay! We're passing the Travis build test now and I think we're ready for Council to review this and see what we need to do before the upcoming release.

sydb commented 4 years ago

Council discussed this today, and charged me with the task of ascertaining whether all teidata.word values are also teidata.pointer values. They are not.

First, teidata.word permits the character ‘%’ (U+0025, PERCENT SIGN) in an unrestricted way. In a URI (and thus teidata.pointer), ‘%’ must be followed by two hexadecimal digits.

Also teidata.word allows colon anywhere. In a URI, while any number of colons are allowed to appear, the only characters allowed before the first one are [A-Za-z0-9+.-].

Thus “%left”, “50%_across_stage”, “höger:ner”, and “*up:left” are all valid teidata.word, but not teidata.pointer.
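The two restrictions just described can be expressed as a rough Schematron check, sketched here under stated assumptions: an XSLT 2.0 query binding, and only these two rules (it is not a full RFC 3986 validator, and the message wording is invented for illustration).

```xml
<!-- Sketch only: flags move/@where tokens that break the two URI rules described above. -->
<sch:schema xmlns:sch="http://purl.oclc.org/dsdl/schematron" queryBinding="xslt2">
  <sch:ns prefix="tei" uri="http://www.tei-c.org/ns/1.0"/>
  <sch:pattern>
    <sch:rule context="tei:move[@where]">
      <sch:let name="tokens" value="tokenize(normalize-space(@where), ' ')"/>
      <!-- Rule 1: after removing well-formed %XX escapes, no '%' may remain -->
      <sch:report role="warning"
        test="some $t in $tokens satisfies
              contains(replace($t, '%[0-9A-Fa-f]{2}', ''), '%')">
        A '%' in @where is not followed by two hexadecimal digits.
      </sch:report>
      <!-- Rule 2: text before the first ':' may use only the characters A-Z a-z 0-9 + . - -->
      <sch:report role="warning"
        test="some $t in $tokens satisfies
              (contains($t, ':') and
               not(matches(substring-before($t, ':'), '^[A-Za-z0-9+.\-]*$')))">
        The part of @where before the first ':' contains a character not allowed there.
      </sch:report>
    </sch:rule>
  </sch:pattern>
</sch:schema>
```

Under these rules, all four example values above would be reported: the first two by rule 1 and the last two by rule 2.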

ebeshero commented 4 years ago

@sydb Well, these are mostly plausible values that we could imagine people using (esp. höger:ner), but they may be unusual enough to be rare? so I suppose we had better say exactly what situations could break backwards compatibility if we proceed with this ticket.

What should we do now?

sydb commented 4 years ago

Note: definition of URI per RFC 3986.

joeytakeda commented 4 years ago

Thanks for this @ebeshero and apologies for the delayed reply! Very happy to see progress on this ticket :-).

However, I'd like to ask why Council thought moving @where to teidata.pointer was more desirable than the introduction of a new @whereRef attribute as proposed above? Isn't that backwards-compatibility breaking in ways beyond whether or not words look like pointers? That is, even if the thing itself is incidentally valid, if we call the string OFFSTAGE a word and then it becomes a pointer, the encoding--probably unbeknownst to most encoders--is now wrong insofar as it points to some file on the system called OFFSTAGE, which may or may not exist.

ebeshero commented 4 years ago

@joeytakeda When Council discussed this at the face-to-face meeting in Washington, DC last year, we noted something that wasn’t part of the ticket you posted: @where is currently defined differently on the <event> element than it is on the <move> element. On <event>, @where already is defined as taking teidata.pointer as its value. So rather than introduce a new attribute, our decision was to reconcile these divergent definitions of the same attribute.

It makes sense, I think, as a move to clean up and simplify the way we’ve defined attributes in the Guidelines, since we now would assign @where to an attribute class, att.locatable, to which both <move> and <event> would subscribe. We think that in most cases the community using the <move> element won’t notice a problem, except for the kinds of values that, as @sydb points out, are not consistent with teidata.pointer. So if we proceed with this, we probably want to treat it something like a deprecation of old usage with instructions.

I think we are doing the right thing here, though it is a little vexing to have to potentially break backwards compatibility. I think it is right in part because the MOEML project examples show us how related these elements can be when stages are more than just structures, and when they can be locations in public squares, etc. And in the simpler cases of conventional stage movements, there can be considerable cultural variety in these. It seems a good idea to ask people to define and point to some definition of the values they are applying in a project.

joeytakeda commented 4 years ago

Thanks, @ebeshero, for the explanation! This all sounds exactly right to me, especially if this move changes from a sudden change to a deprecation with a warning of some sort.

sydb commented 4 years ago

Just want to say that I completely agree with @ebeshero’s summary, but with some shades of difference.

  1. I am quite a bit more uncomfortable with this break in backwards compatibility than most others seem to be.
  2. I think if we really wanted users to define the possible values of @where, we would do better making it teidata.enumerated than teidata.pointer. But that would not reconcile move/@where with event/@where, which is important.

So short of creating move/@toWhere (and explicitly deprecating move/@where), this seems like the right way to go. We do have to figure out how to deprecate a value that looks very much like a URI, but isn’t, though.

martindholmes commented 4 years ago

One approach to kindly deprecation would be to detect all of the current suggested values and give a Schematron warning for them. Anyone using <move>/@where is likely to start from the suggested values, and then add their own, so I bet this would catch most cases.
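A minimal sketch of such a warning follows. The list of legacy codes (L, R, C) is an assumption standing in for whatever values the <move> specification actually suggests and should be checked against the spec; the message text is likewise invented for illustration.

```xml
<!-- Sketch only: warns when move/@where still carries a legacy code instead of a pointer. -->
<sch:schema xmlns:sch="http://purl.oclc.org/dsdl/schematron" queryBinding="xslt2">
  <sch:ns prefix="tei" uri="http://www.tei-c.org/ns/1.0"/>
  <sch:pattern>
    <sch:rule context="tei:move[@where]">
      <!-- Assumed legacy codes; replace with the values listed in the element specification -->
      <sch:report role="warning"
        test="tokenize(normalize-space(@where), ' ') = ('L', 'R', 'C')">
        The value "<sch:value-of select="@where"/>" looks like a legacy code. @where is
        expected to point to a defined location; consider replacing the code with a pointer.
      </sch:report>
    </sch:rule>
  </sch:pattern>
</sch:schema>
```

Because the comparison is made token by token, a multi-valued attribute is reported if any one of its tokens is a listed code.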