TEIC / TEI

The Text Encoding Initiative Guidelines
https://www.tei-c.org

Proposal for event and eventName #2427

Closed skurzinz closed 1 year ago

skurzinz commented 1 year ago

Fixes #2382.

Following the checklist @sydb prepared in https://github.com/TEIC/TEI/issues/2382#issuecomment-1475392687, and going a bit into detail on the whole <event> discussion beyond <eventName>, we proceeded in our attempt to not only include this new element but also align the <event> element with other nameable elements that are used to create ographies.

In our PR it is located between <placeName> and <objectName> in the Names section.

We tried to be minimally invasive, but moved the description of <event> in general out of the <person> section while keeping the useful information on Personal Events in place. This results in a new section on events on the same level as <person>, <org>, <place>, <event> and <object>.

An attempt at the latter is contained in https://github.com/TEIC/TEI/commit/62b1b9a35ac9227c407653246c6a0131f5c499ad, but is this enough?

With regard to this we did not alter the existing examples, which use <head> and <label> interchangeably, yet provided example use cases for (“canonically”) named events that may even have authority file data associated. This leaves open what editors choose to label events.

In addition, we changed the content model of <listEvent> and <event> to allow other lists to be contained in them. The rationale was to align the <event> with other ography elements: A <listPerson> contained in an <event> may be used to list persons who were involved in it (e.g. A, B and C were taking part in the nth session of parliament), a <listPlace> may be used to describe locations of an event (e.g. the filming of The Third Man was done in Vienna and London), and similar relations.
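
A minimal sketch of the two cases just described, assuming the proposed content model (a name first, then nested lists). The xml:id values are made up for illustration, and "A", "B" and "C" are the placeholders from the paragraph above, not real data:

```xml
<listEvent>
  <!-- Persons who took part in an event -->
  <event xml:id="ev-session-n">
    <eventName>nth session of parliament</eventName>
    <listPerson>
      <person xml:id="pers-A"><persName>A</persName></person>
      <person xml:id="pers-B"><persName>B</persName></person>
      <person xml:id="pers-C"><persName>C</persName></person>
    </listPerson>
  </event>
  <!-- Locations of an event -->
  <event xml:id="ev-third-man-filming">
    <eventName>Filming of The Third Man</eventName>
    <listPlace>
      <place xml:id="pl-vienna"><placeName>Vienna</placeName></place>
      <place xml:id="pl-london"><placeName>London</placeName></place>
    </listPlace>
  </event>
</listEvent>
```

Whether this validates depends on the content model as finally merged; it is only meant to show the intended nesting.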

We are first creating this PR and will shortly follow up on this on TEI-L.

skurzinz commented 1 year ago

For reference, @helmutwklug and @onbcst were conceptually involved in the discussion leading to the proposed changes.

onbcst commented 1 year ago

Thanks for drafting the pull request. From my point of view allowing <listPerson> and <listPlace> in <event> makes total sense, but I think they are not necessary as child elements of <listEvent>. Therefore I suggest removing

<alternate minOccurs="0" maxOccurs="unbounded">
  <classRef key="model.personLike" minOccurs="1" maxOccurs="unbounded"/>
  <elementRef key="listPerson" minOccurs="1" maxOccurs="unbounded"/>
</alternate>
<alternate minOccurs="0" maxOccurs="unbounded">
  <classRef key="model.placeLike" minOccurs="1" maxOccurs="unbounded"/>
  <elementRef key="listPlace" minOccurs="1" maxOccurs="unbounded"/>
</alternate>

again from listEvent.xml. This would also match the behavior of listPerson and listPlace, which likewise allow only nested lists of the same type.

skurzinz commented 1 year ago

Thanks @onbcst, fully agreed. That was a lapsus on my side.

sydb commented 1 year ago

Wow, nice job!

(I have to admit, I did not actually expect someone to follow the checklist, but am very pleased you did!)

I have not had a chance to look at this carefully,[2] but can quickly sketch out some next steps. I have followed each step with my suggestion for who should do that work, but that is just my off-the-cuff idea, not anything carved in stone. Details on some of the tasks are listed below the list of them.

  1. Fix double inclusion of the <elementSpec>s for <event>, <eventName>, and <listEvent> (proposers)
  2. Fix duplicate IDs in Guidelines encoding (proposers)
  3. Figure out when <eventName> ends up in model.nameLike twice, and fix (Council)
  4. If <listEvent> and <event> are still undefined, figure out why and fix (Council)
  5. Fix <listEvent> content model (it’s ambiguous) (Council)
  6. Generate readable Guidelines and schemas (Council)
  7. Read them and hammer out details (both)

1. Fix double inclusion of the <elementSpec>s for <event>, <eventName>, and <listEvent> (proposers)

@skurzinz rightly wondered whether commit 62b1b9a was correct, but worried it was not enough; the problem is that it is too much. 😄 The <eventName> element is included twice, both times from ND, once from line 771 and again from line 2344. The same is true for <event> and <listEvent>, which are included from both #DCCAHPA and #DNDEVNT.

2. Fix duplicate IDs in Guidelines encoding

The first round of validity errors are all duplicate values of @xml:id:

    <ERROR>id 'SecondDefPrague' used more than once</ERROR>
    <ERROR>id 'Prague' used more than once</ERROR>
    <ERROR>id 'ThirtyYearsWar' used more than once</ERROR>
    <ERROR>id 'BattleofRocroi' used more than once</ERROR>
    <ERROR>id 'Rocroi' used more than once</ERROR>  [1]
    <ERROR>id 'event01' used more than once</ERROR> [1]
    <ERROR>id 'event02' used more than once</ERROR> [1]
    <ERROR>id 'event03' used more than once</ERROR> [1]

[1] These errors went away when I removed all of DNDEVNT.

3. Figure out when <eventName> ends up in model.nameLike twice, and fix

4. If <listEvent> and <event> are still undefined, figure out why and fix

These two will likely go away when (1) is fixed, but someone should double-check.

5. Fix <listEvent> content model

Because <event> is a member of model.eventLike, the two cannot occur next to each other in content (unless both are limited to a specific number of occurrences).
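
To illustrate the kind of ambiguity meant here, a simplified sketch in ODD. This is not the actual <listEvent> content model from the PR, only an assumed minimal case of the same problem, followed by one possible way around it:

```xml
<!-- Ambiguous: an <event> child could match either particle,
     because <event> is itself a member of model.eventLike -->
<content>
  <sequence>
    <classRef key="model.eventLike" minOccurs="0" maxOccurs="unbounded"/>
    <elementRef key="event" minOccurs="0" maxOccurs="unbounded"/>
  </sequence>
</content>

<!-- One way to remove the ambiguity: reference the class only,
     which already covers <event> -->
<content>
  <classRef key="model.eventLike" minOccurs="0" maxOccurs="unbounded"/>
</content>
```

The two <content> elements are alternatives shown side by side; a real <elementSpec> has only one.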

6. Generate readable Guidelines and schemas

Either with or without CI server.

7. Read them and hammer out details

Probably the hardest part. 😄

[2] Because when I first saw this I had < 18 hours before leaving for the TEI Council meeting. I am currently in the airport. ✈️

skurzinz commented 1 year ago

Many thanks @sydb for your comments and further tasks.

  1. is done in 14376d5. I am marking this as checked in the checklist above.

  2. above happened because I used the same example for both the <eventName> and <event> specs, just copy-pasting.
    This being my first time digging into the Guidelines, I did not find a quick way of referencing it from one of the places instead. May I ask any of you to just fix that or tell me how best to address this?

sydb commented 1 year ago

[Wow. You’re fast. I am going to have trouble keeping up.]

1. Excellent.

2. First, no criticism (at least not from me). For your first dive into editing the Guidelines, you are doing a marvelous job. I am impressed. Second, I am sorry to say I don’t think there is any way to reference an example (so that the same example is encoded once but appears in 2 places). The best practice, of course, is to come up with a different example. But often you will find that the same example is copied except for the ID, which is slightly different. Even with a different (but similar) example, one often wishes to use the same ID, and we just can’t. For example, there are three examples of <object> with the ID values "Alfred-Jewel", "AlfredJewel", and "Alfred_Jewel". Third, I would be happy to do that, but I would not have write access to your repo, no?
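
A sketch of the ID-variation workaround described above, using the three ID values quoted in the comment. The element content is abbreviated and assumed, and the three objects are gathered in one <listObject> only for compactness; in the Guidelines they sit in separate examples:

```xml
<listObject>
  <!-- Same object, three examples, three slightly different IDs -->
  <object xml:id="Alfred-Jewel">
    <objectIdentifier><objectName>Alfred Jewel</objectName></objectIdentifier>
  </object>
  <object xml:id="AlfredJewel">
    <objectIdentifier><objectName>Alfred Jewel</objectName></objectIdentifier>
  </object>
  <object xml:id="Alfred_Jewel">
    <objectIdentifier><objectName>Alfred Jewel</objectName></objectIdentifier>
  </object>
</listObject>
```

Because all examples end up in one document when the Guidelines are built, each xml:id has to be unique across the whole source, not just within its own example.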

skurzinz commented 1 year ago

  2. is now fixed in 7d8a834 by swapping out the Thirty Years’ War example for a World War II example with multiple @xml:lang variants, thus hinting at another possible set of event naming issues for encoders to consider.

skurzinz commented 1 year ago

Third, I would be happy to do that, but I would not have write access to your repo, no?

I allowed maintainer changes on merging the PR, but didn’t use that myself. Just to be sure I invited @sydb as a collaborator in my fork.

From the task list above, I don’t see any further things for us as proposers to work on currently – much looking forward to the Read them and hammer out details step, while continuing to monitor this…

skurzinz commented 1 year ago

ea40022 adds <object>s as possible agents within an <event>, which I had overlooked. Example: the <objectName ref="#MinsterLovellJewel">Minster Lovell Jewel</objectName> in the objectName.xml example is discovered, and this may be formalized as an <event> including both the <person> in the active role of discovery and the <object> in the passive role.
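
One way such a formalization might look, as a rough sketch only. The xml:id, the role value and the bracketed person name are placeholders invented for illustration; only the #MinsterLovellJewel target comes from the comment above, and the element order assumes the proposed content model:

```xml
<event xml:id="ev-jewel-discovery">
  <eventName>Discovery of the Minster Lovell Jewel</eventName>
  <!-- Active role: the person who made the discovery -->
  <listPerson>
    <person role="discoverer">
      <persName>[name of the finder]</persName>
    </person>
  </listPerson>
  <!-- Passive role: the object that was discovered -->
  <listObject>
    <object sameAs="#MinsterLovellJewel">
      <objectIdentifier>
        <objectName>Minster Lovell Jewel</objectName>
      </objectIdentifier>
    </object>
  </listObject>
</event>
```

Pointing back to an existing <object> with @sameAs is just one option; a project might prefer <relation> or a plain @ref on <objectName> instead.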

skurzinz commented 1 year ago

Council F2F at Guelph is very, very impressed with this, and we think it won't take much work to fix the small problems noted here and include in the next release.

Good to hear, but: The Council should be satisfied with its working community, the annual conference and itself! The conversation we had during and after the panel at TEI2019 in Graz already included most of the points of this PR and set the agenda. During the panel discussion, AFAIR @martinascholger, @lb42, @martindholmes, @jamescummings, @sydb (to name who I remember from an event from 4 years back) were positive about the general idea and raised a few details that would have to be worked out. It only took us some time to sit down and do that.

Numerous discussions with @dasch124, @sennierer and @csae8092 and others contributed as well—credit where it’s due.

We want to try to create some more examples in the process, too.

If real life examples are needed I can also provide snippets from the @oeaw-ministerratsprotokolle for named governmental meeting session events (currently using <label> in lieu of <eventName>). Any suggestions on what kind of other examples may be useful?

sydb commented 1 year ago

Any suggestions on what kind of other examples may be useful?

It would seem to me a crime not to use the TEI Conference in Graz as an example. 😃

sydb commented 1 year ago

I have just pushed 6c68379a7 to your repo, @skurzinz, without having accepted your invitation. I did not know that was possible; thank you.

In any case, this commit fixes the ambiguities from the <event> and <listEvent> content models while retaining the same set of constraints (I hope). It passes all the build tests I can run on my little laptop here. Presuming it passes the rest of the tests (which I can run when I get home), I think we are done with the list of fixes.

In which case, the only thing left to do before generating outputs (which I forgot to mention previously) is to merge TEIC/TEI:dev into skurzinz:dev, so the latest changes are there.

skurzinz commented 1 year ago

In any case, this commit fixes the ambiguities from the <event> and <listEvent> content models while retaining the same set of constraints (I hope).

I hope so too, the idea being that <event> should have 0-n possible child::*[name()=('person','event','place','object','org')] and their respective list* forms. As I said I was just copying over from the object models and did some possibly poisonous changes :)

skurzinz commented 1 year ago

In which case, the only thing left to do before generating outputs (which I forgot to mention previously) is to merge TEIC/TEI:dev into skurzinz:dev, so the latest changes are there.

Thanks @sydb for reminding about that, of course that’s necessary. Seemingly, all the maintainers of the upstream repo are able to push to the PR, but in any case: Let’s not forget that. Many thanks!

skurzinz commented 1 year ago

It would seem to me a crime not to use the TEI Conference in Graz as an example.

Done in 3e69268, could be replicated in event.xml as well.
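
For readers without the commit at hand, an assumed sketch of what an encoding of that conference might look like. This is not the example from 3e69268; the xml:id values and the choice of children are illustrative only:

```xml
<listEvent>
  <event xml:id="TEI2019" when="2019">
    <eventName>TEI Conference 2019</eventName>
    <listPlace>
      <place xml:id="Graz">
        <placeName>Graz</placeName>
      </place>
    </listPlace>
  </event>
</listEvent>
```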

sydb commented 1 year ago

Latest commit tweaks the content models and examples so that almost all the examples are now valid. The sole exception is “Platter”, Filip Fabricius’ additional name in the tagdoc for the <event> element, which is still invalid. (The <addName> is not valid as a direct child of <person>; it would be valid as a direct child of <persName> as in

        <persName>
          <forename>Filip</forename>
          <surname>Fabricius</surname>
          <addName>Platter</addName>
        </persName>

but if we did that, then the encoding of Filip’s name would not be parallel to that of Jaroslav’s and Vilém’s, and I did not know if you would want to change those, too, or nuke the “Platter”.)

And indeed I do have write access to your repo (as demonstrated by these commits), and I do not mind doing the work of merging in the latest from TEIC/TEI:dev myself. However, I do not actually know how to do that.

As soon as “Platter” is fixed and TEIC/TEI:dev is merged in (whether someone else does it or someone tells me how to, which I would mildly prefer, because I really should know this, eh?) I can create outputs for us to play with.

skurzinz commented 1 year ago

With 4d39c1c8 above my branch is in sync with upstream TEIC:dev.

I already took care of merging those upstream changes back, but for reference or if more commits get added there in the meantime: I got these by merging TEIC:dev using a new PR https://github.com/skurzinz/TEI/pull/1 (just do a new PR).

Alternatively, this could be done locally along the lines of https://stackoverflow.com/questions/12921125/git-merge-branch-of-another-remote (add another remote, pull or merge into a new local branch, merge new local branch into the fork:dev, commit this).

sydb commented 1 year ago

See the generated Guidelines and Exemplars. (Note: that is a temporary location, and the outputs there will likely be replaced by new versions as we move this issue along, and likely will be deleted once this PR is merged.)

skurzinz commented 1 year ago

Remark (which may be a new issue) while checking this (thanks to @sydb for preparing the previews!):

In current P5, <note> seems allowed as a child of <org> but not the other entity-like elements (person, place, event, object, relation, [name]). Is that intentional? What’s the reasoning behind it?

Edit: sorry, I was lost in my tests…

ebeshero commented 1 year ago

@skurzinz Indeed, <note> is permitted as a child of the other "ography" elements (<person>, <place>, etc) in the NamesDates and Core modules. https://tei-c.org/release/doc/tei-p5-doc/en/html/ref-note.html I rely on this extensively in the "listography" files for my projects.

So we have no new issue here!

skurzinz commented 1 year ago

So we have no new issue here!

@ebeshero you are right, it was a positional thing: I tested with a <note> after several <list*> in an <event> and that did not validate. Withdrawing my comment from above…
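
A sketch of the positional point, assuming the content model as discussed in this thread, where notes come before any nested list* children. All IDs and text are placeholders:

```xml
<event xml:id="ev-example">
  <eventName>Example event</eventName>
  <!-- A <note> here, before the nested lists, is the order
       assumed to validate -->
  <note>A remark about this event.</note>
  <listPerson>
    <person xml:id="pers-example"><persName>Example Person</persName></person>
  </listPerson>
  <!-- A <note> placed here, after the list, is the case
       reported above as not validating -->
</event>
```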

skurzinz commented 1 year ago

e2c432f (referring to my comment on treaties as events at https://github.com/TEIC/TEI/issues/2382#issuecomment-1525143951) and 1f1aaa4 are addressing minor issues that came to mind when deleting my own comments in d0fa3c2, hopefully improving guidance and possible ways of encoding <event>s.

Other than that, I tested some examples I could come up with using the tei_all.rng created by @sydb and did not find any issues, but I may be blind by being directly involved, so I am pinging @onbcst @helmutwklug @ChristianeFritze once more to check their examples (I just made the stuff in our Events_Beispieldaten in the onb repo valid, but you may have more) against the temporary https://bauman.zapto.org/~syd/temp/tei/Exemplars/tei_all.rng and report back any issues.

My next commit will include a real life example drawn from the @oeaw-ministerratsprotokolle added to Specs/event.xml as promised in https://github.com/TEIC/TEI/pull/2427#issuecomment-1537543038.

sydb commented 1 year ago

I pulled from the TEIC repo (turned out to be nothing needed), have tweaked things a bit, and regenerated both Guidelines and Exemplars.

Note that in e80feacde I also removed the extraneous new "es" <desc>. (As the content of the two was precisely the same, I presumed the error was adding the new one, as opposed to failing to delete the old one. If I am wrong about that, and a Spanish speaker actually checked that description against the English version on 2023-05-07, whoever did that should just update the date.)

Is it maybe time to get multiple eyes reading this and (as I put it earlier) “hammer out the details”?

sydb commented 1 year ago

Noting that some (potential) changes have been made, I have just regenerated both Guidelines and Exemplars.

ebeshero commented 1 year ago

Council F2F at Paderborn: We need to alter the content model of <event> to preserve backwards compatibility--at the moment, the current content model does not allow for idno followed by a head followed by a p.
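
An instance of the kind of existing encoding that should stay valid, as a minimal sketch. The identifier, its @type value and the text are invented placeholders:

```xml
<event xml:id="ev-legacy">
  <!-- idno, then head, then p: the sequence named above -->
  <idno type="hypothetical">EV-0001</idno>
  <head>An event heading</head>
  <p>A prose description of the event.</p>
</event>
```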

So the content model will change like so:


      <elementRef key="idno" minOccurs="0" maxOccurs="unbounded"/>
      <classRef key="model.headLike" minOccurs="0" maxOccurs="unbounded"/>
      <alternate>
        <classRef key="model.pLike" minOccurs="1" maxOccurs="unbounded"/>
        <classRef key="model.labelLike" minOccurs="1" maxOccurs="unbounded"/>
        <elementRef key="eventName" minOccurs="1" maxOccurs="unbounded"/>
      </alternate>
      <alternate minOccurs="0" maxOccurs="unbounded">
        <classRef key="model.noteLike"/>
        <classRef key="model.biblLike"/>
        <elementRef key="linkGrp"/>
        <elementRef key="link"/>
        <elementRef key="idno"/>
        <elementRef key="ptr"/>
      </alternate>
      <!-- [... listPerson etc ] -->

      idno*,
      model.headLike*,
      ( eventName+ | model.pLike+ | model.labelLike+ ),
      (
         model.noteLike
       | model.biblLike
       | linkGrp
       | link
       | idno
       | ptr
      )*,
      # listPerson etc

joeytakeda commented 1 year ago

Thanks very much indeed for all your work on this issue, @skurzinz🙂 — we are currently merging into a feature branch to address a few small changes before we merge fully into the dev branch.

skurzinz commented 8 months ago

Just for reference I am adding the follow-up issue #2499