jamescummings closed this issue 4 months ago
This means opening the TEI to lots of potential weirdness, but, as many may recall, I'm not in favour of attempts to prevent human silliness by preemptively restricting the code base.
It feels more important that, among the potential silliness, some genuinely useful cases may happen that will result in widening the TEI's coverage. Those who complain about too much baroque are going to become less convincing, all of a sudden, when the baroque is there basically for the sake of the TEI header, while standOff stores whatever the given project needs.
I'm not as worried as Piotr over any potential weirdness here. All it means is that projects that use TEI, but really store their annotations in a different format, will actually use the TEI rather than abandoning it completely for other formats. Yes, I'd rather that they use <annotation> properly, but there are many projects which aren't going to do so. This is real standOff type of data, but just because of processing workflows projects won't always put it in the correct TEI workflow. It seems a simple change to enable greater usage of the TEI without really causing problems for people who fully use the TEI. I don't really see many potential negative side-effects.
Oh but I did try to stress that I'm not worried. :-) Essentially, we fully converge on the potential usefulness of this.
Well, it rather depends what you mean by "use the TEI", doesn't it?
@lb42 -- if they are 'using the TEI' for their document encoding, their marking of named entities, etc. but the linking of those to LOD entities is stored in RDF rather than using
In general I think this change is unproblematic, will encourage those using the TEI in this manner to feel part of the TEI community, and just makes sense. It has no real side-effects for existing TEI users who aren't interested in doing this.
Lots of ifs in there @jamescummings! My concern is the risk to the TEI brand when it is used in this superficial way. For example, when I did my search for uses of xenoData in the wild I was quite depressed to find projects systematically using it for data which really belonged in a TEI header, alongside a vacuous TEI header. If that isn't breaking the conceptual model, what is?
@lb42 Yes, but I don't think that is the use-case here. The use-case is either people who have xenoData that is standOff in nature and want to put it in the 'right place' inside the standOff element, or people who are using standOff properly but simultaneously keep a copy of that data serialised into a format they store in xenoData (e.g. an RDF JSON serialisation of a properly done standOff Web Annotation Data Model in TEI, where they keep the JSON version for ease of display/rendering rather than having to generate it on the fly).
Hi @sabineseifert Just checking if this issue has come up for council discussion? It seems fairly straightforward to me and will encourage people who do have standOff-like xenoData to put it in a more appropriate place.
Not yet but I will try and put it further up the list for next meeting!
<xenoData> inside <standOff>. I am not sure if there should be any health warnings about its use or not.

I agree @sydb that there will be people who abuse it in some way that I've not considered... But that is true of almost every change we make. I certainly know of a project which would benefit from this by wanting to embed their standoff xenodata in the 'right' place inside standoff. If that encourages other projects, then great. If some abuse it somehow, at least they are embedding their data in their TEI file rather than storing separately, which I think is a win.
Full Council discussion on April 13 at VF2F:
I think that <xenoData> should be allowed to appear inside <standOff>. My reasoning for this is two-fold:

1. <standOff> is "a container element for linked data, contextual information, and stand-off annotations" and many LOD projects are using <xenoData> for some of their processing workflows and don't create proper TEI structures for this. However, they sometimes mix and match and have some annotation in <xenoData> and some in <standOff>. So it would be useful for processing to be able to extract just the <standOff> containing both of those in a single container element. This is an appeal to convenience.

2. Some of those using <xenoData> are storing non-TEI standoff data, which is pointing locally into the document, e.g. with the Web Annotation Data Model. While in an ideal world they would store this using <annotation> inside <standOff>, it is unlikely they'll change their processing. So in storing this information, it would be good if it could appear in <standOff> but it will remain <xenoData>, as that is where information like this belongs in the TEI abstract model. This is an appeal to semantics.

Proposal:

1. <xenoData> claims membership in model.standOffPart by adding <memberOf key="model.standOffPart"/> just after: https://github.com/TEIC/TEI/blob/2075569ce78d975f4d160696c49b7d07083524c7/P5/Source/Specs/xenoData.xml#L19

2. Add "The <gi>xenoData</gi> element is also available inside the <standOff> element for those using it for stand-off markup." at the end of the paragraph at https://github.com/TEIC/TEI/blob/2075569ce78d975f4d160696c49b7d07083524c7/P5/Source/Guidelines/en/HD-Header.xml#LL2528C30-L2528C30
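
To make the "appeal to convenience" concrete, here is a minimal Python sketch of what a processor could do if the proposal were adopted. The sample document, the `standoff_parts` helper, and the choice of a JSON payload inside <xenoData> are all assumptions made for illustration; they are not taken from any real project or from the TEI Guidelines.

```python
import json
import xml.etree.ElementTree as ET

TEI_NS = "http://www.tei-c.org/ns/1.0"
NS = {"tei": TEI_NS}

# Hypothetical document: assumes the proposed content model, in which
# <xenoData> is permitted as a child of <standOff>.
SAMPLE = """<TEI xmlns="http://www.tei-c.org/ns/1.0">
  <teiHeader>
    <fileDesc>
      <titleStmt><title>Sample</title></titleStmt>
      <publicationStmt><p>Unpublished</p></publicationStmt>
      <sourceDesc><p>Born digital</p></sourceDesc>
    </fileDesc>
  </teiHeader>
  <standOff>
    <listAnnotation>
      <annotation xml:id="a1" target="#p1">
        <note>Native TEI annotation</note>
      </annotation>
    </listAnnotation>
    <xenoData>{"type": "Annotation", "target": "#p1"}</xenoData>
  </standOff>
  <text><body><p xml:id="p1">Hello</p></body></text>
</TEI>"""


def standoff_parts(tei_xml):
    """Return (annotation targets, parsed xenoData payloads) from <standOff>.

    Hypothetical helper: it assumes every <xenoData> child holds JSON text,
    which is only one of the formats a project might store there.
    """
    root = ET.fromstring(tei_xml)
    stand_off = root.find("tei:standOff", NS)
    if stand_off is None:
        return [], []
    # Native TEI annotations, at any depth inside <standOff>.
    targets = [a.get("target") for a in stand_off.iter(f"{{{TEI_NS}}}annotation")]
    # Foreign data stored directly inside the same container.
    foreign = [json.loads(x.text) for x in stand_off.findall("tei:xenoData", NS)]
    return targets, foreign


if __name__ == "__main__":
    print(standoff_parts(SAMPLE))
```

Both kinds of annotation come out of one container element, so the processor never has to visit the header to find the foreign-format copy.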