TEIC / TEI

The Text Encoding Initiative Guidelines
https://www.tei-c.org

definitions in the text: markup and handling #2358

Open bansp opened 2 years ago

bansp commented 2 years ago

Consider consistently marking up definitions in the text, so that they could then be (a) referenced across the Guidelines, (b) collectively displayed in some kind of drop-down section within a chapter, and (c) collected and displayed across the entire Guidelines, say in the front section, under the header "Definitions" and with references to where they appear. Note that this is different from descriptions inside individual specs -- here I speak of terminological animals.

The direct inspiration for this FR is the term "interpretation" that receives its own definition in the FSR chapter. I also noticed that both <term> and <soCalled> are used to highlight the terms there, and I am not sure how consistent this practice is. So, what I'm thinking of, very schematically, would be something like <term> and <gloss> (simply because they are already expected to be used like this, compare the spec for 'gloss'). More specifically, here's a fragment from 18.11:

In particular, the term ‘interpretation’ when applied to a feature structure is not an interpretation in the model-theoretic sense, but is instead a minimally informative (or equivalently, most general) extension of that feature structure that is consistent with a set of constraints declared by an FSD. In linguistic application, such a system of constraints is the principal means by which the grammar of some natural language is expressed.

I could imagine this fragment encoded in the Guidelines roughly like the following:

<p> ...
In particular, the term <term xml:id="interpretation-FSR">interpretation</term> when applied to a feature
structure is not an interpretation in the model-theoretic sense, but is instead <gloss ref="#interpretation-FSR">a minimally informative (or equivalently, most general)
extension of that feature structure that is consistent with a set of constraints declared by an FSD</gloss>.  In linguistic application, such a system of constraints is the principal means by which the grammar of some natural language is expressed.
...
</p>

That would allow scripts to collect the terms and definientia and display them in a subsection at the beginning of the chapter, and then maybe the definiens could pop-up when the user hovered over <term ref="#interpretation-FSR">interpretation</term> (notice the @ref rather than @xml:id).
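The collection step described above could be sketched roughly as follows. This is only an illustration under stated assumptions, not an existing Guidelines build script: the function name `collect_definitions`, the shortened sample text, and the `pointer_attr` parameter are all hypothetical, and the sample omits the TEI namespace for brevity (real Guidelines source would need namespace-qualified element names).

```python
import xml.etree.ElementTree as ET

# ElementTree expands the predeclared xml: prefix to this qualified name.
XML_ID = "{http://www.w3.org/XML/1998/namespace}id"

# Shortened, namespace-free version of the fragment above (an assumption
# made to keep the sketch small).
SAMPLE = """<p>In particular, the term
<term xml:id="interpretation-FSR">interpretation</term> when applied to a
feature structure is instead <gloss ref="#interpretation-FSR">a minimally
informative extension of that feature structure</gloss>.</p>"""


def normalize(text):
    """Collapse runs of whitespace, including line breaks, to single spaces."""
    return " ".join(text.split())


def collect_definitions(xml_text, pointer_attr="ref"):
    """Map each defined term to the text of the gloss that points at it.

    pointer_attr names the attribute on <gloss> that carries the pointer;
    it is a parameter because the choice of attribute is an open question.
    """
    root = ET.fromstring(xml_text)

    # First pass: terms that carry an xml:id are treated as defining occurrences.
    terms = {}
    for term in root.iter("term"):
        term_id = term.get(XML_ID)
        if term_id:
            terms[term_id] = normalize("".join(term.itertext()))

    # Second pass: pair each gloss with the term its pointer refers to.
    definitions = {}
    for gloss in root.iter("gloss"):
        term_id = gloss.get(pointer_attr, "").lstrip("#")
        if term_id in terms:
            definitions[terms[term_id]] = normalize("".join(gloss.itertext()))
    return definitions


definitions = collect_definitions(SAMPLE)
for term_text, gloss_text in definitions.items():
    print(f"{term_text}: {gloss_text}")
```

Keeping the pointer attribute as a parameter means the same sketch would work if the pointer ended up on a different attribute of <gloss>.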

In some ODDs (and the crux of the request is that this goes way beyond the TEI Guidelines ODD), I could imagine separate lines such as

<entry type="definition">
    <term ident="interpretation-FSR">interpretation</term>
    <gloss>a minimally informative (or equivalently, most general)
        extension of that feature structure that is consistent with a set of constraints declared by an FSD</gloss>
</entry>

(I realise that the above is not valid -- I'm just indicating what usage scenarios I could imagine for such semantic elements -- block definitions in some ODDs, inline in others, but in essence identically marked up, and I'm not saying it should be entry that contains them, though why not; maybe entryFree if that feels better).

ISO definitions are free, for one thing; they could probably be used directly in some ODDs.

bansp commented 2 years ago

Working on #2359, I realised that it might (just 'might'; I'm sharing this while it's hot) make sense to distinguish between 'just mentioning' a term and mentioning it with the purpose of making it stand out and possibly flashing its definition. <soCalled> and <term> could do the job, respectively, and one could then wonder if the semantics of the text is in any way spoiled if a term is 'soCalleded' when mentioned in passing, and 'termed' where it's relevant. As I said above -- just sharing a possibly naive (or subversive!) thought.

bansp commented 2 years ago

I wonder if it would be in any way outrageous to style a term[ident] as bold, and term[ref] as italicized. That would keep the semantics and handle the rendering at the same time.
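A rough sketch of that rendering rule, purely for illustration: the function `render_term` is hypothetical, the HTML output tags are an assumption, and either @xml:id or @ident is treated here as marking the defining occurrence, since both appear in the examples above.

```python
import xml.etree.ElementTree as ET
from html import escape

# ElementTree expands the predeclared xml: prefix to this qualified name.
XML_ID = "{http://www.w3.org/XML/1998/namespace}id"


def render_term(term):
    """Render a <term> element as HTML according to the role it plays.

    Defining occurrences (with @xml:id or @ident) come out bold,
    referring occurrences (with @ref) come out italic, and anything
    else is left as plain text.
    """
    text = escape("".join(term.itertext()))
    if term.get(XML_ID) is not None or term.get("ident") is not None:
        return f"<b>{text}</b>"
    if term.get("ref") is not None:
        return f"<i>{text}</i>"
    return text


defining = ET.fromstring('<term xml:id="interpretation-FSR">interpretation</term>')
referring = ET.fromstring('<term ref="#interpretation-FSR">interpretation</term>')

print(render_term(defining))
print(render_term(referring))
```

The markup itself stays purely semantic; the bold and italic choice lives only in the rendering step.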

lb42 commented 2 years ago

It would be better imho to use the @type on term to indicate its status.

bansp commented 2 years ago

More versatile, definitely. I've seen <term rend="noindex"> in the FS chapter; not sure what that does, precisely (a strange value for @rend, isn't it?). One way or another, that seems to refer to another valid dimension: indexing of the terms. I'm quite excited by this in connection with standards work (I think it would add another advantage to using ODD, and TEI serialization in general, for some standards documents), but, surely, the applications could be much wider.

ebeshero commented 2 years ago

@sabineseifert and @martinascholger This seems like a ticket that might want to be addressed in a small group to formulate some ideas--feel free to reassign, but I hope you two can give this a good start!

hcayless commented 1 month ago

Note (from F2F Buenos Aires): <gloss> does not have @ref, but it is a member of att.pointing, so it has @target, which seems a perfectly good substitute.

bansp commented 1 month ago

Thanks to the Council for having a look. Have fun over there! :-)

trishaoconnor commented 1 month ago

Council agrees to review how we are currently marking up technical terms and definitions and to add guidance to TCW 20 and TCW 24 to ensure that technical terms are encoded consistently going forward. @martinascholger and @sabineseifert are currently compiling a list of all the elements which have been used to mark up technical terms.

sabineseifert commented 1 month ago

related to https://github.com/TEIC/TEI/issues/2602