TEIC / TEI

The Text Encoding Initiative Guidelines
https://www.tei-c.org

The term strikes back - terminology chapter #482

Open TEITechnicalCouncil opened 11 years ago

TEITechnicalCouncil commented 11 years ago

During the ISO TC 37 meetings in Pretoria in June, we discussed the future of ISO 30042 TBX and the possibility of fully defining it as an ODD. I made a first start in the form of a smaller subset that could be the basis of a new terminology chapter for the Guidelines. I attach my work so far, asking the council for their views on how to proceed further on this (putting it on SourceForge, setting up a WG, appointing a contact person, etc.).

Original comment by: @laurentromary

TEITechnicalCouncil commented 9 years ago

This issue was originally assigned to SF user: louburnard Current user is: lb42

TEITechnicalCouncil commented 10 years ago

I think expressing TBX in ODD is a good idea, and that TEI-C should support the effort. I don't know much about TBX (nor terminological databases), but would guess that a joint ISO/TEI WG is a good way forward.

I've made some tweaks to the ODD.

And now I'll try again to upload the modified version ...

Original comment by: @sydb

TEITechnicalCouncil commented 10 years ago

Thanks Syd. Maybe a call to get an overview of the work to be done and how to structure it would be ideal. Could the council express an in-principle view as to the opportunity of the work, before we possibly bring in additional TBX experts?

Original comment by: @laurentromary

TEITechnicalCouncil commented 10 years ago

And since I leave traces of my deeds: http://tags.hypotheses.org/10

Original comment by: @laurentromary

TEITechnicalCouncil commented 10 years ago

Thanks Syd. How do we want to proceed on this? Are you the “official” contact point? Shall we finalize a first version that we send around to the TBX community for information/validation?

On 10 Nov 2013 at 22:27, Syd Bauman sbauman@users.sf.net wrote:

I think expressing TBX in ODD is a good idea, and that TEI-C should support the effort. I don't know much about TBX (nor terminological databases), but would guess that a joint ISO/TEI WG is a good way forward.

I've made some tweaks to the ODD.

[feature-requests:#482] The term strikes back - terminology chapter

Status: open Created: Wed Nov 06, 2013 09:07 AM UTC by Laurent Romary Last Updated: Wed Nov 06, 2013 09:07 AM UTC Owner: nobody

Laurent Romary INRIA & HUB-IDSL laurent.romary@inria.fr

Original comment by: @laurentromary

TEITechnicalCouncil commented 10 years ago

Original comment by: @jamescummings

TEITechnicalCouncil commented 10 years ago

At Oxford 2013-11 LB, KH and SB to articulate a more generic policy for TEI on the integration of external standards and will generate a one-page proposal for this policy. This should be provided to Council ahead of the next teleconference. In addition, LB, KH, and SB will check with Laurent that he wants TBX incorporated into the Guidelines as is or kept in sync in the future and what he thinks about melding into the rest of the TEI Guidelines (in language, approach, element naming conventions) or keeping it self-contained.

Original comment by: @jamescummings

TEITechnicalCouncil commented 10 years ago

The objective is clearly to incorporate this step by step into the Guidelines so that we have a reference TEI extension (module?) for terminological (onomasiological) data. There is some thinking to be done on how to optimise the compatibility with other usages of TBX while merging into the TEI mechanisms. I will move ahead with the doc (on the basis of SB's update). Is anyone planning to follow this closely?

Original comment by: @laurentromary

TEITechnicalCouncil commented 10 years ago

The ticket has been assigned to me, so I am certainly going to be following it, as closely as other commitments permit. The Council meeting was unclear as to whether you were aiming to produce/promote a TEI extension -- effectively your current ODD -- which stands independent of the TEI and would not form part of the Guidelines (except perhaps by reference), or whether you plan eventually to convert this to a component of the Guidelines, either as a new module or as part of an existing one. The latter is (obviously) much more work, and might necessitate some major changes in the current TBX world view, but it's certainly worth trying.

Original comment by: @lb42

TEITechnicalCouncil commented 10 years ago

Following discussion with Laurent this morning, it's clear enough that what is planned here is to provide a TEI equivalent for TBX, enabling the useful parts of TBX to be embedded in a natively TEI document. As a first step, LR wants to be sure his ODD is compatible with current TEI practice and can (eventually) be handled in the sf tree along with the rest of the Guidelines. At his request, I've created a new Incubator/TBC directory in the main SF Trunk and put the latest draft there. I expect to review it and add comments later this week.

Original comment by: @lb42

TEITechnicalCouncil commented 10 years ago

It sounds like Laurent and Lou are figuring out whether this would really be its own chapter of the Guidelines. (Maybe they discussed this on 2013-12-03, but I can't tell from Lou's notes what the outcome of that was.)

Still, if the proposed text would constitute its own chapter, it's not clear to me why this wouldn't be incorporated into the existing chapter on dictionaries. That is, why create all new elements (e.g., termEntry) instead of using elements already defined in the dictionaries chapter (e.g., entry) with a compatible meaning and then add new elements when an appropriate element doesn't already exist?

Original comment by: @kshawkin

TEITechnicalCouncil commented 10 years ago

I think deciding whether this work is best presented as a separate chapter or as a new section of an existing one is some way down the track. If you look at the draft you'll see that it's still at a fairly preliminary stage. My discussion with Laurent was chiefly about what needs to be done to the draft to move it along from a stylistic point of view, but also to call attention to parts of it which are not yet clear enough for Council to take a view on.

I'll leave it to Laurent to answer Kevin's specific query above, but my take on it is that an entry in a lexical database is a very different kind of thing from an entry in a print dictionary, with different semantics and an entirely different internal structure, so there is good reason to propose a new element for it.

Original comment by: @lb42

TEITechnicalCouncil commented 10 years ago

The main reason why we need a specific representational model (not just hack the dictionary entry element), is that the model for terminological entry is an onomasiological representation which goes from concept to term as opposed to the semasiological one implemented in word to sense dictionaries. See http://tags.hypotheses.org/10 for a quick overview or http://hal.inria.fr/inria-00100405 for an in-depth presentation of the underlying principles.
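
As a rough illustration of the two directions, here is a minimal Python sketch. It is not part of the ODD or of TBX: the class names `DictionaryEntry` and `ConceptEntry`, the function `to_dictionary_entries`, and the concept identifier `c1` are all hypothetical, and using terms from other languages as stand-in glosses is a simplification made only for this example.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class DictionaryEntry:
    """Semasiological view: one word form, followed by its senses."""
    headword: str
    lang: str
    senses: List[str] = field(default_factory=list)


@dataclass
class ConceptEntry:
    """Onomasiological view: one concept, followed by its terms per language."""
    concept_id: str
    terms: Dict[str, List[str]] = field(default_factory=dict)

    def add_term(self, lang: str, term: str) -> None:
        # Group terms under their language code, keeping insertion order.
        self.terms.setdefault(lang, []).append(term)


def to_dictionary_entries(concept: ConceptEntry, lang: str) -> List[DictionaryEntry]:
    """Derive word-to-sense entries for `lang` from a concept-to-term entry.

    Terms in every other language are used as stand-in glosses (an assumption
    of this sketch, not something TBX or TEI prescribes).
    """
    glosses = [t for other, ts in concept.terms.items() if other != lang for t in ts]
    return [DictionaryEntry(term, lang, list(glosses)) for term in concept.terms.get(lang, [])]


concept = ConceptEntry("c1")  # hypothetical concept identifier
concept.add_term("la", "acerbus")
concept.add_term("en", "bitter")
concept.add_term("en", "harsh")
entries = to_dictionary_entries(concept, "la")
```

The point of the sketch is that the concept entry has no privileged headword: the same concept yields one Latin dictionary entry or two English ones, depending on which language is chosen as the entry point.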

Original comment by: @laurentromary

TEITechnicalCouncil commented 10 years ago

Okay. I was thinking of loosening the content model of <entry> to account for concept-to-term entries, but perhaps it is more sensible to create a new element (<termEntry>) that can have a tight content model to be used only for concept-to-term entries, leaving <entry> for just term-to-concept entries.

Original comment by: @kshawkin

TEITechnicalCouncil commented 10 years ago

Is it possible that the issue of list type="gloss", which will be the only remaining list/@type value after we rework our list recommendations, might be solved by the idea of a <termEntry>?

Original comment by: @martindholmes

TEITechnicalCouncil commented 10 years ago

I think it would. Can you give me an example I could code?

Original comment by: @laurentromary

TEITechnicalCouncil commented 10 years ago

This is an example from chapter 3.7:

<list type="gloss">
 <head>Unit Three — Vocabulary</head>
 <label xml:lang="la">acerbus, -a, -um </label>
 <item>bitter, harsh</item>
 <label xml:lang="la">ager, agrī, M. </label>
 <item>field</item>
 <label xml:lang="la">audiō, īre,
   īvī, ītus </label>
 <item>hear, listen (to)</item>
 <label xml:lang="la">bellum, -ī, N. </label>
 <item>war</item>
 <label xml:lang="la">bonus, -a, -um </label>
 <item>good</item>
</list>

Original comment by: @martindholmes

TEITechnicalCouncil commented 10 years ago

This is indeed an interesting borderline case and it all depends on how this sample is perceived. If you have a semasiological view on these lexical groups, you may consider the entry point is the Latin word and the rest are glosses expressing its sense(s). This leads to a normal usage of [entry]:


        <entry>
          <form xml:lang="la">
            <orth>acerbus</orth>
            <form type="inflected">
              <orth>-a</orth>
            </form>
            <form type="inflected">
              <orth>-um</orth>
            </form>
          </form>
          <sense>
            <gloss xml:lang="en">bitter, harsh</gloss>
          </sense>
        </entry>

If you have an onomasiological view on this, you will consider each block as the description of a concept, for which you identify the possible ways of expressing it in any number of languages (here la and en, but you can add as many as you want ad libitum). This is the realm of TBX and the (simplified) encoding may look like this:


       <termEntry xmlns="http://www.tbx.org">
          <langSet xml:lang="la">
            <tig>
              <term>acerbus</term>
            </tig>
          </langSet>
          <langSet xml:lang="en">
            <tig>
              <term>bitter</term>
            </tig>
            <tig>
              <term>harsh</term>
            </tig>
          </langSet>
        </termEntry>
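
To make the mapping from the gloss list to this kind of encoding concrete, here is a minimal sketch using Python's standard `xml.etree.ElementTree`. Several things in it are assumptions rather than anything fixed by the draft ODD: the function names are hypothetical, the input list is taken to be in no namespace, the gloss language (not marked in the source list) is passed in as a parameter defaulting to `en`, only the text before the first comma of each label is kept as the term, and comma-separated glosses are treated as separate terms.

```python
import xml.etree.ElementTree as ET

XML_NS = "http://www.w3.org/XML/1998/namespace"
TBX_NS = "http://www.tbx.org"  # namespace as used in the example above
LANG = f"{{{XML_NS}}}lang"

GLOSS_LIST = """<list type="gloss">
 <label xml:lang="la">acerbus, -a, -um </label>
 <item>bitter, harsh</item>
 <label xml:lang="la">bellum, -ī, N. </label>
 <item>war</item>
</list>"""


def add_lang_set(entry, lang, terms):
    """Append a langSet for `lang` holding one tig/term per term string."""
    lang_set = ET.SubElement(entry, f"{{{TBX_NS}}}langSet", {LANG: lang})
    for term in terms:
        tig = ET.SubElement(lang_set, f"{{{TBX_NS}}}tig")
        ET.SubElement(tig, f"{{{TBX_NS}}}term").text = term


def gloss_list_to_term_entries(list_xml, gloss_lang="en"):
    """Turn each label/item pair of a gloss list into one termEntry element."""
    root = ET.fromstring(list_xml)
    entries = []
    for label, item in zip(root.findall("label"), root.findall("item")):
        entry = ET.Element(f"{{{TBX_NS}}}termEntry")
        # Keep only the citation form; inflection hints after the comma are dropped.
        headword = (label.text or "").split(",")[0].strip()
        add_lang_set(entry, label.get(LANG, "und"), [headword])
        # Simplification: every comma-separated gloss becomes its own term.
        glosses = [g.strip() for g in (item.text or "").split(",") if g.strip()]
        add_lang_set(entry, gloss_lang, glosses)
        entries.append(entry)
    return entries


term_entries = gloss_list_to_term_entries(GLOSS_LIST)
first_terms = [t.text for t in term_entries[0].iter(f"{{{TBX_NS}}}term")]
```

Dropping the inflection information is lossy; a fuller conversion would need somewhere to record it, which is one of the questions the draft would have to settle.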

Original comment by: @laurentromary

TEITechnicalCouncil commented 10 years ago

Laurent has now published an article explaining his ideas for this revision in more detail: http://hal.inria.fr/hal-00950862

Original comment by: @lb42

TEITechnicalCouncil commented 10 years ago

With the paper you'll find a zip with the current ODD/rnc and a test file. Depending on feedback, I'll update the paper and the ODD. It should not be presented before June (submitted at TKE in Berlin).

Original comment by: @laurentromary

TEITechnicalCouncil commented 10 years ago

FYI, that generic policy is in process: https://sourceforge.net/p/tei/feature-requests/507/ .

Original comment by: @kshawkin

TEITechnicalCouncil commented 9 years ago

Laurent, could you update us on any feedback you got from Berlin?

Original comment by: @martindholmes

TEITechnicalCouncil commented 9 years ago

The paper and the ODD have been updated. We have a stable spec, although it would require (sigh) more prose and examples. I am now recommending this customization for people wanting to do onomasiological descriptions in TEI documents. Can council give guidance as to the way forward? Shall we have a little task force of motivated colleagues (volunteers?)? First step could be to have this on GitHub as well.

Original comment by: @laurentromary

TEITechnicalCouncil commented 9 years ago

Please could you check the latest stable spec into the Sourceforge "Incubator" repository (or send it me by email and I will).

Original comment by: @lb42

lb42 commented 8 years ago

No sign of life on this ticket since May 2015, so marking it blocked.

laurentromary commented 8 years ago

The thing is being tried out in a couple of projects. Whatever blocked means, we can wait until there is a window of opportunity with the council.

lb42 commented 8 years ago

"Blocked" means we can't progress it because we don't have a draft to consider.

jamescummings commented 7 years ago

@laurentromary Is there a draft now, or something concrete for council to consider? It seems from the previous comments that there wasn't. We could close this ticket and open a new one when there is something active for council to debate. What is the status?

laurentromary commented 7 years ago

@stefanpernes has been working on this with me in the last 6 months. We need to get in sync to compile a good proposal. Would the F2F in Vancouver be an appropriate time for you to have this on your plate?

jamescummings commented 7 years ago

I'm sure Council could give it some face to face time in Vancouver, assuming we've had time to read it, etc.