Dear Alberto,
the section you are referring to is not showing a list of allowed elements in TEI Lex-0, but discussing what options exist in the TEI Guidelines themselves.
I'm not seeing the example in which <mentioned> is used; it may have been swallowed up by GitHub's formatting. If you are by chance referring to the example in https://dariah-eric.github.io/lexicalresources/pages/TEILex0/TEILex0.html#TEI.etym, i.e. in the Elements specification, this is taken over by default from TEI itself and has long been a source of annoyance for me, because I know it is super confusing for users. We should probably find a way to override the examples in the element specifications with our own, but I haven't had time to do that, and it would be quite a lot of work to do it for each element we allow in TEI Lex-0.
Finally, as for the TEI Lex-0 recommendation for etymologies, a few of us have worked on that but the results are still only in a paper, have not been fully discussed with the wider group and have not been distilled for the Guidelines. But you can check out the paper here:
https://hal.inria.fr/hal-03108781
I will close the issue now, but feel free to reopen if you have further questions.
(I was going to close this, too. But anyway, I'll leave some further hints since I wrote this already.)
There is ongoing work in modeling the etymology section in TEI Lex-0. The main ticket in this regard is #26.
The general approach will be: use <cit> instead of the severely restricted <mentioned>. <cit> allows for a much more detailed representation of what's mentioned, including grammatical properties, definitions or quotations. The markup becomes a bit more complex, though, e.g. (minimally):
<cit type="etymon">
  <form>
    <orth>mentioned_word</orth>
    <!-- possibly variants -->
  </form>
  <!-- possibly grammatical properties via ./gramGrp -->
  <!-- possibly definitions via ./def -->
  <!-- … -->
</cit>
I have the feeling I would know at what time to start our TEI Lex-0 meetings ;-). But yes, the more structured representation mentioned by @xlhrld allows one to search etymological content precisely, which is what we need across varieties of lexical sources.
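To illustrate the point about precise searching, here is a minimal sketch in Python of how etymons encoded with <cit type="etymon"> could be pulled out of an entry. It only uses the standard library. The surrounding <entry>, the xml:lang value and the sample content are assumptions made up for the illustration, and find_etymons is a hypothetical helper, not part of any TEI Lex-0 tooling.

```python
import xml.etree.ElementTree as ET

# TEI namespace, plus the predefined XML namespace used by xml:lang.
TEI_NS = {"tei": "http://www.tei-c.org/ns/1.0"}
XML_LANG = "{http://www.w3.org/XML/1998/namespace}lang"

# Assumed sample entry: the wrapper and the language code are illustrative only.
SAMPLE = """
<entry xmlns="http://www.tei-c.org/ns/1.0" xml:id="example-entry">
  <form type="lemma"><orth>headword</orth></form>
  <etym>
    <cit type="etymon" xml:lang="la">
      <form><orth>mentioned_word</orth></form>
    </cit>
  </etym>
</entry>
"""


def find_etymons(entry):
    """Return (language, form) pairs for every etymon cited inside etym."""
    results = []
    for cit in entry.iterfind(".//tei:etym//tei:cit[@type='etymon']", TEI_NS):
        lang = cit.get(XML_LANG)  # None when the etymon carries no xml:lang
        for orth in cit.iterfind("./tei:form/tei:orth", TEI_NS):
            results.append((lang, orth.text))
    return results


entry = ET.fromstring(SAMPLE.strip())
print(find_etymons(entry))  # [('la', 'mentioned_word')]
```

With a flat <mentioned> element the same query would only return a string; with <cit> the form, its language and any grammatical information or definitions stay separately addressable.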
Thanks for the suggestions and pointers.
My main concern is that some words mentioned in etymologies (word origins) are repeated a lot across the dictionary. So it will not make sense to attach grammatical properties or definitions to each of them. It can probably be seen more as a link (well, probably a broken link) to that word in some other dictionary/resource. This will not make the dictionary self-contained, but it has the benefit of guaranteeing that no information is duplicated in the resource.
According to the current schema, it looks like <mentioned> is not allowed inside <etym>. The section at https://dariah-eric.github.io/lexicalresources/pages/TEILex0/TEILex0.html#index.xml-body.1_div.6_div.1 has the list of allowed elements, and <mentioned> is not one of them. Nevertheless, <mentioned> is used in the example below that list. This should be fixed.
I take the chance to ask what the suggested replacement for <mentioned> is when annotating a foreign word (origin). Thank you