Closed by ttasovac 1 year ago
@ttasovac I was checking, and for cases such as "v. reciproco" or "v. ativo", I think we will need <gram type="pos"><abbr>xxx</abbr></gram>.
Instead of:
<item>
<gram type="pos" norm="VERB">
<abbr>V.</abbr>
</gram>
<gram type="reflexivity">reciproco</gram>
<p>.</p>
</item>
Even for the abbreviations list, could we have?
<gram type="pos" norm="VERB">v.</gram>
<gram type="reflexivity">reciproco</gram>
Not sure what you mean by xxx, but yes, I would go for reflexivity as a gram type in those cases. That, however, we should discuss in the issue raised by Jesse; I just didn't have time to deal with it this morning. I'm at the airport now...
But in abbreviation lists, I would not use gram at all - just list and item, and then abbr and expan.
Sorry. The xxx was <gram type="pos"><abbr>xxx</abbr></gram>.
Originally, I had this:
<item>
<abbr type="POS" norm="verb">V.</abbr>
<subc>recipr.</subc>
<expan>Verbo reciproco</expan>
<p>.</p>
</item>
If we change subc, don't we need a gram type?
Have a good trip!
Ana, you're mixing up two issues. What I wrote about above is that reintroducing abbr and expan has an undesired consequence that they are now allowed inside gram and other dictionary-specific elements. But that has nothing to do with your simple lists of abbreviations. So forget about norm, subcategorization etc. You don't need any of that.
You want simple abbreviations and expansions, and you want to type your abbreviations (but that's because you want to do that for your dictionary; it's not required by TEI Lex-0):
<item>
<abbr type="gram">V. recipr.</abbr>
<expan>Verbo reciproco</expan>
<pc>.</pc>
</item>
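For what it's worth, a list like this is also trivial to consume downstream. Below is a minimal sketch in Python, assuming each item carries exactly one abbr and one expan; the wrapping list element with type="abbreviations" and the function name read_abbreviations are made up for illustration, and namespaces are left out for brevity.

```python
import xml.etree.ElementTree as ET

# Hypothetical front-matter fragment (no TEI namespace, for brevity).
# The <list type="abbreviations"> wrapper is an assumption, not a requirement.
SAMPLE = """
<list type="abbreviations">
  <item>
    <abbr type="gram">V. recipr.</abbr>
    <expan>Verbo reciproco</expan>
    <pc>.</pc>
  </item>
</list>
"""


def read_abbreviations(xml_text):
    """Map each abbreviation to a (type, expansion) pair."""
    root = ET.fromstring(xml_text)
    table = {}
    for item in root.iter("item"):
        abbr = item.find("abbr")
        expan = item.find("expan")
        if abbr is None or expan is None:
            # Skip items that are not simple abbr/expan pairs.
            continue
        key = "".join(abbr.itertext()).strip()
        value = "".join(expan.itertext()).strip()
        table[key] = (abbr.get("type"), value)
    return table


ABBREVIATIONS = read_abbreviations(SAMPLE)
print(ABBREVIATIONS)
```

Nothing here depends on gram, norm or subcategorization: the abbr/expan pair plus an optional type is enough to build the lookup.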
Ok, understood. I will remove it. My fault.
The thing is: once you allow abbr and expan in the core module, they will pop up in the content models of a bunch of dictionary-specific elements as well. And I don't like it.
In principle, count me in on this, @ttasovac!
But … we should also take things like //choice/{sic,corr} into consideration when discussing //abbr. Obviously, entries may contain typos and other errors we would/could like to retain – especially in faithful digitizations of printed dictionaries. The modeling issue is essentially the same with abbr and sic: we would introduce a secondary annotation tier on top of the primary lexicographical annotation. That's always a hassle with inline mark-up.
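To make the hassle concrete, here is a small sketch in Python using the standard-library ElementTree. The two gram snippets are the ones discussed in this issue; the helper names naive_value and robust_value are hypothetical. It only illustrates the general problem that a consumer reading an element's direct text stops working as soon as a second inline tier wraps the content.

```python
import xml.etree.ElementTree as ET

# The two encodings discussed in this issue (namespaces omitted).
plain = ET.fromstring('<gram type="pos">n.</gram>')
nested = ET.fromstring('<gram type="pos"><abbr>n.</abbr></gram>')


def naive_value(elem):
    # Reads only the text that directly precedes the first child element.
    return elem.text


def robust_value(elem):
    # Flattens any inline tier (abbr, sic, corr, ...) into plain text.
    return "".join(elem.itertext()).strip()


for label, elem in (("plain", plain), ("nested", nested)):
    print(label, repr(naive_value(elem)), repr(robust_value(elem)))
```

Every consumer of gram, usg, orth and so on would need the second variant (or something smarter) once such wrappers are allowed, and the same holds for sic and corr.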
Axel, you are alive! So happy to hear from you.
You are absolutely right, we have a need for correcting typos in the Portuguese dictionary we're working on as well. I'm knee-deep in Horizon Europe applications until March 9th, so I will only be able to come back to this after that... but I will definitely count on your help! :)
Ok, so here's what I did. As I explained above, model.phrase and macro.paraContent were the culprits for a great number of crazy elements that appear inside dictionary elements. abbr and expan were just the tip of the iceberg: despite all of the customizations of TEI Lex-0, we were still inheriting from P5 affiliation, idno, email, all sorts of names etc. inside form, orth etc.
This has been bugging me for a long time. Dictionary elements need their own phrase-level class and a para-level macro, because dictionary elements are not your regular paragraphs. That's why I have created macro.lexicalParaContent and model.lexicalPhrase to reduce the abundance of stuff allowed by P5 inside dictionary elements.
One day, when we have more time, we can explore to what extent this would be worth discussing with the Council, but for now, we have a mechanism for making the content models of dictionary elements lexicographically more appropriate without breaking anything in regular paragraphs, which we use in front matter etc.
I will close this issue for now. @xlhrld, we'll address choice, sic and corr separately...
With 142f9f78cb870ce13d0a74171d0df3793626ac7b, abbr and expan are now allowed in TEI Lex-0 dev-0.9.2. We have an obvious use case in the front matter of the Morais dictionary, and, really, pretty much any other dictionary out there: all kinds of abbreviations are usually listed before the "proper" dictionary content.
Remember, this is about representing the content of the dictionary front matter the way the original author(s) created it; it's not about taxonomies etc., which we deal with in the header, and which we can point to from those simple lists...
So far, so good. The thing is: once you allow abbr and expan in the core module, they will pop up in the content models of a bunch of dictionary-specific elements as well. And I don't like it.
For instance, while <usg type="domain"><abbr>Med.</abbr></usg> wouldn't be technically wrong because "Med." is an abbreviation, it would be lexicographically irrelevant and also superfluous considering that we can have @expand on usg to begin with, not to mention @norm and @value etc. And we've been recommending things like <gram type="pos">n.</gram> from the beginning, so what would be the purpose of having <gram type="pos"><abbr>n.</abbr></gram>? It would only make processing more difficult.
But I don't want to rush anywhere with this. And this would certainly require some additional discussion. I am leaving this issue open as a reminder to myself (and anybody else who may be interested) to think about working out a general strategy for allowing or not allowing abbr and expan in dictionary-specific elements...
abbr and expan are members of model.pPart.editorial. They get into the content model of dictionary-specific elements via model.phrase and macro.lexicalParaContent:
model.phrase ← model.pPart.edit ← model.pPart.editorial
macro.lexicalParaContent ← model.phrase ← model.pPart.edit ← model.pPart.editorial
Dictionary-specific elements with model.phrase in content model: dictScrap, etym, form, gramGrp, sense, xr
Dictionary-specific elements with macro.lexicalParaContent in content model: def, gram, hyph, lang, lbl, orth, pron, stress, syll, usg
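The inheritance chain above can be sketched as a tiny lookup in Python. This is only an illustration built from the names listed in this issue: the dictionaries are hand-written, they are not derived from the actual ODD, and they cover only the relationships mentioned here rather than the full P5 class membership. The function names are hypothetical.

```python
# Relationships as described in this issue.
# Read "A": ["B"] as "A pulls in whatever B provides".
INCLUDES = {
    "macro.lexicalParaContent": ["model.phrase"],
    "model.phrase": ["model.pPart.edit"],
    "model.pPart.edit": ["model.pPart.editorial"],
}

# Direct class members (only the two elements under discussion).
MEMBERS = {"model.pPart.editorial": ["abbr", "expan"]}

# Dictionary-specific elements, keyed by what their content model references.
CONTENT_MODELS = {
    "model.phrase": ["dictScrap", "etym", "form", "gramGrp", "sense", "xr"],
    "macro.lexicalParaContent": [
        "def", "gram", "hyph", "lang", "lbl",
        "orth", "pron", "stress", "syll", "usg",
    ],
}


def reachable_elements(name, seen=None):
    """Collect every element reachable from a class or macro name."""
    seen = set() if seen is None else seen
    if name in seen:
        return set()
    seen.add(name)
    found = set(MEMBERS.get(name, []))
    for included in INCLUDES.get(name, []):
        found |= reachable_elements(included, seen)
    return found


def parents_allowing(element):
    """List dictionary elements whose content model admits `element`."""
    return sorted(
        parent
        for entry, parents in CONTENT_MODELS.items()
        if element in reachable_elements(entry)
        for parent in parents
    )


print(parents_allowing("abbr"))
```

With the data above, all sixteen listed elements end up admitting abbr and expan, which is exactly the side effect described in this issue.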