Closed anacastrosalgado closed 1 year ago
In the Persian-Czech Dictionary we use the <taxonomy> element for this purpose.
<taxonomy xml:id="LeDIIR.taxonomy.grammar">
<bibl>Parts-of-speech</bibl>
<category xml:id="LeDIIR.taxonomy.adv"
n="46e4fe08-ffa0-4c8b-bf98-2c56f38904d9">
<catDesc xml:lang="en">
<idno>adv</idno>
<term>Adverb</term>
<gloss>An adverb, narrowly defined, is a part of speech whose members modify verbs for such categories as time, manner, place, or direction. An adverb, broadly defined, is a part of speech whose members modify any constituent class of words other than nouns, such as verbs, adjectives, adverbs, phrases, clauses, or sentences. Under this definition, the possible type of modification depends on the class of the constituent being modified.</gloss>
</catDesc>
<catDesc xml:lang="cs-CZ">
<idno>adv</idno>
<term>adverbium</term>
<gloss>Adverbium (příslovce). Jelikož i zde slovní druh charakterizuje významovou funkci českého ekvivalentu, řadíme do této kategorie i adverbializované skupiny tvořené pomocí předložek, příslovečné spřežky, např. [be-sádegí] (snadno), [baráje abad] (navěky), [az (rú-je) ettefágh] (náhodou), [tá hálá] (doposud). Jinak jsou typickými slovotvornými příponami adverbií např. z arabštiny přejatá přípona [-an], resp. přípona [-áne], která vzniklé odvozenině dává současně význam adverbia pro neživotné subjekty. O tvorbě adverbií v perštině viz t. <ref target="https://www.jahanshiri.ir/fa/en/adverb-formation"
type="external"/>
</gloss>
</catDesc>
<catDesc xml:lang="fa">
<idno>قید</idno>
<term>قید</term>
<gloss/>
</catDesc>
</category>
</taxonomy>
Hello, @daliboris ! Thanks for sharing. I will analyse it and then get back to you again. It is the first time I'm encoding the front matter of a dictionary. I'm delighted.
@daliboris , thanks. I have a classification proposal for the list itself. I was using type for this: the type attribute distinguished types of abbreviations by their function, the norm attribute was used for POS to follow the Universal POS tags, and the expand attribute supplied the expansion. In the printed edition, we don't have any divisions, and the proposal is to have this classification: POS; usage (domain; time; geographic; sociocultural; textType; frequency); gender; number; grammar (verb subcategories; degrees of adjectives; hint (chapter indications; other text). Are the abbreviations classified in the printed edition of the Persian-Czech Dictionary? Now I'm curious... I see "grammar" in your taxonomy.
>>> @ttasovac @laurentromary
I love taxonomies so what @daliboris is suggesting has a special place in my heart 😄 But taxonomies go into <classDecl> in the header.
What Ana is asking here, however, is for us to consider allowing lists and items for the encoding of, well, lists and items in the dictionary front matter. So it's not about the header, but rather about the content of the dictionary before entries proper. And there, we often find lists of abbreviations used.
So I think this is a good suggestion. And I would vote for it. (And implement it, once I figure out what the hell is wrong with my oXygen workflow...)
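For illustration, here is a minimal sketch of what a list of abbreviations in the front matter could look like in vanilla TEI. The `div` type value and the abbreviations are made up for the example, and whether TEI Lex-0 permits `label` inside `list` is an assumption that this thread does not confirm.

```xml
<front>
  <!-- Hypothetical front-matter section; type value is illustrative only -->
  <div type="abbreviations">
    <head>Abbreviations</head>
    <!-- Vanilla TEI gloss list: each label is followed by its item -->
    <list type="gloss">
      <label>adv.</label>
      <item>adverb</item>
      <label>fig.</label>
      <item>figurative</item>
    </list>
  </div>
</front>
```

The gloss-list form keeps each abbreviation and its expansion as a pair; a plain `list` with only `item` children would also be a list in the sense requested here.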
Hi @anacastrosalgado, I decided to use the @xml:id attribute for different types of taxonomy/abbreviations, and each taxonomy (group of abbreviations) has its own label. In the Persian-Czech Dictionary we use these ids:
LeDIIR.taxonomy.grammar (POS, in fact; maybe I should rename it)
LeDIIR.taxonomy.meaningType (e.g. figurative)
LeDIIR.taxonomy.textType (e.g. historical)
LeDIIR.taxonomy.socioCultural (e.g. formal)
LeDIIR.taxonomy.complexFormType (e.g. idiom)
The Persian-Czech Dictionary is born-digital and will be available only via the web and a mobile application, so I'm not sure what you mean by printed edition. Do you mean a printed version of the generated web page?
Btw, I plan to generate the list of abbreviations from taxonomies directly from
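One way such a generation step might look is sketched below as an XSLT 1.0 stylesheet. It is not the Persian-Czech Dictionary's actual pipeline: the `lang` parameter is hypothetical, the TEI namespace on the input is assumed, and the choice of a gloss list as output is an assumption.

```xml
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:tei="http://www.tei-c.org/ns/1.0"
    xmlns="http://www.tei-c.org/ns/1.0"
    exclude-result-prefixes="tei">

  <!-- Hypothetical parameter: which catDesc language to emit -->
  <xsl:param name="lang" select="'en'"/>

  <!-- Process every taxonomy found in the input document -->
  <xsl:template match="/">
    <xsl:apply-templates select="//tei:taxonomy"/>
  </xsl:template>

  <!-- Turn one taxonomy into one list: idno becomes the label, term the item -->
  <xsl:template match="tei:taxonomy">
    <list type="gloss">
      <head><xsl:value-of select="tei:bibl"/></head>
      <xsl:for-each select="tei:category/tei:catDesc[@xml:lang = $lang]">
        <label><xsl:value-of select="tei:idno"/></label>
        <item><xsl:value-of select="tei:term"/></item>
      </xsl:for-each>
    </list>
  </xsl:template>
</xsl:stylesheet>
```

Only direct child categories are visited; nested categories would need a recursive template.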
Regarding the front matter and <taxonomy>: @ttasovac is right, <taxonomy> can't be used in the <front> (sub)element(s). It was my misunderstanding.
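To make the placement concrete, here is a trimmed sketch of where a taxonomy sits in a TEI header. The ids are taken from the sample earlier in this thread; the rest of the header is omitted, so the fragment is not valid on its own.

```xml
<teiHeader>
  <!-- fileDesc (required in a real header) omitted for brevity -->
  <encodingDesc>
    <classDecl>
      <taxonomy xml:id="LeDIIR.taxonomy.grammar">
        <category xml:id="LeDIIR.taxonomy.adv">
          <catDesc xml:lang="en">
            <idno>adv</idno>
            <term>Adverb</term>
          </catDesc>
        </category>
      </taxonomy>
    </classDecl>
  </encodingDesc>
</teiHeader>
```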
I've started working on this in the dev-0.9.2 branch.
Just to recap: we want to bring back list and item to TEI Lex-0 because the dictionary front matter often contains lists (of abbreviations, domains etc.). However, and this is really important for me, I don't want lists to pop up everywhere where they do in vanilla TEI, including inside certain dictionary elements:
1. elements such as etym allow lists because they have model.inter in their content model; and
2. elements such as def allow lists because their content model includes (my mortal enemy) macro.paraContent.
With b7cb41cdcc4fbdec665938f3d77bfa5fb9d16d87, I have taken care of point 1: list and item are back in the game, but I made sure that they are not allowed in dictScrap, etym, form, gramGrp and xr.
You can test the schema in the 0.9.2 development branch by pointing to https://raw.githubusercontent.com/DARIAH-ERIC/lexicalresources/dev-0.9.2/Schemas/TEILex0/out/TEILex0.rng
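One common way to point a document at that schema, for example in oXygen, is the standard `xml-model` processing instruction. This is a sketch; the skeleton `TEI` element stands in for a real dictionary file.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<!-- Associates this file with the RELAX NG schema from the dev-0.9.2 branch -->
<?xml-model href="https://raw.githubusercontent.com/DARIAH-ERIC/lexicalresources/dev-0.9.2/Schemas/TEILex0/out/TEILex0.rng"
            type="application/xml"
            schematypens="http://relaxng.org/ns/structure/1.0"?>
<TEI xmlns="http://www.tei-c.org/ns/1.0">
  <!-- teiHeader and text go here -->
</TEI>
```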
This issue should stay open until I finish taking care of point 2 above.
I forgot to post here that 104d579b36cd8655650ab031d006758c6d3f7e69 in dev-0.9.2 fixed issue 2 from above: def, gram, hyph, lang, lbl, orth, pron, stress, syll, usg can no longer contain list.
I'm not going to release 0.9.2 just yet, but I am closing this issue because Ana's original request to have lists and items available has been taken care of.
As usual, feel free to reopen if you have any questions related to this.
Concerning the encoding of the introductory pages of MORAIS, I noticed that the TEI Lex-0 specification does not include the element <item>. As I’m encoding the list of abbreviations, this element is essential. Examples:
Can you bring back this TEI element? The same for <list>.