DARIAH-ERIC / lexicalresources

Data space of the DARIAH Lexical Resources Working Group
https://dariah-eric.github.io/lexicalresources/
BSD 2-Clause "Simplified" License

Example sought for cit/def #13

Closed ttasovac closed 4 years ago

ttasovac commented 5 years ago

In the narrative, we say:

TEI Lex-0 allows the use of def in sense, cit and etym only. All other existing contexts would be implemented by embedding def within a sense.

Our schema reflects that.

I would like to have an example with cit/def. Could somebody find one and share it here?

xlhrld commented 5 years ago

Maybe this snippet from Mueller (1879):

Lobby vorhalle; altengl. lobie, mlat. lobia, laubia, lobium: „porticus operta ad spatiandum idonea, aedibus adjuncta, galerie, ex laub teuton. folium, quod ejus modi deambulatoria in praediis rusticis foliis obducantur et operiantur“ Ducange; […]

This should be marked-up along the lines of:

<entry xml:id="...">
    <form><orth>Lobby</orth></form>
    <sense>
        <def>vorhalle</def>
        <cit type="etym">
            <lang>altengl.</lang>
            <form xml:lang="en"><orth>lobie</orth></form>
        </cit>
        <cit type="etym">
            <lang>mlat.</lang>
            <form xml:lang="la"><orth>lobia</orth></form>
            <form xml:lang="la"><orth>laubia</orth></form>
            <form xml:lang="la"><orth>lobium</orth></form>
            <cit type="quote">
                <def xml:lang="la">porticus operta ad spatiandum idonea, aedibus adjuncta, galerie, ex laub teuton. folium, quod ejus modi deambulatoria in praediis rusticis foliis obducantur et operiantur</def>
                <bibl><author>Ducange</author></bibl>
            </cit>
        </cit>
    </sense>
</entry>

It would be a pity to have the Latin definition cited from Ducange marked-up only as <quote> (which it formally is, of course, but it really serves as a definition here).

xlhrld commented 5 years ago

Also, in cit/@type="etymon" and cit/@type="cognate" it's common to have def like so:

<cit type="cognate" xml:lang="de">
  <lang>bret.</lang>
  <form>
    <orth xml:lang="br">braô</orth><pc>,</pc>
    <orth xml:lang="br">brav</orth>
  </form>
  <def>schön, lieblich</def>
</cit>

kdepuydt commented 5 years ago

We do not encode defs in quoted dictionaries because they are not defs “within the dictionary we are dealing with”. If you do encode them, there should be a clear and simple way of determining that they are NOT definitions of the current headword in the current dictionary.

ttasovac commented 4 years ago

This will actually become clear once we integrate the etym section from the paper that Laurent, Axel, Jack and I are working on.

We have since become stricter about def: we don't actually want to allow cit/def in etymologies, but would rather go for cit/gloss, because these "definitions" in etymologies are not proper lexicographic definitions. They are a kind of semantic shorthand that signals what the word means without defining it fully. Our guidelines now say that def should be used only within a sense.
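
As a rough illustration of that rule (a sketch, not an excerpt from the guidelines), the Breton cognate from earlier in this thread would then carry its meaning in gloss rather than def. The only intended change is def becoming gloss; xml:lang on orth is assumed here.

```xml
<cit type="cognate" xml:lang="de">
  <lang>bret.</lang>
  <form>
    <orth xml:lang="br">braô</orth><pc>,</pc>
    <orth xml:lang="br">brav</orth>
  </form>
  <!-- semantic shorthand for the cognate's meaning, hence gloss, not def -->
  <gloss>schön, lieblich</gloss>
</cit>
```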

anacastrosalgado commented 4 years ago

@ttasovac, if you need an example of gloss close to a definition: [https://hal.inria.fr/hal-02618067]

<etym type="borrowing">
   <lbl>etim</lbl>
   <cit type="etymon">
      <lang expand="francês" norm="fr">fr.</lang>
      <form xml:lang="fr">
         <orth>opaline</orth>
      </form>
      <bibl type="attestation">
         <pc>(</pc><date>1899</date><pc>)</pc>
      </bibl>
      <pc>’</pc><gloss>tipo de vaso vitrificado</gloss><pc>’</pc>
      <pc>;</pc>
      <xr type="related">
         <lbl>ver</lbl>
         <ref type="entry">opal(i)-</ref>
      </xr>
      <note>a datação é para a ace., não registada aqui, género de infusórios que se encontra no ventre das rãs</note>
      <pc>;</pc>
   </cit>
</etym>

laurentromary commented 4 years ago

From a lexicographic point of view, I would qualify this as a definition (with a clear genus and differentiae); it does not correspond to an equivalent word in the target language (as one would expect for <gloss>). I would thus go for <def> here. @ttasovac: to be recorded in the documentation if we lack examples.

ttasovac commented 3 years ago

hm... I wouldn't go for a def here. A proper definition would be "resembling opal in its iridescence; having a milky white iridescence" or "a clear to white liquid secreted by sea hares (genus Aplysia) that becomes viscous upon contact with water" — to me, this is still a gloss. there is a genus, but really no differentia specifica...

Another thing that is often an indicator of a gloss in etymologies (although, admittedly, that varies from dictionary to dictionary) is the single quotation marks around tipo de vaso vitrificado...

laurentromary commented 3 years ago

Given the two sets of arguments, this is an ambiguous one, i.e. an encoder may decide on one or the other. I mean: imagine putting this as a clear instruction in the documentation? :-}

anacastrosalgado commented 3 years ago

As a lexicographer, I would prefer to always use gloss. Otherwise, it could lead to big debates about what a 'definition' is and what a 'gloss' is. I agree with Toma.

Examples from Lisbon Academy of Sciences dictionary:

Casa [home] Do lat. casa ’cabana’
Convento [convent] Do lat. conventus ’congregação’
Fortificação [fortification] Do lat. fortificatio, -ōnis ’acção de fortificar’
Indolente [indolent] Do lat. indŏlens, -entis ’que não sofre’

What is cabana/congregação? A definition? No... A synonym? It's the meaning (explanation) of the Latin word. Can "acção de fortificar/que não sofre" be considered a definition?

In all these examples, I go for "gloss".

"gloss is a textual description of a sense’s meaning meant for human interpretation."
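
A minimal sketch of how the first of these entries (Casa) could look with gloss, modelled on the opaline snippet above. Wrapping "Do" in lbl and the choice of type="etymon" are assumptions made for illustration, not something the guidelines are known to prescribe.

```xml
<!-- Casa [home] Do lat. casa ’cabana’ -->
<etym>
  <!-- assumption: the introductory "Do" is treated as a label -->
  <lbl>Do</lbl>
  <cit type="etymon">
    <lang>lat.</lang>
    <form xml:lang="la">
      <orth>casa</orth>
    </form>
    <!-- meaning of the Latin etymon, given as a gloss rather than a def -->
    <pc>’</pc><gloss>cabana</gloss><pc>’</pc>
  </cit>
</etym>
```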

ceolfrid commented 3 years ago

I usually describe the difference this way to lexicographers that ask:

gloss - translation equivalent of a term
definition - narrative explanation of the meaning of a term

While the distinction can get fuzzy, as illustrated in this thread, making the distinction is very useful computationally. For instance, matches found in glosses are more indicative of relevance than those found in definitions so I always give them different weights. Also some processes really only want glosses. I had one customer that used our dictionaries to generate bilingual word clouds based on document content. Definitions would have made them really messy, so they only used the glosses. Having those distinctions made that possible.

I don't have a good use case for making the distinction in etymological entries, but I would caution against throwing out the distinction.

anacastrosalgado commented 3 years ago

@ceolfrid Thanks for your comment. I really understand what you are saying. The problem is that several theories also use the term gloss for the different meanings of a lexicographic entry. I want to make a distinction between what I have in "definition" and the information that explains the etymology of an entry or, as you say, "an equivalent of the term". So I'm using "definition" only for "senses" and "gloss" only in the etymology field.

When we compare different Portuguese dictionaries, we have, for example:

melífluo [mellifluous] Do latim melliflŭu-, «idem» (Infopédia, Porto Editora)
melífluo [mellifluous] lat. melliflŭus, a, um 'de onde corre o mel' (Houaiss)

I think for encoding purposes it is better to consider «idem» and 'de onde corre o mel' as gloss. Why do you need to make a distinction? Won't this hinder interoperability?

The «idem» in Infopédia means that the meaning is the same in Latin and in Portuguese (at least the original, literal meaning). This information may not be beneficial for the user, but that isn't the issue for the moment. In Houaiss, I know that the information is close to a definition, but we have a lot of different criteria in these cases. Sometimes lexicographers use synonyms, other times definitions...
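
To make the comparison concrete, here is a hedged sketch of both etymologies encoded with gloss. The lbl for "Do", the type="etymon" value, and keeping the Houaiss endings "a, um" inside a single orth are simplifying assumptions, not established practice.

```xml
<!-- Infopédia: Do latim melliflŭu-, «idem» -->
<etym>
  <lbl>Do</lbl>
  <cit type="etymon">
    <lang>latim</lang>
    <form xml:lang="la"><orth>melliflŭu-</orth></form>
    <pc>,</pc>
    <pc>«</pc><gloss>idem</gloss><pc>»</pc>
  </cit>
</etym>

<!-- Houaiss: lat. melliflŭus, a, um 'de onde corre o mel' -->
<etym>
  <cit type="etymon">
    <lang>lat.</lang>
    <!-- assumption: inflectional endings left in orth for brevity -->
    <form xml:lang="la"><orth>melliflŭus, a, um</orth></form>
    <pc>'</pc><gloss>de onde corre o mel</gloss><pc>'</pc>
  </cit>
</etym>
```

With the same element in both, a consumer can pull the etymon's meaning from cit/gloss regardless of whether the source dictionary wrote a synonym, «idem», or something closer to a definition.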

Cf. OED (Oxford English Dictionary): 'flowing with honey' is the same as in Houaiss ('de onde corre o mel').

mellifluous, adj.

Origin: A borrowing from Latin, combined with an English element.
Etymons: Latin mellifluus , -ous suffix.
Etymology: < post-classical Latin mellifluus sweet as honey, flowing with honey (late 4th cent.; < classical Latin mell- , mel honey (see mell n.2) + -fluus < fluere to flow: see fluent adj.) + -ous suffix. Compare ancient Greek μελίρρυτος flowing with honey. Compare earlier melliflue adj. Post-classical Latin mellifluus is used to designate St Bernard from at least the mid 14th cent.; with the phrase the mellifluous doctor (see quot. 1483 at sense 2b) compare post-classical Latin mellifluus doctor (1535 or earlier).

When I used the "term" gloss I was thinking about the senses. "gloss is a textual description of a sense’s meaning meant for human interpretation."

I think I have to work on the definitions to make a clear distinction between the two terms: gloss and definition. @laurentromary and @ttasovac, do you have any thoughts? Do you understand what I tried to explain?

OED (original meaning of gloss): A word inserted between the lines or in the margin as an explanatory equivalent of a foreign or otherwise difficult word in the text; hence applied to a similar explanatory rendering of a word given in a glossary or dictionary. Also, in a wider sense, a comment, explanation, interpretation. Often used in a sinister sense: A sophistical or disingenuous interpretation.