DARIAH-ERIC / lexicalresources

Data space of the DARIAH Lexical Resources Working Group
https://dariah-eric.github.io/lexicalresources/
BSD 2-Clause "Simplified" License

TEI Lex-0 guidelines v. 0.8.5./Grammatical properties/Typology of gram #135

Open anacastrosalgado opened 3 years ago

anacastrosalgado commented 3 years ago

Good morning, @ttasovac and @laurentromary!

When I look into the grammatical labels defined in TEI Lex-0 guidelines (pos, case, gen, iType, mood, number, per, tns), I can't find any that specify the subclass of certain word classes (part of speech). Is it possible to add a new attribute value for these cases?

E.g. adv. af.; adv. conf. (examples retrieved from VOLP-1940 [Orthographic Vocabulary of the Portuguese Language, Academia das Ciências de Lisboa] list of abbreviations):

I saw that I could use <subc> (https://www.tei-c.org/release/doc/tei-p5-doc/en/html/ref-subc.html). Do you agree?

Part of speech: Open class words

adj. (adjectivo)

adj.

adv. (advérbio)

adv.

[Adverbs are grouped according to their function and value (subclasses) – following the traditional Portuguese grammar classification that is outdated]

adv. af. (advérbio de afirmação)
adv. conf. (advérbio de confirmação)
adv. design. (advérbio de designação)
adv. dúv. (advérbio de dúvida)
adv. excl. (advérbio de exclusão)
adv. interr. (advérbio interrogativo)
adv. lug. (advérbio de lugar)
adv. mod. (advérbio de modo)
adv. neg. (advérbio de negação)
adv. num. (advérbio numeral)
adv. rel. (advérbio relativo)
adv. temp. (advérbio de tempo)

interj. (interjeição)

interj.

[The same for interjections]

interj. excl. (interjeição exclamativa)
interj. voc. (interjeição vocativa)

...

Thank you, Ana

anacastrosalgado commented 3 years ago

Proposal:

<gram type="pos" norm="ADV">adv.</gram>
<gram type="subc" expand="de afirmação">af.</gram>

daliboris commented 3 years ago

The recommended way in TEI Lex-0 is to use <gram type="..."> instead of <pos>, <case>, etc. See chapter 2.3.2:

Considering the goals of TEI Lex-0 to serve as a common baseline and target format for transforming and comparing different lexical resources, we have decided to do away with the specific elements for grammatical properties. Instead, we recommend the use of typed elements. This is a decision that wasn't taken lightly and one which solicited a great deal of discussion. ... The attribute values for gram/@type are a semi-closed list: this means that we will discuss and adopt additional values as demonstrated by examples from dictionaries that are encoded by members of our community.

My proposal:

<gram type="pos" norm="ADV">adv.</gram>
<gram type="pos-sub" expand="de afirmação">af.</gram>

If you use just @type="subc", the main category/type it belongs to will depend on the value of the previous <gram> element.

For example:

<gram type="tense">aorist</gram>
<gram type="tense-sub">asigmatic</gram>

anacastrosalgado commented 3 years ago

Hi, @daliboris ! You're right. My mistake! If Toma agrees, I will use: <gram type="pos-sub" expand="de afirmação">af.</gram>

Thanks.

anacastrosalgado commented 3 years ago

@daliboris You say: "If you use just @type="subc", main category/type will depend on the value of the previous <gram> element." But if I use a "-sub" value, it also always depends on the previous value... I think...

daliboris commented 3 years ago

Hi @anacastrosalgado, you're right.

But I'm also thinking about querying/finding this value.

1) If you want to find all subcategories for POS, for example, in your case an XPath query would be //gram[@type='subc'][preceding-sibling::gram[1][@type='pos']]. In my case it would be "just" //gram[@type='pos-sub']. (If you want to find all categories and subcategories you can use //gram[starts-with(@type, 'pos')].)

2) If you want to find all subcategories for adverbs, you can add just [@norm='ADV'] to "your" XPath: //gram[@type='subc'][preceding-sibling::gram[1][@type='pos'][@norm='ADV']].

In my case, the query will be similar to yours (and just as complicated): //gram[@type='pos-sub'][preceding-sibling::gram[1][@norm='ADV']].
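The two lookups for adverb subcategories can be compared on a toy sample. This is only a sketch using Python's standard library: `xml.etree.ElementTree` implements a limited XPath subset without `preceding-sibling`, so the sibling check is written as a loop. The sample markup, the `INTJ` value, the unnamespaced elements (real TEI files use the TEI namespace), and the helper name `subcategories_of` are assumptions for illustration, not part of TEI Lex-0.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample mixing both encoding styles discussed in this thread.
SAMPLE = """<body>
  <gramGrp>
    <gram type="pos" norm="ADV">adv.</gram>
    <gram type="subc" expand="de afirmação">af.</gram>
  </gramGrp>
  <gramGrp>
    <gram type="pos" norm="ADV">adv.</gram>
    <gram type="pos-sub" expand="de dúvida">dúv.</gram>
  </gramGrp>
  <gramGrp>
    <gram type="pos" norm="INTJ">interj.</gram>
    <gram type="pos-sub" expand="exclamativa">excl.</gram>
  </gramGrp>
</body>"""


def subcategories_of(root, sub_type, norm):
    """Return the text of <gram type=sub_type> elements whose nearest
    preceding <gram> sibling is a pos with the given @norm."""
    found = []
    for parent in root.iter():
        grams = [child for child in parent if child.tag == "gram"]
        # Walk adjacent <gram> pairs: (previous sibling, current element).
        for prev, cur in zip(grams, grams[1:]):
            if (cur.get("type") == sub_type
                    and prev.get("type") == "pos"
                    and prev.get("norm") == norm):
                found.append(cur.text)
    return found


root = ET.fromstring(SAMPLE)
print(subcategories_of(root, "subc", "ADV"))     # ['af.']
print(subcategories_of(root, "pos-sub", "ADV"))  # ['dúv.']
# Without the adverb restriction, the pos-sub variant needs no sibling check:
print([g.text for g in root.findall(".//gram[@type='pos-sub']")])  # ['dúv.', 'excl.']
```

Either way, restricting the result to adverbs requires looking at the preceding element; only the unrestricted "all POS subcategories" lookup gets simpler with "pos-sub".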


But wait, why don't we use the @subtype attribute, which is available for the <gram> element too?

<gram type="pos" norm="ADV">adv.</gram>
<gram type="pos" subtype="subcategory" expand="de afirmação">af.</gram>

Then you can query your dictionary with XPath like this:

Cons:

daliboris commented 3 years ago

Sorry, after a "close reading" of the TEI Lex-0 specification, version 0.8.5, I noticed an example that uses the <gram type="subc"> element (instead of <subc> from the TEI Guidelines):

<gramGrp>
 <gram type="pos">vt</gram>
 <gram type="subc">VP2A</gram>
</gramGrp>

So your initial proposal, @anacastrosalgado, clearly follows the current TEI Lex-0 Guidelines.
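With the guidelines-style encoding, the dependency of "subc" on "pos" can be made explicit when the data is read. A minimal sketch in Python's standard library, assuming unnamespaced elements and a hypothetical helper `pos_with_subc`; it attaches each "subc" value to the most recent "pos" in the same <gramGrp>, which is one possible reading of "depends on the previous element":

```python
import xml.etree.ElementTree as ET

# The example from the TEI Lex-0 guidelines quoted above.
GRAM_GRP = """<gramGrp>
 <gram type="pos">vt</gram>
 <gram type="subc">VP2A</gram>
</gramGrp>"""


def pos_with_subc(gram_grp):
    """Pair each pos value with the subc values that follow it, in order."""
    pairs = []
    current = None
    for gram in gram_grp.findall("gram"):
        kind = gram.get("type")
        if kind == "pos":
            current = (gram.text, [])
            pairs.append(current)
        elif kind == "subc" and current is not None:
            # Attach the subcategory to the most recent part of speech.
            current[1].append(gram.text)
    return pairs


print(pos_with_subc(ET.fromstring(GRAM_GRP)))  # [('vt', ['VP2A'])]
```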