globalwordnet / schemas

WordNet-LMF formats
https://globalwordnet.github.io/schemas/

addition of grammaticalGender to ATTLIST for Lexical Entry and Form? #64

Open anasfkhan81 opened 2 years ago

anasfkhan81 commented 2 years ago

This would be useful for languages with grammatical gender. In languages like Italian, the same Lexical Entry can have singular and plural forms of different genders, so it would be good to have this in the ATTLIST for Form too. I also think part of speech would make more sense as part of the ATTLIST for Lexical Entry if we start to add grammatical attributes at the level of Lexical Entry.

jmccrae commented 2 years ago

This seems very reasonable. Are there any existing wordnets we could base this on?

goodmami commented 2 years ago

Something like this would be very useful. My initial reaction is that an attribute for gender features isn't great because many languages do not use grammatical gender, but then again we already have non-universal things like adjposition and partOfSpeech (and its limited values). But what about other features, such as number, person, tense, and aspect?

Currently we can use <Tag>:

<LexicalEntry id="oewn-goose-n">
  <Lemma writtenForm="goose" partOfSpeech="n">
    <Tag category="NUM">sg</Tag>
  </Lemma>
  <Form writtenForm="geese">
    <Tag category="NUM">pl</Tag>
  </Form>
  ...

But this does seem excessively verbose, and it would be worse with multiple features. It might be nice to have something that encodes multiple features in a regular way on a single attribute, e.g., following the feature labels of UniMorph:

<LexicalEntry id="oewn-sleep-v">
  <Lemma writtenForm="sleep" partOfSpeech="v" />
  <Form writtenForm="sleeps" features="3;SG;PRS;IND" />
  <Form writtenForm="slept" features="PST" />
  ...
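
To give a rough idea of what consuming such an attribute might look like, here is a minimal Python sketch. It assumes the semicolon-separated `features` attribute proposed above, which is not part of the current WN-LMF DTD, and the function name `form_features` is made up for illustration. The fragment is closed off by hand so that it parses on its own.

```python
import xml.etree.ElementTree as ET

# Hypothetical fragment: the `features` attribute is only a proposal in this
# thread and is not defined by the WN-LMF DTD.
SNIPPET = """
<LexicalEntry id="oewn-sleep-v">
  <Lemma writtenForm="sleep" partOfSpeech="v" />
  <Form writtenForm="sleeps" features="3;SG;PRS;IND" />
  <Form writtenForm="slept" features="PST" />
</LexicalEntry>
"""


def form_features(entry_xml):
    """Map each Form's writtenForm to its list of feature labels."""
    entry = ET.fromstring(entry_xml.strip())
    result = {}
    for form in entry.findall("Form"):
        raw = form.get("features", "")
        # Split on ';' and drop empty pieces, so a missing or empty
        # attribute yields an empty list rather than [''].
        result[form.get("writtenForm")] = [f for f in raw.split(";") if f]
    return result


if __name__ == "__main__":
    for written_form, feats in form_features(SNIPPET).items():
        print(written_form, feats)
```

A single delimited attribute keeps the XML compact, but it moves the structure out of the DTD and into string parsing, so validating the individual labels would be left to the consuming tool.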