jmccrae / gwn-scala-api

API for working with GWN formats
Apache License 2.0

Pertainym and Derivation in WNDB #21

Closed goodmami closed 3 years ago

goodmami commented 3 years ago

In the WNDB docs it lists the following pointers and relations:

It's not entirely clear what the difference is, but it seems that derivation is for nouns and verbs and pertainym is for adjectives and adverbs. Pertainym can also cover cases that are not strictly derivational, like lunar \ moon. In any case, the code for reading/writing WNDB databases has some simple problems:

In wndb.scala I see the following for reading:

https://github.com/jmccrae/gwn-scala-api/blob/db00cd3dfbf1ba02c02d1eaa238d90c7cd64f59d/src/main/scala/org/globalwordnet/wnapi/wndb.scala#L456

and:

https://github.com/jmccrae/gwn-scala-api/blob/db00cd3dfbf1ba02c02d1eaa238d90c7cd64f59d/src/main/scala/org/globalwordnet/wnapi/wndb.scala#L469-L473

The second snippet above would be clearer if it combined the `if` and `else if` conditions, since the result is the same. It might also be incorrect to have the `pos == "n"` condition, as nouns don't have pertainyms in PWN 3.0, but maybe they do in 3.1 or other wordnets (I haven't checked)?
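
To make the suggestion concrete, here is a minimal sketch of merging two branches that return the same value. The relation names, the `"a"` condition, and the fallback to derivation are assumptions for illustration; the linked lines in wndb.scala may look different.

```scala
// Hypothetical relation names for illustration; the identifiers in wndb.scala may differ.
sealed trait Relation
case object Pertainym extends Relation
case object Derivation extends Relation

object BackslashPointer {
  // Shape described above: two separate branches that return the same value.
  // The "a" condition and the Derivation fallback are assumed, not taken from the source.
  def separateBranches(pos: String): Relation =
    if (pos == "n") Pertainym
    else if (pos == "a") Pertainym
    else Derivation

  // Equivalent version with the two conditions merged into one.
  def mergedBranches(pos: String): Relation =
    if (pos == "n" || pos == "a") Pertainym
    else Derivation
}
```

Both functions return the same relation for every part of speech, so the merge changes readability only, not behavior.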

For writing WNDB databases it has:

https://github.com/jmccrae/gwn-scala-api/blob/db00cd3dfbf1ba02c02d1eaa238d90c7cd64f59d/src/main/scala/org/globalwordnet/wnapi/wndb.scala#L492

and later:

https://github.com/jmccrae/gwn-scala-api/blob/db00cd3dfbf1ba02c02d1eaa238d90c7cd64f59d/src/main/scala/org/globalwordnet/wnapi/wndb.scala#L505-L506

In the second snippet here, the last `else if(r == derivation)` block will never run because it is preempted by the first one. I think it can be removed, or otherwise the first one needs a secondary condition.
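
A small sketch of the problem and of the "secondary condition" option, under stated assumptions: relations are plain strings here only to keep the example self-contained, and guarding on the adverb part of speech (`"r"`) is a guess at what the secondary condition could be, not something taken from the source.

```scala
object PointerSymbolWriter {
  // Shape described above: the second derivation branch can never be reached.
  def asDescribed(relation: String): String =
    if (relation == "derivation") "+"
    else if (relation == "pertainym") "\\"
    else if (relation == "derivation") "\\" // unreachable: the first branch already matched
    else throw new IllegalArgumentException(s"unsupported relation: $relation")

  // One way to make the later branch reachable: guard the first branch on part of speech.
  // Using "r" (adverb) as the guard is an assumption for illustration.
  def withGuard(relation: String, pos: String): String =
    if (relation == "derivation" && pos != "r") "+"
    else if (relation == "pertainym") "\\"
    else if (relation == "derivation") "\\" // now reached for pos == "r"
    else throw new IllegalArgumentException(s"unsupported relation: $relation")
}
```

With the guard in place, `withGuard("derivation", "r")` falls through to the last derivation branch, while every other part of speech still gets `+`.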

But what is the expected behavior? It currently looks like:

  1. +derivation
  2. \pertainym but throws an error when \ is on a verb
jmccrae commented 3 years ago

I guess at some point \ on an adverb was mapped to derivation not pertainym. This can all be simplified now.
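
One possible reading of "simplified", sketched under assumptions: each symbol maps to exactly one relation regardless of part of speech, and the writer is just the inverse of the reader. The object and method names are hypothetical, and this is not necessarily the change that was made in the repository.

```scala
object SimplifiedPointers {
  // Only the two pointers discussed in this issue are covered; relation names are hypothetical.
  private val symbolToRelation: Map[String, String] = Map(
    "+"  -> "derivation",
    "\\" -> "pertainym"
  )

  // The writer direction is derived by swapping keys and values of the reader map.
  private val relationToSymbol: Map[String, String] = symbolToRelation.map(_.swap)

  // Returns None for symbols or relations outside this sketch.
  def read(symbol: String): Option[String] = symbolToRelation.get(symbol)
  def write(relation: String): Option[String] = relationToSymbol.get(relation)
}
```

Deriving one map from the other keeps reading and writing symmetric, so a duplicated or unreachable branch like the one noted above cannot arise.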