kovvalsky / prove_SICK_NL

Prove Dutch NLI problems of SICK-NL with LangPro
MIT License

Issues with OpenDutchWordNet #6

Open kovvalsky opened 3 years ago

kovvalsky commented 3 years ago

:heavy_check_mark: A hypernymy loop in ODWN

In ODWN:

s('eng-30-13920835-n', _, 'staat', 'n', 1, _).
s('eng-30-13920835-n', _, 'gesteldheid', 'n', 1, _).
s('eng-30-13920835-n', _, 'conditie', 'n', 4, _).

s('eng-30-00024720-n', _, 'kwaliteit', 'n', 3, _).
s('eng-30-00024720-n', _, 'boel', 'n', 2, _).
s('eng-30-00024720-n', _, 'situatie', 'n', 1, _).
s('eng-30-00024720-n', _, 'hoedanigheid', 'n', 2, _).
s('eng-30-00024720-n', _, 'bedoening', 'n', 2, _).

hyp('eng-30-13920835-n', 'eng-30-00024720-n').
hyp('eng-30-00024720-n', 'eng-30-13920835-n').

In Princeton WN3.0 there is no such loop:

s(100024720,1,'state',n,2,39).

s(113920835,1,'condition',n,1,72).
s(113920835,2,'status',n,2,1).

hyp(113920835,100024720).

:heavy_check_mark: Solved by banning synset revisiting during the transitive traversal.
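The idea behind this fix can be illustrated with a minimal Python sketch. It is not the repository's actual code: the function names `build_index` and `hypernym_closure` are hypothetical, and the only data used are the two `hyp` facts quoted above. The traversal remembers every synset it has already seen, so the loop between the two synsets cannot cause infinite recursion.

```python
from collections import defaultdict

# Direct hypernym edges copied from the ODWN facts above: (hyponym, hypernym).
HYP = [
    ("eng-30-13920835-n", "eng-30-00024720-n"),
    ("eng-30-00024720-n", "eng-30-13920835-n"),
]


def build_index(edges):
    """Map each synset to the set of its direct hypernyms (hypothetical helper)."""
    index = defaultdict(set)
    for hypo, hyper in edges:
        index[hypo].add(hyper)
    return index


def hypernym_closure(synset, index):
    """Collect all transitive hypernyms, visiting each synset at most once."""
    visited = {synset}          # the start synset counts as visited
    stack = [synset]
    while stack:
        current = stack.pop()
        for parent in index.get(current, ()):
            if parent not in visited:   # ban revisiting: this breaks the loop
                visited.add(parent)
                stack.append(parent)
    visited.discard(synset)     # a synset is not reported as its own hypernym
    return visited


closure = hypernym_closure("eng-30-13920835-n", build_index(HYP))
print(closure)
```

With the looping pair above, the traversal terminates and reports only the other synset as a hypernym.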

kovvalsky commented 3 years ago

:heavy_check_mark: isa(vrouw, man)

In ODWN, man has a sense meaning person:

s('eng-30-00007846-n', _, 'individu', 'n', 3, _).
s('eng-30-00007846-n', _, 'deze of gene', 'n', 1, _).
s('eng-30-00007846-n', _, 'mens', 'n', 3, _).
s('eng-30-00007846-n', _, 'iemand', 'n', 1, _).
s('eng-30-00007846-n', _, 'man', 'n', 3, _).
s('eng-30-00007846-n', _, 'persoon', 'n', 1, _).
s('eng-30-00007846-n', _, 'figuur', 'n', 4, _).

This causes the vrouw senses eng-30-10787470-n, eng-30-10780632-n, and odwn-10-102902918-n to be hyponyms of the man sense eng-30-00007846-n. This relation is unfortunate for SICK, which assumes that man and woman are disjoint concepts. As a result, 5461: Een vrouw speelt op de fluit <NEUTRAL> Een man speelt op een fluit is wrongly proved as an entailment.

The issue arises because the ODWN sense man-n-3 is mapped to eng-30-00007846-n instead of to the English man-n-3 10289039 (the generic use of the word to refer to any human being: "it was every man for himself"), which is a hyponym of person-n-1 00007846 and does not subsume any senses.

:heavy_check_mark: Solved by applying a patch that bans the man-n-3 sense in ODWN.
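A sense ban of this kind could look roughly like the following Python sketch. The patch format (a set of `(lemma, pos, sense_number)` triples) and the helper names `apply_bans` and `synsets_of` are assumptions made for illustration, not the format actually used in the repository. The sense facts are a subset of the `s` facts listed above.

```python
# Sense facts as (synset, lemma, pos, sense_number), taken from the s facts above.
SENSES = [
    ("eng-30-00007846-n", "mens", "n", 3),
    ("eng-30-00007846-n", "man", "n", 3),
    ("eng-30-00007846-n", "persoon", "n", 1),
]

# Hypothetical patch: senses to drop before the lexical knowledge is used.
BANNED = {("man", "n", 3)}


def apply_bans(senses, banned):
    """Return the sense facts without the banned (lemma, pos, sense_number) triples."""
    return [s for s in senses if (s[1], s[2], s[3]) not in banned]


def synsets_of(lemma, senses):
    """Return the synsets in which the lemma has a sense."""
    return {synset for synset, lem, _pos, _num in senses if lem == lemma}


patched = apply_bans(SENSES, BANNED)
print(synsets_of("man", SENSES))   # man is in the person synset before the patch
print(synsets_of("man", patched))  # and no longer in it afterwards
```

Once man is no longer a member of the person synset, the vrouw senses stop being hyponyms of a man sense, while persoon and mens keep their place in the synset.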

kovvalsky commented 3 years ago

:heavy_check_mark: boom should be a hyponym of plant

211: Entailment
Twee honden spelen bij een boom
Twee honden spelen bij een plant

While eng-30-13104059-n: boom.n.1 is mapped to 13104059: tree.n.1, there is no proper mapping to 00017222: plant.n.2:

s('eng-30-00017222-n', _, 'gewas', 'n', 4, _).
s('eng-30-00017222-n', _, 'vegetatie', 'n', 2, _).
s('eng-30-00017222-n', _, 'plantenleven', 'n', 2, _).

The synset above contains no plant.n.2, whereas plant.n.2 is in a synset that corresponds to 12212361: vegetable.n.2.

:heavy_check_mark: Solved by adding the relation hyp('eng-30-13104059-n', 'eng-30-12212361-n'). We can probably undo this, as training could learn this relation easily.
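The effect of the added relation can be checked with a small Python sketch. It assumes, based on the description above, that the Dutch word plant has a sense in eng-30-12212361-n. The word-level `isa` test and the `ancestors` helper are hypothetical and only show how one extra `hyp` edge makes boom ISA plant derivable.

```python
# Word -> synsets. The entry for "plant" is an assumption based on the text above.
SENSES = {
    "boom": {"eng-30-13104059-n"},
    "plant": {"eng-30-12212361-n"},
}

# No relevant ODWN edge connects the two synsets (other ODWN edges are omitted here).
ORIGINAL_HYP = set()
# The relation added by the fix: (hyponym, hypernym).
PATCH_HYP = {("eng-30-13104059-n", "eng-30-12212361-n")}


def ancestors(synset, edges):
    """Return all transitive hypernyms of a synset, visiting each one once."""
    seen = set()
    frontier = [synset]
    while frontier:
        current = frontier.pop()
        for hypo, hyper in edges:
            if hypo == current and hyper not in seen:
                seen.add(hyper)
                frontier.append(hyper)
    return seen


def isa(word1, word2, senses, edges):
    """True if some sense of word1 equals or is a hyponym of some sense of word2."""
    targets = senses.get(word2, set())
    for synset in senses.get(word1, set()):
        if synset in targets or ancestors(synset, edges) & targets:
            return True
    return False


print(isa("boom", "plant", SENSES, ORIGINAL_HYP))              # before the fix
print(isa("boom", "plant", SENSES, ORIGINAL_HYP | PATCH_HYP))  # after the fix
```

Keeping the added edges in a separate set, as in this sketch, would also make it easy to undo the fix later.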

kovvalsky commented 3 years ago

Where is sense n.1 for iets?

s('eng-30-13740168-n', _, 'iets', 'n', 3, _).
s('eng-30-00001740-n', _, 'iets', 'n', 2, _).

kovvalsky commented 3 years ago

Missing halter ISA gewicht

(1807):[2898]p De man tilt halters op
(1810):[2898]h De man tilt gewichten op

In WN3.0, barbell.n.1 ISA weight.n.2. While halter.n.2 is mapped to barbell.n.1, there is no gewicht sense mapped to weight.n.2.
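Gaps of this kind could be detected automatically. The Python sketch below is only an illustration under stated assumptions: it uses the sense labels from the paragraph above in place of real synset IDs, and the function name `unmapped_hypernyms` is hypothetical. For a Dutch sense, it reports the WN3.0 hypernyms of its mapped synset that no Dutch sense is mapped to.

```python
# WN3.0 hypernymy as (hyponym, hypernym), using sense labels instead of synset IDs.
WN_HYP = {("barbell.n.1", "weight.n.2")}

# Dutch sense -> WN3.0 sense it is mapped to (only the mapping mentioned above).
ODWN_MAPPING = {"halter.n.2": "barbell.n.1"}


def unmapped_hypernyms(dutch_sense, mapping, wn_hyp):
    """Return direct WN hypernyms of the mapped sense that have no Dutch sense."""
    wn_sense = mapping[dutch_sense]
    covered = set(mapping.values())     # WN senses that some Dutch sense maps to
    return {
        hyper
        for hypo, hyper in wn_hyp
        if hypo == wn_sense and hyper not in covered
    }


gaps = unmapped_hypernyms("halter.n.2", ODWN_MAPPING, WN_HYP)
print(gaps)
```

Here the check flags weight.n.2, which matches the missing gewicht mapping described above.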

kovvalsky commented 3 years ago

The relation pizza ISA voedsel is not found. This might upset Italians :it:

4114: N-(C)-[CONTRADICTION]
P: trg: Er is geen man die voedsel eet
   src: There is no man eating food
H: trg: Een man eet een pizza
   src: A man is eating a pizza

kovvalsky commented 3 years ago

Weird one: garnaal ISA persoon (relevant to SICK_NL-3370)

s('odwn-10-100393382-n', _, 'garnaal', 'n', 2, _).
s('eng-30-00007846-n', _, 'persoon', 'n', 1, _).
hyp('odwn-10-100393382-n', 'eng-30-00007846-n').

What is garnaal.n.02?