dkpro / dkpro-uby

Framework for creating and accessing UBY resources – sense-linked lexical resources in standard UBY-LMF format
https://dkpro.github.io/dkpro-uby

Upgrade GermaNet converter to GermaNet 8.0 #42

Closed judithek closed 9 years ago

judithek commented 9 years ago
uby.integration.germanet-gpl needs to be upgraded to the new GermaNet 8.0.

Original issue reported on code.google.com by eckle.kohler on 2013-05-03 17:03:35

judithek commented 9 years ago
(No text was entered with this change)

Original issue reported on code.google.com by eckle.kohler on 2013-05-21 18:19:50

judithek commented 9 years ago
GermaNet 8.0 appears to have more Subcat-Frames than 7.0 (210 compared to 204).

The mapping needs to be updated accordingly.

Original issue reported on code.google.com by eckle.kohler on 2013-12-13 08:55:07

judithek commented 9 years ago
comparison of GN6 and GN8:

only in v6.0:
NE.AR.AZ
NN.AN.BC
NN.AN.Pp.Bs

only in v8.0:
NE.PP.DS
NN.AN.An
NN.AN.Pp.Bd
NN.AR.Bo.Bm
NN.AR.Gn
NN.AR.Pp.Bo
NN.Ba
NN.BO.Pp
NN.DN.AZ
NN.PP.Bl
NN.PP.Bo
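
A comparison like the one above can be reproduced with a plain set difference over the two frame inventories. The following is a minimal sketch, not the converter's code: the class `FrameDiff`, the method `onlyIn`, and the placeholder entry `SHARED.FRAME` are hypothetical, and the two small sets stand in for the full frame lists, which would have to be read from the respective GermaNet releases.

```java
import java.util.Arrays;
import java.util.Set;
import java.util.TreeSet;

// Hypothetical helper for comparing two frame inventories.
final class FrameDiff {

    /** Returns the frames contained in {@code a} but not in {@code b}, sorted. */
    static Set<String> onlyIn(Set<String> a, Set<String> b) {
        Set<String> diff = new TreeSet<>(a); // copy so the inputs stay untouched
        diff.removeAll(b);
        return diff;
    }

    public static void main(String[] args) {
        // Tiny illustrative inventories, not the full GermaNet frame lists.
        // "SHARED.FRAME" is a made-up placeholder for a frame present in both.
        Set<String> older = new TreeSet<>(Arrays.asList("SHARED.FRAME", "NE.AR.AZ", "NN.AN.BC"));
        Set<String> newer = new TreeSet<>(Arrays.asList("SHARED.FRAME", "NE.PP.DS", "NN.Ba"));

        System.out.println("only in older: " + onlyIn(older, newer));
        System.out.println("only in newer: " + onlyIn(newer, older));
    }
}
```

Every frame reported as "only in newer" is a candidate for a new entry in the mapping.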

Original issue reported on code.google.com by eckle.kohler on 2013-12-13 09:58:11

judithek commented 9 years ago
The converter fails to convert GermaNet 8.0 when the Interlingual Index (ILI) is included.

This seems to be due to a bug in the GermaNet API, in the class IliLoader: the method processIliRecord throws a NumberFormatException for one of the following two String-to-Integer conversions:

        lexUnitId = Integer.valueOf(parser.getAttributeValue(namespace, GermaNet.XML_LEX_UNIT_ID).substring(1));

        pwn20Sense = Integer.valueOf(parser.getAttributeValue(namespace, GermaNet.XML_PWN20_SENSE));

The pwn20Sense information does not seem to be present at all in interLingualIndex_DE-EN.xml, the XML file that is parsed here.

So an upgrade to GermaNet 8.0 does not make sense until this issue is fixed, either in Tübingen or by creating a patch.
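
For illustration, here is how a missing attribute can lead to this exception. The sketch assumes the parser returns null for an absent attribute, as javax.xml.stream.XMLStreamReader.getAttributeValue does; in that case Integer.valueOf receives null and throws a NumberFormatException. The class `IliAttributeParsing` and the method `parseIntAttribute` are hypothetical and not part of the GermaNet API; they only show one way a patch could guard the conversion.

```java
import java.util.OptionalInt;

// Hypothetical guard around a String-to-int conversion of an XML attribute.
final class IliAttributeParsing {

    /** Parses an attribute value; returns empty if it is missing or not a number. */
    static OptionalInt parseIntAttribute(String rawValue) {
        if (rawValue == null || rawValue.trim().isEmpty()) {
            return OptionalInt.empty();
        }
        try {
            return OptionalInt.of(Integer.parseInt(rawValue.trim()));
        } catch (NumberFormatException e) {
            return OptionalInt.empty();
        }
    }

    public static void main(String[] args) {
        // Assumption: an absent attribute is reported as null by the parser.
        String missing = null;
        try {
            Integer.valueOf(missing);
        } catch (NumberFormatException e) {
            System.out.println("unguarded conversion failed: " + e);
        }

        // The guarded variant reports the missing value instead of throwing.
        System.out.println("missing present? " + parseIntAttribute(missing).isPresent());
        System.out.println("numeric value: " + parseIntAttribute("42").getAsInt());
    }
}
```

Whether a missing pwn20Sense should be skipped or treated as an error is a design decision for the loader; the sketch only makes the missing case explicit.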

Original issue reported on code.google.com by eckle.kohler on 2014-03-26 07:07:52

judithek commented 9 years ago
The reported bug in the GermaNet API turned out to be a Maven problem: a locally cached copy of version 7 of the GermaNet API was used instead of version 8.
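
A quick way to spot this kind of mix-up is to print the location a class was actually loaded from at runtime. The sketch below uses only standard JDK calls; the class `ClassOrigin` and the method `locationOf` are hypothetical names. To check the GermaNet API, one would pass one of its classes (for example the GermaNet class referenced in the snippet above) instead of the demo class.

```java
import java.security.CodeSource;

// Hypothetical diagnostic: reports which jar or directory a class came from.
final class ClassOrigin {

    /** Returns the location a class was loaded from, or a note if none is recorded. */
    static String locationOf(Class<?> type) {
        CodeSource source = type.getProtectionDomain().getCodeSource();
        if (source == null || source.getLocation() == null) {
            return "(no code source recorded, e.g. a JDK class)";
        }
        return source.getLocation().toString();
    }

    public static void main(String[] args) {
        // Replace ClassOrigin.class with the API class under suspicion.
        System.out.println(locationOf(ClassOrigin.class));
        // JDK classes are loaded by the bootstrap loader and carry no code source.
        System.out.println(locationOf(String.class));
    }
}
```

If the printed path points into the local Maven repository at an unexpected version directory, the build is resolving an outdated artifact.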

However, the issue remains that the converter does not create any SenseAxis instances.
So the GermaNet ILI is loaded but the conversion to UBY-LMF fails.

Original issue reported on code.google.com by eckle.kohler on 2014-03-26 08:46:23

judithek commented 9 years ago
(No text was entered with this change)

Original issue reported on code.google.com by eckle.kohler on 2014-06-07 14:08:38