dkpro / dkpro-similarity

Word and text similarity measures
https://dkpro.github.io/dkpro-similarity

A question about the resource graph of Wiktionary #49

Closed tutubalinaev closed 7 years ago

tutubalinaev commented 8 years ago

Hello! The section "Lexical Semantic Resources for Word Aggregation Measures" contains the following sentence: "The resource graphs can be downloaded here: Wiktionary [link] and WordNet [link]". There is no information on how these two files were created.
I assumed that you used JWKTL.parseWiktionaryDump(dumpFile, outputDirectory, overwriteExisting), so I created an updated Wiktionary file with it. However, the call

LexicalSemanticResource wordnet = ResourceFactory.getInstance().get("wiktionary", "ru")

fails with the following error:

Exception in thread "main" org.springframework.beans.factory.BeanCreationException: Error creating bean with name 'wiktionary-ru' defined in URL [file:/.../DKPRO/de.tudarmstadt.ukp.dkpro.lexsemresource.core.ResourceFactory/resources.xml]: Instantiation of bean failed;
nested exception is org.springframework.beans.BeanInstantiationException: Could not instantiate bean class [de.tudarmstadt.ukp.dkpro.lexsemresource.wiktionary.WiktionaryResource]: Constructor threw exception;
nested exception is de.tudarmstadt.ukp.wiktionary.api.WiktionaryException: Unable to establish a db connection;
Caused by: com.sleepycat.persist.IndexNotAvailableException: (JE 5.0.73) PrimaryIndex not yet available on this Replica, entity class: de.tudarmstadt.ukp.wiktionary.api.entry.WiktionaryPage
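The error above is a chain of nested exceptions: the Spring BeanCreationException is only the outermost wrapper, and the innermost cause (the com.sleepycat IndexNotAvailableException, which comes from the Berkeley DB Java Edition store) is usually the most informative part. As a minimal sketch of how such a chain can be unwrapped programmatically, the hypothetical helper below walks getCause() down to the root. The class name RootCause and the stand-in exceptions in main are assumptions for illustration, not part of DKPro or JWKTL.

```java
public class RootCause {

    // Follows getCause() links until the innermost exception is reached.
    public static Throwable rootCauseOf(Throwable t) {
        Throwable current = t;
        while (current.getCause() != null) {
            current = current.getCause();
        }
        return current;
    }

    public static void main(String[] args) {
        // Stand-in exceptions; in practice the chain would come from
        // wrapping the ResourceFactory call in a try/catch block.
        Exception inner = new IllegalStateException("PrimaryIndex not yet available");
        Exception outer = new RuntimeException("Error creating bean", inner);

        Throwable root = rootCauseOf(outer);
        System.out.println(root.getClass().getName() + ": " + root.getMessage());
    }
}
```

Printing the root cause first makes it easier to see that the failure originates in the on-disk database rather than in the Spring configuration itself.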

Could you please describe how to create the graphs of the Wiktionary dump for Word Aggregation Measures?

nicolaierbs commented 8 years ago

Hi, the resources were created using DKPro LSR (https://github.com/dkpro/dkpro-lsr/). I recommend using the existing resources or adapting the code to work with UBY (https://dkpro.github.io/dkpro-uby/). In the future, we plan to extend DKPro Similarity with a general interface to UBY.

tutubalinaev commented 8 years ago

Hi, thank you for your response. However, the UBY databases don't include the Russian Wiktionary. How can I use this Wiktionary dump if I can neither create a new resource nor work with UBY?

reckart commented 8 years ago

That question is probably best asked on the Uby mailing list: https://groups.google.com/forum/#!forum/uby-users

tutubalinaev commented 8 years ago

No, it's not. My main question is about dkpro-similarity. How can I use dkpro-similarity's functions if I can't create the dump of Wiktionary? I can't use the UBY databases or create the dumps due to the described error.

reckart commented 8 years ago

Hm, I see. Last time I used DKPro Similarity it didn't even have that name yet ;)

If you have created a resource for a new language in a format that would be compatible with DKPro Similarity, you'll have to register it in the resources.xml file (cf. https://public.ukp.informatik.tu-darmstadt.de/dkprosimilarity/resources.xml)

Maybe try adding a section like this:

<bean id="wiktionary-ru" lazy-init="true" class="de.tudarmstadt.ukp.dkpro.lexsemresource.wiktionary.WiktionaryResource">
    <constructor-arg value="RUSSIAN"/>
    <constructor-arg value="${DKPRO_HOME}/LexSemResources/Wiktionary/jwktl_SOMEVERSION_ruSOMEDATE"/>
</bean>

Replace SOMEVERSION and SOMEDATE with the values for your system, or set the second constructor argument to another folder where you have stored your resource.
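Before wiring the folder into resources.xml, it may help to confirm that the path the second constructor argument points to actually exists and is not empty. The following is only a sketch under stated assumptions: it assumes DKPRO_HOME is available as an environment variable, and the class ResourcePathCheck and its methods are hypothetical helpers, not part of DKPro LSR or JWKTL. It checks the folder on disk only and says nothing about whether the parsed database inside it is valid.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.stream.Stream;

public class ResourcePathCheck {

    // Mirrors the layout used in the bean definition:
    // <home>/LexSemResources/Wiktionary/<folderName>
    public static Path resolveWiktionaryDir(String dkproHome, String folderName) {
        return Paths.get(dkproHome, "LexSemResources", "Wiktionary", folderName);
    }

    // True only if the path is an existing directory with at least one entry.
    public static boolean isUsableDirectory(Path dir) throws IOException {
        if (!Files.isDirectory(dir)) {
            return false;
        }
        try (Stream<Path> entries = Files.list(dir)) {
            return entries.findAny().isPresent();
        }
    }

    public static void main(String[] args) throws IOException {
        // Assumption: DKPRO_HOME is set as an environment variable.
        String home = System.getenv("DKPRO_HOME");
        if (home == null) {
            System.err.println("DKPRO_HOME is not set");
            return;
        }
        // Placeholder folder name; substitute the folder created on your system.
        Path dir = resolveWiktionaryDir(home, "jwktl_SOMEVERSION_ruSOMEDATE");
        System.out.println(dir + " usable: " + isUsableDirectory(dir));
    }
}
```

If the check reports that the folder is missing or empty, the bean's path is the first thing to correct.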

That should at least address this error:

 Error creating bean with name 'wiktionary-ru' defined in URL [file:/.../DKPRO/de.tudarmstadt.ukp.dkpro.lexsemresource.core.ResourceFactory/resources.xml]: