hltfbk / Excitement-Open-Platform

Excitement Open Platform for Recognizing Textual Entailments
http://hltfbk.github.io/Excitement-Open-Platform/

Treetagger installation issue #507

Closed bakfire07 closed 9 years ago

bakfire07 commented 9 years ago

After following the steps to install TreeTagger for EOP, models that require parsing still do not work. I did everything described on the site (build -> accept license -> match the version -> rebuild). Even though the build was successful, I still get errors when testing the models. I was trying to run the following:

java -Djava.ext.dirs=../EOP-1.2.0/ eu.excitementproject.eop.util.runner.EOPRunner -config ./eop-resources-1.2.0/configuration-files/MaxEntClassificationEDA_Base+VO+TP+TPPos+TS_EN.xml -test -testFile $ExcitmentTest_File -output ./eop-resources-1.2.0/results/

I am pasting the error thrown while executing the EOP at the bottom.

====My system info =======

Apache Ant(TM) version 1.9.3 compiled on April 8 2014
Apache Maven 3.0.5
Java version: 1.7.0_65, vendor: Oracle Corporation
Java home: /usr/lib/jvm/java-7-openjdk-amd64/jre
Default locale: en_US, platform encoding: UTF-8
OS name: "linux", version: "3.13.0-40-generic", arch: "amd64", family: "unix"

=====ERROR INFO=====
15/02/02 16:23:27 INFO runner.EOPRunner: running the EOP
15/02/02 16:23:27 INFO runner.EOPRunner: Configuration file: ./eop-resources-1.2.0/configuration-files/MaxEntClassificationEDA_Base+VO+TP+TPPos+TS_EN.xml
15/02/02 16:23:27 INFO runner.EOPRunner: Initializing EDA from file /home/bakfire07/Excitement-Open-Platform-1.2.0/target/EOP-1.2.0/./eop-resources-1.2.0/configuration-files/MaxEntClassificationEDA_Base+VO+TP+TPPos+TS_EN.xml
15/02/02 16:23:27 INFO runner.ConfigFileUtils:getAttribute: EDA class name from config file: eu.excitementproject.eop.core.MaxEntClassificationEDA
15/02/02 16:23:27 INFO runner.EOPRunner: EDA object created from class class eu.excitementproject.eop.core.MaxEntClassificationEDA
15/02/02 16:23:27 INFO runner.ConfigFileUtils:getAttribute: Looking for a value for attribute: language
15/02/02 16:23:27 INFO runner.ConfigFileUtils:getAttribute: Value for attribute language : EN
15/02/02 16:23:27 INFO runner.ConfigFileUtils:getAttribute: Looking for a value for attribute: language
15/02/02 16:23:27 INFO runner.ConfigFileUtils:getAttribute: Value for attribute language : EN
15/02/02 16:23:27 INFO runner.ConfigFileUtils:getAttribute: Looking for a value for attribute: activatedLAP
15/02/02 16:23:27 INFO runner.ConfigFileUtils:getAttribute: Value for attribute activatedLAP : eu.excitementproject.eop.lap.dkpro.MaltParserEN
15/02/02 16:23:27 INFO runner.LAPRunner: LAP initialized from class eu.excitementproject.eop.lap.dkpro.MaltParserEN
15/02/02 16:23:28 INFO runner.ConfigFileUtils:getAttribute: Looking for a value for attribute: testDir
15/02/02 16:23:28 INFO runner.ConfigFileUtils:getAttribute: Value for attribute testDir : /tmp/EN/test/
15/02/02 16:23:28 INFO runner.EOPRunner: testing file: /home/bakfire07/workspace/Excitment/src/hyp-textPair.xml testing dir: /tmp/EN/test/
15/02/02 16:23:28 INFO runner.LAPRunner: Running lap on file: /home/bakfire07/workspace/Excitment/src/hyp-textPair.xml // writing output to directory /tmp/EN/test/
15/02/02 16:23:29 INFO opennlp.OpenNlpSegmenter$1: Producing resource from jar:file:/home/bakfire07/Excitement-Open-Platform-1.2.0/target/EOP-1.2.0/de.tudarmstadt.ukp.dkpro.core.opennlp-model-sentence-en-maxent-20120616.0.jar!/de/tudarmstadt/ukp/dkpro/core/opennlp/lib/sentence-en-maxent.bin
15/02/02 16:23:29 INFO opennlp.OpenNlpSegmenter$2: Producing resource from jar:file:/home/bakfire07/Excitement-Open-Platform-1.2.0/target/EOP-1.2.0/de.tudarmstadt.ukp.dkpro.core.opennlp-model-token-en-maxent-20120616.0.jar!/de/tudarmstadt/ukp/dkpro/core/opennlp/lib/token-en-maxent.bin
Feb 02, 2015 4:23:29 PM org.apache.uima.analysis_engine.impl.PrimitiveAnalysisEngine_impl callAnalysisComponentProcess(407)
SEVERE: Exception occurred
org.apache.uima.analysis_engine.AnalysisEngineProcessException
    at de.tudarmstadt.ukp.dkpro.core.treetagger.TreeTaggerPosLemmaTT4J.process(TreeTaggerPosLemmaTT4J.java:206)
    at org.apache.uima.analysis_component.CasAnnotator_ImplBase.process(CasAnnotator_ImplBase.java:56)
    at org.apache.uima.analysis_engine.impl.PrimitiveAnalysisEngine_impl.callAnalysisComponentProcess(PrimitiveAnalysisEngine_impl.java:375)
    at org.apache.uima.analysis_engine.impl.PrimitiveAnalysisEngine_impl.processAndOutputNewCASes(PrimitiveAnalysisEngine_impl.java:296)
    at org.apache.uima.analysis_engine.asb.impl.ASB_impl$AggregateCasIterator.processUntilNextOutputCas(ASB_impl.java:567)
    at org.apache.uima.analysis_engine.asb.impl.ASB_impl$AggregateCasIterator.(ASB_impl.java:409)
    at org.apache.uima.analysis_engine.asb.impl.ASB_impl.process(ASB_impl.java:342)
    at org.apache.uima.analysis_engine.impl.AggregateAnalysisEngine_impl.processAndOutputNewCASes(AggregateAnalysisEngine_impl.java:267)
    at org.apache.uima.analysis_engine.impl.AnalysisEngineImplBase.process(AnalysisEngineImplBase.java:267)
    at org.apache.uima.analysis_engine.impl.AnalysisEngineImplBase.process(AnalysisEngineImplBase.java:280)
    at eu.excitementproject.eop.lap.implbase.LAP_ImplBaseAE.addAnnotationOn(LAP_ImplBaseAE.java:167)
    at eu.excitementproject.eop.lap.implbase.LAP_ImplBase.processRawInputFormat(LAP_ImplBase.java:142)
    at eu.excitementproject.eop.util.runner.LAPRunner.runLAPOnFile(LAPRunner.java:194)
    at eu.excitementproject.eop.util.runner.EOPRunner.run(EOPRunner.java:357)
    at eu.excitementproject.eop.util.runner.EOPRunner.main(EOPRunner.java:394)
Caused by: java.io.IOException: Unable to locate model [en] in the following locations [classpath:/de/tudarmstadt/ukp/dkpro/core/treetagger/lib/tagger-en-little-endian.par]. Make sure the environment variable 'TREETAGGER_HOME' or 'TAGDIR' or the system property 'treetagger.home' point to the TreeTagger installation directory.
    at de.tudarmstadt.ukp.dkpro.core.treetagger.TreeTaggerTT4JBase$DKProModelResolver.getModel(TreeTaggerTT4JBase.java:378)
    at de.tudarmstadt.ukp.dkpro.core.treetagger.TreeTaggerTT4JBase$DKProModelResolver.getModel(TreeTaggerTT4JBase.java:290)
    at org.annolab.tt4j.TreeTaggerWrapper.setModel(TreeTaggerWrapper.java:471)
    at de.tudarmstadt.ukp.dkpro.core.treetagger.TreeTaggerPosLemmaTT4J.process(TreeTaggerPosLemmaTT4J.java:148)
    ... 14 more

Feb 02, 2015 4:23:29 PM org.apache.uima.analysis_engine.impl.AggregateAnalysisEngine_impl processAndOutputNewCASes(275)
SEVERE: Exception occurred
org.apache.uima.analysis_engine.AnalysisEngineProcessException
    at de.tudarmstadt.ukp.dkpro.core.treetagger.TreeTaggerPosLemmaTT4J.process(TreeTaggerPosLemmaTT4J.java:206)
    at org.apache.uima.analysis_component.CasAnnotator_ImplBase.process(CasAnnotator_ImplBase.java:56)
    at org.apache.uima.analysis_engine.impl.PrimitiveAnalysisEngine_impl.callAnalysisComponentProcess(PrimitiveAnalysisEngine_impl.java:375)
    at org.apache.uima.analysis_engine.impl.PrimitiveAnalysisEngine_impl.processAndOutputNewCASes(PrimitiveAnalysisEngine_impl.java:296)
    at org.apache.uima.analysis_engine.asb.impl.ASB_impl$AggregateCasIterator.processUntilNextOutputCas(ASB_impl.java:567)
    at org.apache.uima.analysis_engine.asb.impl.ASB_impl$AggregateCasIterator.(ASB_impl.java:409)
    at org.apache.uima.analysis_engine.asb.impl.ASB_impl.process(ASB_impl.java:342)
    at org.apache.uima.analysis_engine.impl.AggregateAnalysisEngine_impl.processAndOutputNewCASes(AggregateAnalysisEngine_impl.java:267)
    at org.apache.uima.analysis_engine.impl.AnalysisEngineImplBase.process(AnalysisEngineImplBase.java:267)
    at org.apache.uima.analysis_engine.impl.AnalysisEngineImplBase.process(AnalysisEngineImplBase.java:280)
    at eu.excitementproject.eop.lap.implbase.LAP_ImplBaseAE.addAnnotationOn(LAP_ImplBaseAE.java:167)
    at eu.excitementproject.eop.lap.implbase.LAP_ImplBase.processRawInputFormat(LAP_ImplBase.java:142)
    at eu.excitementproject.eop.util.runner.LAPRunner.runLAPOnFile(LAPRunner.java:194)
    at eu.excitementproject.eop.util.runner.EOPRunner.run(EOPRunner.java:357)
    at eu.excitementproject.eop.util.runner.EOPRunner.main(EOPRunner.java:394)
Caused by: java.io.IOException: Unable to locate model [en] in the following locations [classpath:/de/tudarmstadt/ukp/dkpro/core/treetagger/lib/tagger-en-little-endian.par]. Make sure the environment variable 'TREETAGGER_HOME' or 'TAGDIR' or the system property 'treetagger.home' point to the TreeTagger installation directory.
    at de.tudarmstadt.ukp.dkpro.core.treetagger.TreeTaggerTT4JBase$DKProModelResolver.getModel(TreeTaggerTT4JBase.java:378)
    at de.tudarmstadt.ukp.dkpro.core.treetagger.TreeTaggerTT4JBase$DKProModelResolver.getModel(TreeTaggerTT4JBase.java:290)
    at org.annolab.tt4j.TreeTaggerWrapper.setModel(TreeTaggerWrapper.java:471)
    at de.tudarmstadt.ukp.dkpro.core.treetagger.TreeTaggerPosLemmaTT4J.process(TreeTaggerPosLemmaTT4J.java:148)
    ... 14 more

Error running the LAP
eu.excitementproject.eop.lap.LAPException: Underlying AE or AAE reported an exception
    at eu.excitementproject.eop.lap.implbase.LAP_ImplBaseAE.addAnnotationOn(LAP_ImplBaseAE.java:171)
    at eu.excitementproject.eop.lap.implbase.LAP_ImplBase.processRawInputFormat(LAP_ImplBase.java:142)
    at eu.excitementproject.eop.util.runner.LAPRunner.runLAPOnFile(LAPRunner.java:194)
    at eu.excitementproject.eop.util.runner.EOPRunner.run(EOPRunner.java:357)
    at eu.excitementproject.eop.util.runner.EOPRunner.main(EOPRunner.java:394)
Caused by: org.apache.uima.analysis_engine.AnalysisEngineProcessException
    at de.tudarmstadt.ukp.dkpro.core.treetagger.TreeTaggerPosLemmaTT4J.process(TreeTaggerPosLemmaTT4J.java:206)
    at org.apache.uima.analysis_component.CasAnnotator_ImplBase.process(CasAnnotator_ImplBase.java:56)
    at org.apache.uima.analysis_engine.impl.PrimitiveAnalysisEngine_impl.callAnalysisComponentProcess(PrimitiveAnalysisEngine_impl.java:375)
    at org.apache.uima.analysis_engine.impl.PrimitiveAnalysisEngine_impl.processAndOutputNewCASes(PrimitiveAnalysisEngine_impl.java:296)
    at org.apache.uima.analysis_engine.asb.impl.ASB_impl$AggregateCasIterator.processUntilNextOutputCas(ASB_impl.java:567)
    at org.apache.uima.analysis_engine.asb.impl.ASB_impl$AggregateCasIterator.(ASB_impl.java:409)
    at org.apache.uima.analysis_engine.asb.impl.ASB_impl.process(ASB_impl.java:342)
    at org.apache.uima.analysis_engine.impl.AggregateAnalysisEngine_impl.processAndOutputNewCASes(AggregateAnalysisEngine_impl.java:267)
    at org.apache.uima.analysis_engine.impl.AnalysisEngineImplBase.process(AnalysisEngineImplBase.java:267)
    at org.apache.uima.analysis_engine.impl.AnalysisEngineImplBase.process(AnalysisEngineImplBase.java:280)
    at eu.excitementproject.eop.lap.implbase.LAP_ImplBaseAE.addAnnotationOn(LAP_ImplBaseAE.java:167)
    ... 4 more
Caused by: java.io.IOException: Unable to locate model [en] in the following locations [classpath:/de/tudarmstadt/ukp/dkpro/core/treetagger/lib/tagger-en-little-endian.par]. Make sure the environment variable 'TREETAGGER_HOME' or 'TAGDIR' or the system property 'treetagger.home' point to the TreeTagger installation directory.
    at de.tudarmstadt.ukp.dkpro.core.treetagger.TreeTaggerTT4JBase$DKProModelResolver.getModel(TreeTaggerTT4JBase.java:378)
    at de.tudarmstadt.ukp.dkpro.core.treetagger.TreeTaggerTT4JBase$DKProModelResolver.getModel(TreeTaggerTT4JBase.java:290)
    at org.annolab.tt4j.TreeTaggerWrapper.setModel(TreeTaggerWrapper.java:471)
    at de.tudarmstadt.ukp.dkpro.core.treetagger.TreeTaggerPosLemmaTT4J.process(TreeTaggerPosLemmaTT4J.java:148)
    ... 14 more
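
The final "Caused by" names two environment variables, TREETAGGER_HOME and TAGDIR, plus the system property treetagger.home. A minimal sketch for checking the two environment variables before launching EOPRunner might look like the following. The function name check_treetagger_env is hypothetical, and the order in which the variables are reported is an assumption, since the message does not say which one takes precedence.

```python
import os
from pathlib import Path

# Variable names come from the error message; the reporting order is an assumption.
CANDIDATE_VARS = ("TREETAGGER_HOME", "TAGDIR")


def check_treetagger_env(env=None):
    """Return (variable, value, is_directory) for each candidate variable that is set."""
    env = os.environ if env is None else env
    report = []
    for name in CANDIDATE_VARS:
        value = env.get(name)
        if value:
            # Only checks that the path is an existing directory, not its contents.
            report.append((name, value, Path(value).is_dir()))
    return report


if __name__ == "__main__":
    found = check_treetagger_env()
    if not found:
        print("Neither TREETAGGER_HOME nor TAGDIR is set")
    for name, value, ok in found:
        status = "directory exists" if ok else "not a directory"
        print(f"{name}={value} ({status})")
```

The system property is not visible from outside the JVM; it would have to be passed on the java command line with the standard -D flag, for example -Dtreetagger.home=/path/to/treetagger (the path here is a placeholder).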

ghost commented 9 years ago

Hello,

I have a similar problem. I think it's a duplicate of #368, although no solution is provided there. I have not tried running TreeTaggerEnTest. How do you run this without setting up Eclipse? I do have the file called de.tudarmstadt.ukp.dkpro.core.treetagger-model-en-20111109.0.jar somewhere in ~/.m2, and it contains tagger-en-little-endian.par.

Thanks, Tom
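
One way to confirm which jars in the local Maven repository actually contain the model named in the error is to scan them directly. This is only a sketch: find_model_jars is a hypothetical helper, the entry path is copied from the exception message, and ~/.m2/repository is assumed to be the repository location (the Maven default).

```python
import zipfile
from pathlib import Path

# Entry path copied from the "Unable to locate model" message (no leading slash inside a jar).
MODEL_ENTRY = "de/tudarmstadt/ukp/dkpro/core/treetagger/lib/tagger-en-little-endian.par"


def find_model_jars(repo_root, entry=MODEL_ENTRY):
    """Return the jars under repo_root that contain the given entry."""
    hits = []
    for jar in sorted(Path(repo_root).rglob("*.jar")):
        try:
            with zipfile.ZipFile(jar) as zf:
                if entry in zf.namelist():
                    hits.append(jar)
        except (zipfile.BadZipFile, OSError):
            # Skip unreadable or corrupt archives instead of aborting the scan.
            continue
    return hits


if __name__ == "__main__":
    # Assumption: the default local repository location.
    for jar in find_model_jars(Path.home() / ".m2" / "repository"):
        print(jar)
```

If the jar is found here but the model still cannot be located at run time, the jar is probably not on the classpath that EOPRunner is started with.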

rzanoli commented 9 years ago

It seems that more and more people run into this problem when they install TreeTagger with the build.xml file that we distribute. We don't know whether it depends on their environment or not (other people use TreeTagger with the EOP without any issues), but we think we need to look into this as soon as possible, given that TreeTagger is also required when you want to use external resources. I would propose to have one or two people work on it. They should try to reproduce the issue and solve it, asking for help if needed from our friends working on DKPro, who use that script or an updated version of it for installing TreeTagger. Are there volunteers who would like to contribute to the EOP by taking care of this issue?

reckart commented 9 years ago

So the usual problems are:

P: required JAR missing in the local maven repository (typically residing under ~/.m2)
S: run build.xml with the "local-maven" target or use a private repository that contains these artifacts (sorry, TreeTagger license prohibits redistribution)

P: required dependency missing in pom.xml
S: Add dependency on the artifact containing the desired model, do not forget adding a dependency on the artifact containing the treetagger binary.

P: build.xml does not run anymore due to models missing or having been updated
S: fix build.xml, remove outdated models, update checksum for updated models.

P: dependency in pom.xml has wrong version (in particular after fixing the build.xml and properly updating it)
S: update the version in the pom.xml
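
The first and last problems above (JAR missing from the local repository, wrong version in pom.xml) can be checked together. The sketch below is an illustration only: missing_treetagger_jars and expected_jar are hypothetical helpers, the pom.xml is assumed to declare the standard Maven POM namespace, and dependencies whose version is absent or given as a ${...} property are skipped rather than resolved. Classifiers are ignored.

```python
import xml.etree.ElementTree as ET
from pathlib import Path

# Standard namespace of Maven 4.0.0 POM files (assumed to be declared in the pom.xml).
POM_NS = {"m": "http://maven.apache.org/POM/4.0.0"}


def expected_jar(repo_root, group_id, artifact_id, version):
    """Path of an artifact in the standard local-repository layout."""
    return Path(repo_root, *group_id.split("."), artifact_id, version,
                f"{artifact_id}-{version}.jar")


def missing_treetagger_jars(pom_path, repo_root):
    """List TreeTagger-related dependencies whose jar is absent from repo_root."""
    root = ET.parse(pom_path).getroot()
    missing = []
    for dep in root.iterfind(".//m:dependency", POM_NS):
        group_id = dep.findtext("m:groupId", "", POM_NS).strip()
        artifact_id = dep.findtext("m:artifactId", "", POM_NS).strip()
        version = dep.findtext("m:version", "", POM_NS).strip()
        if "treetagger" not in artifact_id:
            continue
        if not version or "${" in version:
            continue  # managed or property-based versions are not resolved here
        jar = expected_jar(repo_root, group_id, artifact_id, version)
        if not jar.is_file():
            missing.append((group_id, artifact_id, version))
    return missing


if __name__ == "__main__":
    for gav in missing_treetagger_jars("pom.xml", Path.home() / ".m2" / "repository"):
        print("missing:", ":".join(gav))
```

A dependency reported as missing here points either at the first problem (run the "local-maven" target) or at the last one (the version in pom.xml no longer matches what build.xml installs).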

Btw, in more recent versions of DKPro Core, the support for explicitly specifying model and binary paths was improved. Cf.: https://code.google.com/p/dkpro-core-asl/wiki/GroovyTreeTaggerPosTagNoReaderAccessDirect

rzanoli commented 9 years ago

In your view, should we update the build.xml file (2010) that we are distributing (it seems to download the needed files from Stuttgart), or should we switch to one of the newer versions of build.xml (2014), which download the files from Muenchen?

Many thanks

reckart commented 9 years ago

As far as I know, you're still using DKPro Core 1.5.0. For this version, I think you should update the DKPro Core 1.5.0 build.xml to the new URLs from München, update the hashes and the versions (and update your pom.xml files accordingly). I actually thought that you had done this in the past.

The packaging of models has changed somewhat in more recent versions of DKPro Core, so you should not try switching to a more recent version of our build.xml files. Always use the build.xml belonging to the version of DKPro Core that you are using. Only update to a newer one if you also update your DKPro Core dependencies.
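
When updating the hashes for models that have changed upstream, the new value has to be computed from the freshly downloaded file. A small sketch, assuming nothing about build.xml beyond what is said above: file_digest is a hypothetical helper, and the algorithm is left as a parameter because this thread does not state which hash the build.xml expects (md5 is only a default chosen for the example).

```python
import hashlib


def file_digest(path, algorithm="md5", chunk_size=1 << 20):
    """Return the hex digest of a file, reading it in chunks."""
    digest = hashlib.new(algorithm)
    with open(path, "rb") as fh:
        # Read in fixed-size chunks so large model files need not fit in memory.
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()
```

Calling file_digest on the downloaded archive and comparing the result with the value recorded in build.xml shows whether the entry is stale.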

rzanoli commented 9 years ago

We do need to move to DKPro 1.5.0, also because we have an issue with it-stein-tagger.map, which is distributed as part of de.tudarmstadt.ukp.dkpro.core.api.lexmorph-asl-1.4.0.jar (it maps every part of speech to POS), but currently we are still using DKPro 1.4.0.

reckart commented 9 years ago

Ok, then you should stick with the build.xml from DKPro Core 1.4.0 and update that.

If you care to update DKPro Core, you should consider switching to the latest release: DKPro Core 1.7.0.

rzanoli commented 9 years ago

Thanks a lot. We'll update the build.xml file so that it is included in the upcoming EOP release. Then we'll try to switch to DKPro 1.7.0.

rzanoli commented 9 years ago

The build.xml file for installing TreeTagger has been updated, and the install.sh script for installing the EOP now lets users install TreeTagger as well, after they have read and agreed to its licence. These changes are available in EOP 1.2.1.