Closed loretoparisi closed 3 years ago
In this configuration, you are using the English annotators. The correct list of annotators is:
annotators=ita_toksent, pos, ita_morpho, ita_lemma, ner
It is odd that you get DATEs, as they are not included in the Italian model (unless you are using an English text and the English model). If you want to use HeidelTime, you should add:
customAnnotatorClass.timex=eu.fbk.dh.tint.heideltime.annotator.HeidelTimeAnnotator
annotators=..., timex
timex.treeTaggerHome=path/to/tagger-scripts
timex.considerDate=true
timex.considerDuration=true
timex.considerSet=true
timex.considerTime=true
timex.typeSystemHome=desc/type/HeidelTime_TypeSystem.xml
timex.typeSystemHome_DKPro=desc/type/DKPro_TypeSystem.xml
timex.uimaVarDate=Date
timex.uimaVarDuration=Duration
timex.uimaVarLanguage=Language
timex.uimaVarSet=Set
timex.uimaVarTime=Time
timex.uimaVarTypeToProcess=Type
timex.uimaVarTemponym = Temponym
timex.considerTemponym = false
timex.chineseTokenizerPath=
where path/to/tagger-scripts is the path where you installed TreeTagger. You must keep the last line, timex.chineseTokenizerPath, even if it is blank; otherwise HeidelTime crashes.
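For anyone who builds the pipeline properties in code rather than in a file, the HeidelTime settings listed above can be sketched as follows. This is only an illustration based on `java.util.Properties`: the class `HeidelTimeConfigSketch` and its method are hypothetical helpers, not part of Tint, and the base annotator list and TreeTagger path are supplied by the caller. It also shows that a key with an empty value survives loading as an empty string, so the blank `timex.chineseTokenizerPath` entry is kept rather than dropped.

```java
import java.io.IOException;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.Properties;

public class HeidelTimeConfigSketch {

    // Hypothetical helper: returns the timex-related properties quoted in this thread,
    // appending "timex" to whatever annotator list the caller already uses.
    public static Properties withHeidelTime(String baseAnnotators, String treeTaggerHome) {
        String config = String.join("\n",
            "customAnnotatorClass.timex=eu.fbk.dh.tint.heideltime.annotator.HeidelTimeAnnotator",
            "timex.considerDate=true",
            "timex.considerDuration=true",
            "timex.considerSet=true",
            "timex.considerTime=true",
            "timex.typeSystemHome=desc/type/HeidelTime_TypeSystem.xml",
            "timex.typeSystemHome_DKPro=desc/type/DKPro_TypeSystem.xml",
            "timex.uimaVarDate=Date",
            "timex.uimaVarDuration=Duration",
            "timex.uimaVarLanguage=Language",
            "timex.uimaVarSet=Set",
            "timex.uimaVarTime=Time",
            "timex.uimaVarTypeToProcess=Type",
            "timex.uimaVarTemponym=Temponym",
            "timex.considerTemponym=false",
            // Must stay in the configuration even though it is blank.
            "timex.chineseTokenizerPath=");
        Properties props = new Properties();
        try {
            props.load(new StringReader(config));
        } catch (IOException e) {
            // Not expected when reading from an in-memory string.
            throw new UncheckedIOException(e);
        }
        props.setProperty("annotators", baseAnnotators + ", timex");
        props.setProperty("timex.treeTaggerHome", treeTaggerHome);
        return props;
    }

    public static void main(String[] args) {
        Properties props = withHeidelTime(
            "ita_toksent, pos, ita_morpho, ita_lemma, ner", "path/to/tagger-scripts");
        System.out.println("annotators = " + props.getProperty("annotators"));
        System.out.println("chineseTokenizerPath present: "
            + props.containsKey("timex.chineseTokenizerPath"));
    }
}
```

The resulting `Properties` object would then be passed to whatever constructs the CoreNLP pipeline; that step is left out here because it needs the Tint and CoreNLP jars on the classpath.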
To run the geocoder, you should have a local installation of Nominatim, or you can use the public one. The configuration is:
customAnnotatorClass.geoloc=eu.fbk.dh.tint.geoloc.annotator.GeolocAnnotator
annotators=..., geoloc
geoloc.geocoder_url=/path/to/nominatim
where /path/to/nominatim is the URL of Nominatim. By default, the GeolocAnnotator uses the public Nominatim instance, which is slow and limited. If you use a local installation, you can add a boolean geoloc.use_local_geocoder setting to skip the timeout. You can also set a geoloc.timeout option (in milliseconds), which takes effect only when geoloc.use_local_geocoder is enabled (otherwise the timeout is 1 second).
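The geocoder options described above can be put together as in the sketch below. It is a minimal illustration, not Tint code: `GeolocConfigSketch` is a hypothetical helper, the localhost URL in `main` is a made-up example of a local Nominatim address, and writing the boolean as the string "true" is an assumption about how the setting is parsed.

```java
import java.util.Properties;

public class GeolocConfigSketch {

    // Hypothetical helper: adds the geoloc annotator settings to a caller-supplied annotator list.
    public static Properties geolocProperties(String baseAnnotators, String nominatimUrl,
                                              boolean localGeocoder, int timeoutMillis) {
        Properties props = new Properties();
        props.setProperty("customAnnotatorClass.geoloc",
            "eu.fbk.dh.tint.geoloc.annotator.GeolocAnnotator");
        props.setProperty("annotators", baseAnnotators + ", geoloc");
        props.setProperty("geoloc.geocoder_url", nominatimUrl);
        if (localGeocoder) {
            // Assumption: the boolean is written as "true".
            props.setProperty("geoloc.use_local_geocoder", "true");
            // Per the thread, the timeout is honored only with a local geocoder.
            props.setProperty("geoloc.timeout", Integer.toString(timeoutMillis));
        }
        return props;
    }

    public static void main(String[] args) {
        // Hypothetical local Nominatim URL, used only for illustration.
        Properties props = geolocProperties(
            "ita_toksent, pos, ita_morpho, ita_lemma, ner",
            "http://localhost:8080/nominatim", true, 5000);
        props.forEach((k, v) -> System.out.println(k + " = " + v));
    }
}
```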
If you launch Tint using the included runner, all the customAnnotatorClass entries are already set up correctly.
@ziorufus Thank you. Is TreeTagger necessary for the Italian tagger? I'm using CoreNLP with the default models and with per-language models such as fr, zh, de, es, ar (custom models in the jar files for each language).
If I use "ita_toksent,ita_lemma,ita_morpho,ssplit,pos,ner" as annotators, the JVM complains that the class PropertiesUtils is missing:
15:02:51.158 [main] INFO e.s.nlp.pipeline.StanfordCoreNLP - Registering annotator ita_toksent with class eu.fbk.dh.tint.tokenizer.annotators.ItalianTokenizerAnnotator
15:02:51.161 [main] INFO e.s.nlp.pipeline.StanfordCoreNLP - Registering annotator timex with class eu.fbk.dh.tint.heideltime.annotator.HeidelTimeAnnotator
15:02:51.161 [main] INFO e.s.nlp.pipeline.StanfordCoreNLP - Registering annotator ita_morpho with class eu.fbk.dh.tint.digimorph.annotator.DigiMorphAnnotator
15:02:51.161 [main] INFO e.s.nlp.pipeline.StanfordCoreNLP - Registering annotator ita_lemma with class eu.fbk.dh.tint.digimorph.annotator.DigiLemmaAnnotator
15:02:51.162 [main] INFO e.s.nlp.pipeline.StanfordCoreNLP - Adding annotator ita_toksent
{ Error: Error creating class
edu.stanford.nlp.util.MetaClass$ClassCreationException: MetaClass couldn't create public eu.fbk.dh.tint.tokenizer.annotators.ItalianTokenizerAnnotator(java.lang.String,java.util.Properties) with args [ita_toksent, {tokenize.language=de, ssplit.newlineIsSentenceBreak=always, lang=it, annotators=ita_toksent,ita_lemma,ita_morpho,ssplit,pos,ner, depparse.model=/root/parser-model-1.txt.gz, customAnnotatorClass.ita_toksent=eu.fbk.dh.tint.tokenizer.annotators.ItalianTokenizerAnnotator, customAnnotatorClass.timex=eu.fbk.dh.tint.heideltime.annotator.HeidelTimeAnnotator, pos.model=/root/italian-fast.tagger, parse.model=edu/stanford/nlp/models/srparser/germanSR.ser.gz, ner.useSUTime=0, customAnnotatorClass.ita_morpho=eu.fbk.dh.tint.digimorph.annotator.DigiMorphAnnotator, customAnnotatorClass.ita_lemma=eu.fbk.dh.tint.digimorph.annotator.DigiLemmaAnnotator, DATAROOT=/Users/loretoparisi/Dropbox (musixmatch)/Development/data/data/stanford, ner.model=/root/ner-ita-nogpe-noiob_gaz_wikipedia_sloppy.ser.gz}]
at edu.stanford.nlp.util.MetaClass$ClassFactory.createInstance(MetaClass.java:237)
at edu.stanford.nlp.util.MetaClass.createInstance(MetaClass.java:382)
at edu.stanford.nlp.pipeline.AnnotatorImplementations.custom(AnnotatorImplementations.java:143)
at edu.stanford.nlp.pipeline.StanfordCoreNLP.lambda$registerCustomAnnotators$66(StanfordCoreNLP.java:556)
at edu.stanford.nlp.util.Lazy$3.compute(Lazy.java:118)
at edu.stanford.nlp.util.Lazy.get(Lazy.java:31)
at edu.stanford.nlp.pipeline.AnnotatorPool.get(AnnotatorPool.java:146)
at edu.stanford.nlp.pipeline.StanfordCoreNLP.construct(StanfordCoreNLP.java:447)
at edu.stanford.nlp.pipeline.StanfordCoreNLP.<init>(StanfordCoreNLP.java:150)
at edu.stanford.nlp.pipeline.StanfordCoreNLP.<init>(StanfordCoreNLP.java:146)
at edu.stanford.nlp.pipeline.StanfordCoreNLP.<init>(StanfordCoreNLP.java:133)
at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:62)
at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
at java.lang.reflect.Constructor.newInstance(Constructor.java:423)
Caused by: java.lang.reflect.InvocationTargetException
at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:62)
at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
at java.lang.reflect.Constructor.newInstance(Constructor.java:423)
at edu.stanford.nlp.util.MetaClass$ClassFactory.createInstance(MetaClass.java:233)
... 14 more
Caused by: java.lang.NoClassDefFoundError: eu/fbk/utils/core/PropertiesUtils
at eu.fbk.dh.tint.tokenizer.annotators.ItalianTokenizerAnnotator.<init>(ItalianTokenizerAnnotator.java:29)
... 19 more
Caused by: java.lang.ClassNotFoundException: eu.fbk.utils.core.PropertiesUtils
at java.net.URLClassLoader.findClass(URLClassLoader.java:381)
at java.lang.ClassLoader.loadClass(ClassLoader.java:424)
at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:331)
at java.lang.ClassLoader.loadClass(ClassLoader.java:357)
... 20 more
thank you
TreeTagger is only needed for the HeidelTime annotator. Tint uses the original CoreNLP tagger. Regarding your error, are you using Maven to include tint-tokenizer?
@ziorufus Ok got it! ...Nope, I'm including the jar manually in my project. Is that a sub-project? Thank you.
Which JAR are you including? You need to include the jar-with-dependencies.
So far I have included only:
-rw-r--r-- 1 loretoparisi staff 3534201 17 Lug 12:54 /root/tint-digimorph-0.1.jar
-rw-r--r-- 1 loretoparisi staff 10333 17 Lug 12:47 /root/tint-digimorph-annotator-0.1.jar
-rw-r--r-- 1 loretoparisi staff 8071 17 Lug 15:02 /root/tint-heideltime-annotator-0.1.jar
-rw-r--r-- 1 loretoparisi staff 24003 17 Lug 11:39 /root/tint-tokenizer-0.1.jar
Ah OK, I see that it is part of the DKM package. Funny thing: I cannot find eu.fbk.utils.core in eu.fbk.utils.
I mean this one: https://mvnrepository.com/artifact/eu.fbk.dkm.utils/utils/1.2
You need to include all the dependencies (recursively). You can find them in the pom.xml file, but I suggest you use Maven; otherwise you need to add tens of dependencies by hand.
Thank you. I'm actually using Maven to build the project:
[INFO] Reactor Summary:
[INFO]
[INFO] tint ............................................... SUCCESS [ 1.581 s]
[INFO] tint-textpro ....................................... SUCCESS [ 0.477 s]
[INFO] tint-eval .......................................... SUCCESS [ 0.040 s]
[INFO] tint-resources ..................................... SUCCESS [ 0.102 s]
[INFO] tint-digimorph ..................................... SUCCESS [ 0.123 s]
[INFO] tint-digimorph-annotator ........................... SUCCESS [ 0.028 s]
[INFO] tint-tokenizer ..................................... SUCCESS [ 0.031 s]
[INFO] tint-tense ......................................... SUCCESS [ 0.021 s]
[INFO] tint-readability ................................... SUCCESS [ 0.041 s]
[INFO] tint-geoloc-annotator .............................. SUCCESS [ 0.018 s]
[INFO] tint-heideltime-annotator .......................... SUCCESS [ 0.399 s]
[INFO] tint-models ........................................ SUCCESS [ 0.012 s]
[INFO] tint-runner ........................................ SUCCESS [ 0.925 s]
[INFO] tint-kd-annotator .................................. SUCCESS [ 0.016 s]
[INFO] ------------------------------------------------------------------------
[INFO] BUILD SUCCESS
so I get these target jars
[loretoparisi@:mbploreto tint]$ find ./ -name \*.jar
.//target/tint-0.1-tests.jar
.//tint-digimorph/target/tint-digimorph-0.1.jar
.//tint-digimorph-annotator/target/tint-digimorph-annotator-0.1.jar
.//tint-eval/target/tint-eval-0.1.jar
.//tint-geoloc-annotator/target/tint-geoloc-annotator-0.1.jar
.//tint-heideltime-annotator/target/tint-heideltime-annotator-0.1-tests.jar
.//tint-heideltime-annotator/target/tint-heideltime-annotator-0.1.jar
.//tint-kd-annotator/target/tint-kd-annotator-0.1.jar
.//tint-models/target/tint-models-0.1.jar
.//tint-readability/target/tint-readability-0.1.jar
.//tint-resources/target/tint-resources-0.1.jar
.//tint-runner/target/tint-runner-0.1-tests.jar
.//tint-runner/target/tint-runner-0.1.jar
.//tint-tense/target/tint-tense-0.1.jar
.//tint-textpro/target/tint-textpro-0.1.jar
.//tint-tokenizer/target/tint-tokenizer-0.1-tests.jar
.//tint-tokenizer/target/tint-tokenizer-0.1.jar
I prefer to take the generated jars one by one and put them on my classpath. The issue here is that I cannot find that util among the Maven-generated dependencies (mvn package / install).
Run mvn dependency:tree to print the list of dependencies recursively. As a suggestion, use the corenlp370 branch of Tint, so that you have the latest version. In that case you will have to fix some dependencies, so you should mvn install utils and fcw before compiling Tint.
Anyway, if you include Tint in an existing Java project, I suggest you use Maven for both and declare Tint in the pom.xml file. If you need to run Tint from the shell, just run mvn package -Prelease and uncompress the ready-to-use tar.gz archive that you can find in the tint-runner/target folder.
@ziorufus Yes, that is the best solution. I now realize that there are too many dependencies in the ~/.m2/repository/ folder to copy... Thank you!
@ziorufus So I checked out corenlp370 and then ran mvn package -Prelease, but I get an error:
[ERROR] Failed to execute goal org.apache.maven.plugins:maven-compiler-plugin:3.3:compile (default-compile) on project tint-runner: Compilation failure: Compilation failure:
[ERROR] /tint/tint-runner/src/main/java/eu/fbk/dh/tint/runner/TintPipeline.java:[6,39] package eu.fbk.utils.corenlp.outputters does not exist
[ERROR] /tint/tint-runner/src/main/java/eu/fbk/dh/tint/runner/TintPipeline.java:[147,44] package eu.fbk.utils.corenlp.outputters does not exist
[ERROR] /tint/tint-runner/src/main/java/eu/fbk/dh/tint/runner/TintPipeline.java:[150,13] cannot find symbol
[ERROR] symbol: variable TextProOutputter
[ERROR] location: class eu.fbk.dh.tint.runner.TintPipeline
[ERROR] -> [Help 1]
[ERROR]
[ERROR] To see the full stack trace of the errors, re-run Maven with the -e switch.
[ERROR] Re-run Maven using the -X switch to enable full debug logging.
[ERROR]
[ERROR] For more information about the errors and possible solutions, please read the following articles:
[ERROR] [Help 1] http://cwiki.apache.org/confluence/display/MAVEN/MojoFailureException
[ERROR]
[ERROR] After correcting the problems, you can resume the build with the command
[ERROR] mvn <goals> -rf :tint-runner
The utils package compiles and builds, while for the fcw dependency I get:
[ERROR] Failed to execute goal org.apache.maven.plugins:maven-compiler-plugin:3.3:compile (default-compile) on project fcw-wikipedia: Compilation failure
[ERROR] /tint/fcw/fcw-wikipedia/src/main/java/eu/fbk/fcw/wikipedia/WikipediaCorefAnnotator.java:[143,41] cannot find symbol
[ERROR] symbol: class SimpleCorefAnnotation
[ERROR] location: class eu.fbk.utils.corenlp.CustomAnnotations
[ERROR]
NOTE: I can compile and build the version on the master branch without any issues.
You are right: before installing utils you need to switch to the develop branch.
@ziorufus Hi! I'm not sure about the steps to build tint, utils and fcw with the latest support for CoreNLP 3.7.0. Could you please guide me through them?
Thank you.
@ziorufus Maybe a simpler solution would be to provide the release builds (i.e. packaged with all the dependencies) directly. Thank you.
@ziorufus Hi, a question about the above configuration for ner only.
Assuming the configuration is annotators=ita_toksent, pos, ita_morpho, ita_lemma, ner and I'm not going to use HeidelTime, do I need only this jar as a dependency?
-rw-r--r-- 1 loretoparisi staff 24003 17 Lug 11:39 /root/tint-tokenizer-0.1.jar
If not, could you please point me to the related Maven dependencies? At the moment I have these jar files:
├── ahocorasick-0.3.0.jar
├── tint-digimorph-0.1.jar
├── tint-digimorph-annotator-0.1.jar
├── tint-heideltime-annotator-0.1.jar
├── tint-tokenizer-0.1.jar
└── utils-core-3.0.jar
and I would like to keep only the ones needed.
Thank you for your help!!!
@loretoparisi The problem is that each dependency has its own dependencies. If you use Maven, the dependency tree is built and managed automatically; if you want to include the jars by hand, you need to resolve the tree yourself and add everything.
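A quick way to see which parts of the tree are still missing from a hand-assembled classpath is to probe for the classes named in the stack traces earlier in this thread. The following is a minimal sketch that relies only on `Class.forName`; `ClasspathCheckSketch` and its method are hypothetical helpers, not part of Tint or CoreNLP, and the probed class names are simply the ones reported in the errors above. It only detects the classes you list, so it does not replace resolving the full tree with Maven.

```java
import java.util.ArrayList;
import java.util.List;

public class ClasspathCheckSketch {

    // Returns the subset of the given class names that cannot be loaded from the current classpath.
    public static List<String> missingClasses(String... classNames) {
        List<String> missing = new ArrayList<>();
        ClassLoader loader = ClasspathCheckSketch.class.getClassLoader();
        for (String name : classNames) {
            try {
                // initialize=false: only locate the class, do not run its static initializers.
                Class.forName(name, false, loader);
            } catch (ClassNotFoundException | LinkageError e) {
                missing.add(name);
            }
        }
        return missing;
    }

    public static void main(String[] args) {
        // Class names taken from the error messages quoted in this thread.
        List<String> missing = missingClasses(
            "eu.fbk.dh.tint.tokenizer.annotators.ItalianTokenizerAnnotator",
            "eu.fbk.utils.core.PropertiesUtils");
        if (missing.isEmpty()) {
            System.out.println("All probed classes are on the classpath.");
        } else {
            missing.forEach(name -> System.out.println("Missing: " + name));
        }
    }
}
```

Running it with the same classpath as the application shows which jars still need to be added before the pipeline is constructed.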
Hello @ziorufus
Sorry to write in this thread, but I have a related problem getting Stanford CoreNLP 3.9.1 to work with the Italian models from Tint. I have the following properties configuration file:
annotators = ita_toksent, ner
tokenize.language = it
ssplit.newlineIsSentenceBreak = false
pos.model = models/italian-fast.tagger
depparse.model = models/parser-model-1.txt.gz
ner.model = models/ner-ita-nogpe-noiob_gaz_wikipedia_sloppy.ser.gz
ner.applyNumericClassifiers = false
ner.useSUTime = false
ner.applyFineGrained = false
customAnnotatorClass.ita_toksent = eu.fbk.dh.tint.tokenizer.annotators.ItalianTokenizerAnnotator
customAnnotatorClass.ita_lemma = eu.fbk.dh.tint.digimorph.annotator.DigiLemmaAnnotator
customAnnotatorClass.ita_morpho = eu.fbk.dh.tint.digimorph.annotator.DigiMorphAnnotator
And the error I get is:
19:47:31.113 INFO Registering annotator ita_toksent with class eu.fbk.dh.tint.tokenizer.annotators.ItalianTokenizerAnnotator
19:47:31.114 INFO Registering annotator ita_morpho with class eu.fbk.dh.tint.digimorph.annotator.DigiMorphAnnotator
19:47:31.114 INFO Registering annotator ita_lemma with class eu.fbk.dh.tint.digimorph.annotator.DigiLemmaAnnotator
19:47:31.118 INFO Adding annotator ita_toksent
19:47:31.221 INFO Loaded 37 normalization rules
19:47:31.224 INFO Loaded 7 sentence splitting rules
19:47:31.225 INFO Loaded 6 token splitting rules
19:47:31.226 INFO Loaded 9 regular expressions
19:47:31.240 INFO Loaded 288 abbreviations
19:47:31.253 INFO Adding annotator ner
19:47:34.229 INFO Loading classifier from models/ner-ita-nogpe-noiob_gaz_wikipedia_sloppy.ser.gz ... done [2.9 sec].
Exception in thread "main" java.lang.IllegalArgumentException: annotator "ner" requires annotation "IsNewlineAnnotation". The usual requirements for this annotator are: tokenize,ssplit,pos,lemma
at edu.stanford.nlp.pipeline.StanfordCoreNLP.construct(StanfordCoreNLP.java:504)
Any ideas?
If I add the pos, ita_morpho and ita_lemma annotators, I get a different error:
Caused by: java.lang.ClassNotFoundException: kotlin.TypeCastException
I've added the Tint Maven dependency as instructed in the documentation:
<dependency>
<groupId>eu.fbk.dh</groupId>
<artifactId>tint-runner</artifactId>
<version>0.2</version>
</dependency>
Thank you!
It seems that CoreNLP 3.9.1 added a new mandatory annotation. It is not documented on the Stanford NLP website, so I had to write to the group. For now I have patched it, hoping that is enough. Just pull the develop branch, recompile using mvn clean install, change the version in your POM from 0.2 to 1.0-SNAPSHOT and try again.
@ziorufus Yes, there is an important migration to do for the annotators and sub-annotators: https://github.com/stanfordnlp/CoreNLP/issues/633#issuecomment-370109959
I know that, but the problem is not with sub-annotators (which Tint does not use), but with a new annotation called IsNewlineAnnotation, which is required by the NER annotator and is not documented anywhere.
Thank you @ziorufus. Using the change you made in the develop branch worked!
Hello @ziorufus ,
I am getting an error while using Tint with CoreNLP 3.9.1:
Exception in thread "main" edu.stanford.nlp.util.MetaClass$ClassCreationException: java.lang.ClassNotFoundException: eu.fbk.dh.tint.tokenizer.annotators.ItalianTokenizerAnnotator
at edu.stanford.nlp.util.MetaClass.createFactory(MetaClass.java:364)
at edu.stanford.nlp.util.MetaClass.createInstance(MetaClass.java:381)
at edu.stanford.nlp.pipeline.AnnotatorImplementations.custom(AnnotatorImplementations.java:141)
at edu.stanford.nlp.pipeline.StanfordCoreNLP.lambda$null$67(StanfordCoreNLP.java:606)
at edu.stanford.nlp.util.Lazy$3.compute(Lazy.java:126)
at edu.stanford.nlp.util.Lazy.get(Lazy.java:31)
at edu.stanford.nlp.pipeline.AnnotatorPool.get(AnnotatorPool.java:149)
at edu.stanford.nlp.pipeline.StanfordCoreNLP.construct(StanfordCoreNLP.java:495)
at edu.stanford.nlp.pipeline.StanfordCoreNLP.<init>(StanfordCoreNLP.java:201)
at edu.stanford.nlp.pipeline.StanfordCoreNLP.<init>(StanfordCoreNLP.java:194)
at edu.stanford.nlp.pipeline.StanfordCoreNLP.<init>(StanfordCoreNLP.java:181)
at
Caused by: java.lang.ClassNotFoundException: eu.fbk.dh.tint.tokenizer.annotators.ItalianTokenizerAnnotator
at java.net.URLClassLoader.findClass(URLClassLoader.java:381)
at java.lang.ClassLoader.loadClass(ClassLoader.java:424)
19:09:48.221 [main] INFO e.s.nlp.pipeline.StanfordCoreNLP - Adding annotator ita_toksent
at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:335)
at java.lang.ClassLoader.loadClass(ClassLoader.java:357)
at java.lang.Class.forName0(Native Method)
at java.lang.Class.forName(Class.java:264)
at edu.stanford.nlp.util.MetaClass$ClassFactory.construct(MetaClass.java:135)
at edu.stanford.nlp.util.MetaClass$ClassFactory.<init>(MetaClass.java:202)
at edu.stanford.nlp.util.MetaClass$ClassFactory.<init>(MetaClass.java:69)
at edu.stanford.nlp.util.MetaClass.createFactory(MetaClass.java:360)
... 11 more
these are my configs
properties.setProperty("annotators", "ita_toksent, pos, ita_morpho, ita_lemma, ner");
properties.setProperty("ner.model", "models/ner-ita-nogpe-noiob_gaz_wikipedia_sloppy.ser.gz");
properties.setProperty("pos.model", "models/italian-fast.tagger");
properties.setProperty("depparse.model", "models/parser-model-1.txt.gz");
properties.setProperty("customAnnotatorClass.ita_toksent", "eu.fbk.dh.tint.tokenizer.annotators.ItalianTokenizerAnnotator");
properties.setProperty("customAnnotatorClass.ita_lemma", "eu.fbk.dh.tint.digimorph.annotator.DigiLemmaAnnotator");
properties.setProperty("customAnnotatorClass.ita_morpho", "eu.fbk.dh.tint.digimorph.annotator.DigiMorphAnnotator");
properties.setProperty("ssplit.newlineIsSentenceBreak", "always");
properties.setProperty("ner.useSUTime", "false");
I am using the Maven dependency for Tint as mentioned in the docs. Any help would be greatly appreciated. Thanks!
Try adding this dependency to the pom.xml file:
<dependency>
<groupId>eu.fbk.dh</groupId>
<artifactId>tint-tokenizer</artifactId>
<version>1.0-SNAPSHOT</version>
<scope>runtime</scope>
</dependency>
Use the develop branch.
Hello @ziorufus,
I have cloned the source code, switched to the develop branch and compiled using mvn clean install. I also changed the version in my POM from 0.2 to 1.0-SNAPSHOT, but I am still facing the issue:
Exception in thread "main" edu.stanford.nlp.util.MetaClass$ClassCreationException: java.lang.ClassNotFoundException: eu.fbk.dh.tint.tokenizer.annotators.ItalianTokenizerAnnotator
Hi @ziorufus, I am trying to use Tint for Italian NER as well, following the configuration you mentioned at the beginning of this thread. I have a couple of questions:
Tint uses HeidelTime standalone because it is hard to integrate it into a flow that does not use UIMA. TreeTagger is required because it uses the correct POS tags. We are working on a custom version of HeidelTime that can be easily integrated with the Tint POS tags, but it is not ready yet.
@ziorufus Thank you! Unfortunately, this makes Tint a little hard to deploy, since it starts a lot of background processes for HeidelTime and brings in a dependency on Perl...
Hi @ziorufus, sorry for disturbing you, I am trying to integrate Tint into a Pepper module and am getting the error resource italian.db not found :
Caused by: edu.stanford.nlp.util.MetaClass$ClassCreationException: MetaClass couldn't create public eu.fbk.dh.tint.digimorph.annotator.DigiMorphAnnotator(java.lang.String,java.util.Properties) with args [ita_morpho, {readability.glossario.use=no, timex.uimaVarLanguage=Language, timex.uimaVarDuration=Duration, timex.chineseTokenizerPath=, ner.useSUTime=0, timex.typeSystemHome_DKPro=desc/type/DKPro_TypeSystem.xml, dbps.first_confidence=0.5, timex.uimaVarTime=Time, timex.uimaVarTypeToProcess=Type, dbps.min_confidence=0.3, customAnnotatorClass.keyphrase=eu.fbk.dh.kd.annotator.DigiKdAnnotator, customAnnotatorClass.fake_dep=eu.fbk.dkm.pikes.depparseannotation.StanfordToConllDepsAnnotator, timex.uimaVarDate=Date, dbps.address=http://spotlight.sztaki.hu:2230/rest, customAnnotatorClass.timex=eu.fbk.dh.tint.heideltime.annotator.HeidelTimeAnnotator, customAnnotatorClass.geoloc=eu.fbk.dh.tint.geoloc.annotator.GeolocAnnotator, timex.uimaVarSet=Set, dbps.annotator=dbpedia-annotate, pos.model=models/italian-fast.tagger, dbps.extract_types=0, customAnnotatorClass.readability=eu.fbk.dh.tint.readability.ReadabilityAnnotator, timex.considerTemponym=false, annotators=ita_toksent, pos, ita_morpho, ita_lemma, ner, timex.considerTime=true, timex.treeTaggerHome=/home/nadiushka/treetagger/cmd, timex.considerDate=true, customAnnotatorClass.ita_tense=eu.fbk.dh.tint.tense.TenseAnnotator, readability.glossario.parse=yes, readability.language=it, depparse.model=models/parser-model-1.txt.gz, customAnnotatorClass.dbps=eu.fbk.dkm.pikes.twm.LinkingAnnotator, customAnnotatorClass.ita_morpho=eu.fbk.dh.tint.digimorph.annotator.DigiMorphAnnotator, timex.typeSystemHome=desc/type/HeidelTime_TypeSystem.xml, timex.uimaVarTemponym=Temponym, readability.glossario.stanford.annotators=ita_toksent, pos, ita_morpho, ita_lemma, customAnnotatorClass.ml=eu.fbk.dkm.pikes.twm.LinkingAnnotator, ner.model=models/ner-ita-nogpe-noiob_gaz_wikipedia_sloppy.ser.gz, 
customAnnotatorClass.ita_toksent=eu.fbk.dh.tint.tokenizer.annotators.ItalianTokenizerAnnotator, timex.considerDuration=true, timex.considerSet=true, customAnnotatorClass.ita_lemma=eu.fbk.dh.tint.digimorph.annotator.DigiLemmaAnnotator}]
at edu.stanford.nlp.util.MetaClass$ClassFactory.createInstance(MetaClass.java:237)
...
at eu.fbk.dh.tint.runner.TintPipeline.runRaw(TintPipeline.java:103)
at CoreNLPPepper.CoreNLPPepper.CoreNLPManipulator$CoreNLPMapper.testStanfordItalian(CoreNLPManipulator.java:238)
at CoreNLPPepper.CoreNLPPepper.CoreNLPManipulator$CoreNLPMapper.mapSDocument(CoreNLPManipulator.java:144)
at org.corpus_tools.pepper.impl.PepperMapperControllerImpl.map(PepperMapperControllerImpl.java:251)
at org.corpus_tools.pepper.impl.PepperMapperControllerImpl.run(PepperMapperControllerImpl.java:188)
Caused by: java.lang.reflect.InvocationTargetException
at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:62)
at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
at java.lang.reflect.Constructor.newInstance(Constructor.java:423)
at edu.stanford.nlp.util.MetaClass$ClassFactory.createInstance(MetaClass.java:233)
... 16 more
Caused by: java.lang.IllegalArgumentException: resource italian.db not found.
at com.google.common.base.Preconditions.checkArgument(Preconditions.java:146)
at com.google.common.io.Resources.getResource(Resources.java:197)
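at eu.fbk.dh.tint.digimorph.DigiMorph.<init>(DigiMorph.java:58)
at eu.fbk.dh.tint.digimorph.annotator.DigiMorphModel.getInstance(DigiMorphModel.java:14)
at eu.fbk.dh.tint.digimorph.annotator.DigiMorphAnnotator.<init>(DigiMorphAnnotator.java:23)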
I included the tint-digimorph jar into my pom.xml and it gets copied to the snapshot and I also copy it to the dependency folder used by Pepper.
Would you have any idea? Thanks a lot!
Try saving the italian.db file somewhere on your computer and pointing to it with the ita_morpho.model property. You can find the file here: https://github.com/dhfbk/tint/tree/master/tint-digimorph/src/main/resources
I suggest you use the develop branch.
Best, Alessio
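The suggestion above can be sketched in code as follows. This is a minimal illustration under stated assumptions: `DigiMorphModelSketch` is a hypothetical helper (not part of Tint), the path passed in `main` is a placeholder for wherever you saved italian.db, and only the property name `ita_morpho.model` comes from the reply above. Checking that the file exists first gives a clearer error than the one raised while the annotator is being constructed.

```java
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.Properties;

public class DigiMorphModelSketch {

    // Hypothetical helper: copies the base properties and points ita_morpho.model at a local italian.db.
    public static Properties withMorphoModel(Properties base, String dbPath) {
        Path db = Paths.get(dbPath);
        if (!Files.isRegularFile(db)) {
            throw new IllegalArgumentException("italian.db not found at " + db.toAbsolutePath());
        }
        Properties props = new Properties();
        props.putAll(base);
        props.setProperty("ita_morpho.model", dbPath);
        return props;
    }

    public static void main(String[] args) {
        Properties base = new Properties();
        base.setProperty("annotators", "ita_toksent, pos, ita_morpho, ita_lemma, ner");
        // Placeholder path: replace it with the location where you saved italian.db.
        String dbPath = args.length > 0 ? args[0] : "models/italian.db";
        try {
            Properties props = withMorphoModel(base, dbPath);
            System.out.println("ita_morpho.model = " + props.getProperty("ita_morpho.model"));
        } catch (IllegalArgumentException e) {
            System.err.println(e.getMessage());
        }
    }
}
```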
Thank you very much for your quick reply! It worked and now I have a new error: the beginning is the same, but the end is different:
Caused by: java.lang.NoClassDefFoundError: org/mapdb/volume/MappedFileVol
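at eu.fbk.dh.tint.digimorph.DigiMorph.<init>(DigiMorph.java:67)
at eu.fbk.dh.tint.digimorph.annotator.DigiMorphModel.getInstance(DigiMorphModel.java:14)
at eu.fbk.dh.tint.digimorph.annotator.DigiMorphAnnotator.<init>(DigiMorphAnnotator.java:23)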
Maybe you have an idea about this one, too? Thank you!
Try adding this dependency to your pom.xml file
<dependency>
<groupId>org.mapdb</groupId>
<artifactId>mapdb</artifactId>
<version>3.0.1</version>
</dependency>
Best, Alessio
Thank you! I am still getting errors, but will continue trying on my own now. Thanks a lot for being so responsive!
Good morning Alessio,
Sorry to disturb you. I am trying to use the development version. I have compiled tint-runner-1.0-SNAPSHOT.jar and tint-runner-1.0-SNAPSHOT-jar-with-dependencies.jar following the instructions on GitHub. The build succeeds, and the reactor summary is the following:
Reactor Summary:
[INFO]
[INFO] tint ............................................... SUCCESS [ 1.253 s]
[INFO] tint-eval .......................................... SUCCESS [ 1.975 s]
[INFO] tint-resources ..................................... SUCCESS [ 5.017 s]
[INFO] tint-digimorph ..................................... SUCCESS [ 2.199 s]
[INFO] tint-digimorph-annotator ........................... SUCCESS [ 0.380 s]
[INFO] tint-tokenizer ..................................... SUCCESS [ 0.278 s]
[INFO] tint-verb .......................................... SUCCESS [ 0.583 s]
[INFO] tint-readability ................................... SUCCESS [ 1.047 s]
[INFO] tint-derived ....................................... SUCCESS [ 0.153 s]
[INFO] tint-heideltime-annotator .......................... SUCCESS [ 0.343 s]
[INFO] tint-models ........................................ SUCCESS [ 6.418 s]
[INFO] tint-runner ........................................ SUCCESS [ 45.478 s]
[INFO] tint-inverse-digimorph ............................. SUCCESS [ 1.482 s]
[INFO] tint-simplifier .................................... SUCCESS [ 20.939 s]
Now I need to make this jar accessible to my project, so I need to install it into my .m2 folder. I tried to do it with the following command:
mvn --also-make-dependents install:install-file -Dfile=tint-runner/target/tint-runner-1.0-SNAPSHOT.jar -DgroupId=eu.fbk.dh -DartifactId=tint-runner -Dversion=1.0-SNAPSHOT -Dpackaging=jar
but the reactor summary was different:
Reactor Summary:
[INFO]
[INFO] tint ............................................... SUCCESS [ 0.290 s]
[INFO] tint-eval .......................................... SKIPPED
[INFO] tint-resources ..................................... SKIPPED
[INFO] tint-digimorph ..................................... SKIPPED
[INFO] tint-digimorph-annotator ........................... SKIPPED
[INFO] tint-tokenizer ..................................... SKIPPED
[INFO] tint-verb .......................................... SKIPPED
[INFO] tint-readability ................................... SKIPPED
[INFO] tint-derived ....................................... SKIPPED
[INFO] tint-heideltime-annotator .......................... SKIPPED
[INFO] tint-models ........................................ SKIPPED
[INFO] tint-runner ........................................ SKIPPED
[INFO] tint-inverse-digimorph ............................. SKIPPED
[INFO] tint-simplifier .................................... SKIPPED
How can I make all the modules get installed and not only the first one? Thanks a lot!
Did you try a simple "mvn install"? It should work... A.
Thanks a lot, it worked. Now I have another resource not found problem:
resource feat-mappings.txt not found:
edu.stanford.nlp.util.MetaClass$ClassCreationException: MetaClass couldn't create public eu.fbk.dh.tint.digimorph.annotator.DigiLemmaAnnotator(java.lang.String,java.util.Properties) with args [ita_lemma, {readability.glossario.use=no, timex.uimaVarLanguage=Language, timex.uimaVarDuration=Duration, timex.chineseTokenizerPath=, ner.useSUTime=0, timex.typeSystemHome_DKPro=desc/type/DKPro_TypeSystem.xml, dbps.first_confidence=0.5, timex.uimaVarTime=Time, timex.uimaVarTypeToProcess=Type, dbps.min_confidence=0.3, customAnnotatorClass.keyphrase=eu.fbk.dh.kd.annotator.DigiKdAnnotator, customAnnotatorClass.fake_dep=eu.fbk.dkm.pikes.depparseannotation.StanfordToConllDepsAnnotator, timex.uimaVarDate=Date, dbps.address=http://spotlight.sztaki.hu:2230/rest, customAnnotatorClass.timex=eu.fbk.dh.tint.heideltime.annotator.HeidelTimeAnnotator, customAnnotatorClass.geoloc=eu.fbk.dh.tint.geoloc.annotator.GeolocAnnotator, timex.uimaVarSet=Set, dbps.annotator=dbpedia-annotate, pos.model=models/italian-fast.tagger, dbps.extract_types=0, customAnnotatorClass.readability=eu.fbk.dh.tint.readability.ReadabilityAnnotator, timex.considerTemponym=false, annotators=ita_toksent, pos, ita_morpho, ita_lemma, ner, timex.considerTime=true, timex.treeTaggerHome=/home/nadiushka/treetagger/cmd, timex.considerDate=true, customAnnotatorClass.ita_tense=eu.fbk.dh.tint.tense.TenseAnnotator, readability.glossario.parse=yes, readability.language=it, depparse.model=models/parser-model-1.txt.gz, customAnnotatorClass.dbps=eu.fbk.dkm.pikes.twm.LinkingAnnotator, customAnnotatorClass.ita_morpho=eu.fbk.dh.tint.digimorph.annotator.DigiMorphAnnotator, timex.typeSystemHome=desc/type/HeidelTime_TypeSystem.xml, timex.uimaVarTemponym=Temponym, readability.glossario.stanford.annotators=ita_toksent, pos, ita_morpho, ita_lemma, customAnnotatorClass.ml=eu.fbk.dkm.pikes.twm.LinkingAnnotator, ner.model=models/ner-ita-nogpe-noiob_gaz_wikipedia_sloppy.ser.gz, 
customAnnotatorClass.ita_toksent=eu.fbk.dh.tint.tokenizer.annotators.ItalianTokenizerAnnotator, timex.considerDuration=true, timex.considerSet=true, customAnnotatorClass.ita_lemma=eu.fbk.dh.tint.digimorph.annotator.DigiLemmaAnnotator, ita_morpho.model=/home/nadiushka/pepper/CoreNLPPepper/italian.db}]
...
Caused by: java.lang.IllegalArgumentException: resource feat-mappings.txt not found.
at com.google.common.base.Preconditions.checkArgument(Preconditions.java:146)
at com.google.common.io.Resources.getResource(Resources.java:197)
at eu.fbk.dh.tint.digimorph.annotator.GuessModel.<init>(GuessModel.java:235)
at eu.fbk.dh.tint.digimorph.annotator.GuessModelInstance.<init>(GuessModelInstance.java:18)
at eu.fbk.dh.tint.digimorph.annotator.GuessModelInstance.getInstance(GuessModelInstance.java:23)
at eu.fbk.dh.tint.digimorph.annotator.DigiLemmaAnnotator.<init>(DigiLemmaAnnotator.java:87)
I saved it on my computer, but to what variable in the default-config.properties file should I assign its path? Thank you!
Yes, I guess you can assign the path to the file in the properties file. Best, Alessio
Yes, but what is the name of the variable to assign it to?
Try moving that file into the src/main/resources folder of your project. A.
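The stack trace shows the lookup going through Guava's Resources.getResource, which searches the classpath rather than the file system, so a file under src/main/resources is found once Maven has copied it into the build output. To check whether the file is actually visible at run time, something like the following sketch may help. It uses only the JDK, and the class name ResourceProbe is made up for illustration.

```java
import java.net.URL;

// Diagnostic sketch: checks whether a resource name can be resolved on the classpath.
final class ResourceProbe {

    /** Returns the URL of a classpath resource, or null if it is not on the classpath. */
    static URL locate(String resourceName) {
        // Prefer the thread context class loader, falling back to this class's own loader
        ClassLoader loader = Thread.currentThread().getContextClassLoader();
        if (loader == null) {
            loader = ResourceProbe.class.getClassLoader();
        }
        return loader.getResource(resourceName);
    }

    public static void main(String[] args) {
        // Default resource name is the one from the error message
        String name = args.length > 0 ? args[0] : "feat-mappings.txt";
        URL url = locate(name);
        System.out.println(url == null
                ? name + " is NOT on the classpath"
                : name + " found at " + url);
    }
}
```

If the file is reported as missing even though it sits in src/main/resources, it is worth checking that the project was rebuilt after the file was added.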
You were right, there was no property for the guess model. I've updated the code, you can now use ita_lemma.guess_model and specify the file in the properties file. Just pull the repository on the develop branch.
Best, Alessio
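Before handing the properties to the pipeline, it can be useful to verify that every path-valued setting points at a file that exists. The sketch below does that with the JDK only. ConfigCheck and the /path/to/... values are placeholders, and treating ita_morpho.model and ita_lemma.guess_model as plain file system paths is an assumption based on how they are used in this thread.

```java
import java.io.IOException;
import java.io.StringReader;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.ArrayList;
import java.util.List;
import java.util.Properties;

// Sketch: flags configuration keys whose value is absent or is not a readable regular file.
final class ConfigCheck {

    /** Returns the keys among pathKeys that are missing or do not point at a readable regular file. */
    static List<String> unreadable(Properties props, String... pathKeys) {
        List<String> bad = new ArrayList<>();
        for (String key : pathKeys) {
            String value = props.getProperty(key);
            if (value == null) {
                bad.add(key);
                continue;
            }
            Path path = Paths.get(value);
            if (!Files.isRegularFile(path) || !Files.isReadable(path)) {
                bad.add(key);
            }
        }
        return bad;
    }

    public static void main(String[] args) throws IOException {
        Properties props = new Properties();
        // Placeholder paths: replace them with the real locations on your machine
        props.load(new StringReader(
                "ita_morpho.model=/path/to/italian.db\n"
              + "ita_lemma.guess_model=/path/to/feat-mappings.txt\n"));
        System.out.println("Unreadable settings: "
                + unreadable(props, "ita_morpho.model", "ita_lemma.guess_model"));
    }
}
```

An empty list means both files were found, so any remaining error is more likely to come from the file contents or the library versions than from the paths.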
Thank you.
Good afternoon Alessio,
Sorry to disturb you again, but after pulling the new development version I am now getting the following error:
Caused by: edu.stanford.nlp.util.MetaClass$ClassCreationException: MetaClass couldn't create public eu.fbk.dh.tint.digimorph.annotator.DigiMorphAnnotator(java.lang.String,java.util.Properties) with args [ita_morpho, {readability.glossario.use=no, timex.uimaVarLanguage=Language, timex.uimaVarDuration=Duration, timex.chineseTokenizerPath=, ner.useSUTime=0, timex.typeSystemHome_DKPro=desc/type/DKPro_TypeSystem.xml, dbps.first_confidence=0.5, timex.uimaVarTime=Time, timex.uimaVarTypeToProcess=Type, ita_lemma.guess_model=/home/nadiushka/pepper/CoreNLPPepper/feat-mappings.txt, dbps.min_confidence=0.3, customAnnotatorClass.keyphrase=eu.fbk.dh.kd.annotator.DigiKdAnnotator, customAnnotatorClass.fake_dep=eu.fbk.dkm.pikes.depparseannotation.StanfordToConllDepsAnnotator, timex.uimaVarDate=Date, dbps.address=http://spotlight.sztaki.hu:2230/rest, customAnnotatorClass.timex=eu.fbk.dh.tint.heideltime.annotator.HeidelTimeAnnotator, customAnnotatorClass.geoloc=eu.fbk.dh.tint.geoloc.annotator.GeolocAnnotator, timex.uimaVarSet=Set, dbps.annotator=dbpedia-annotate, pos.model=models/italian-fast.tagger, dbps.extract_types=0, customAnnotatorClass.readability=eu.fbk.dh.tint.readability.ReadabilityAnnotator, timex.considerTemponym=false, annotators=ita_toksent, pos, ita_morpho, ita_lemma, ner, timex.considerTime=true, timex.treeTaggerHome=/home/nadiushka/treetagger/cmd, customAnnotatorClass.ita_tense=eu.fbk.dh.tint.tense.TenseAnnotator, timex.considerDate=true, readability.glossario.parse=yes, readability.language=it, depparse.model=models/parser-model-1.txt.gz, customAnnotatorClass.dbps=eu.fbk.dkm.pikes.twm.LinkingAnnotator, customAnnotatorClass.ita_morpho=eu.fbk.dh.tint.digimorph.annotator.DigiMorphAnnotator, timex.typeSystemHome=desc/type/HeidelTime_TypeSystem.xml, timex.uimaVarTemponym=Temponym, readability.glossario.stanford.annotators=ita_toksent, pos, ita_morpho, ita_lemma, customAnnotatorClass.ml=eu.fbk.dkm.pikes.twm.LinkingAnnotator, 
ner.model=models/ner-ita-nogpe-noiob_gaz_wikipedia_sloppy.ser.gz, customAnnotatorClass.ita_toksent=eu.fbk.dh.tint.tokenizer.annotators.ItalianTokenizerAnnotator, timex.considerDuration=true, timex.considerSet=true, customAnnotatorClass.ita_lemma=eu.fbk.dh.tint.digimorph.annotator.DigiLemmaAnnotator, ita_morpho.model=italian.db}]
at edu.stanford.nlp.util.MetaClass$ClassFactory.createInstance(MetaClass.java:237)
at edu.stanford.nlp.util.MetaClass.createInstance(MetaClass.java:382)
at edu.stanford.nlp.pipeline.AnnotatorImplementations.custom(AnnotatorImplementations.java:141)
at edu.stanford.nlp.pipeline.StanfordCoreNLP.lambda$null$28(StanfordCoreNLP.java:583)
at edu.stanford.nlp.util.Lazy$3.compute(Lazy.java:126)
at edu.stanford.nlp.util.Lazy.get(Lazy.java:31)
at edu.stanford.nlp.pipeline.AnnotatorPool.get(AnnotatorPool.java:149)
at edu.stanford.nlp.pipeline.StanfordCoreNLP.
It is similar to the one I already had with italian.db, but not exactly the same. I have tried saving the italian.db file in different places and tried these 3 configurations:
1 ita_morpho.model=/home/nadiushka/pepper/CoreNLPPepper/italian.db
2 ita_morpho.model=models/italian.db
3 ita_morpho.model=italian.db
But none of them has worked. Would you have any suggestions on how to address this problem? Thank you in advance!
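One thing that may be worth ruling out is where the two relative values end up pointing. If the setting is read as a file system path (an assumption, since the thread does not show how Tint resolves it), relative values are interpreted against the working directory of the JVM, which is not necessarily the project folder. The sketch below prints where each of the three values would resolve. PathResolution is a made-up name and only JDK calls are used.

```java
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;

// Sketch: shows how configured paths resolve against a base directory.
final class PathResolution {

    /** Resolves a configured value against baseDir; an absolute value is returned as is (normalized). */
    static Path resolve(Path baseDir, String configured) {
        return baseDir.resolve(configured).normalize();
    }

    public static void main(String[] args) {
        // user.dir is the directory the JVM was started from
        Path workingDir = Paths.get(System.getProperty("user.dir"));
        String[] candidates = {
            "/home/nadiushka/pepper/CoreNLPPepper/italian.db",
            "models/italian.db",
            "italian.db"
        };
        for (String candidate : candidates) {
            Path resolved = resolve(workingDir, candidate);
            System.out.println(candidate + " -> " + resolved
                    + " (exists: " + Files.exists(resolved) + ")");
        }
    }
}
```

If the absolute path exists and the error persists, the path is probably not the cause and the rest of the stack trace would be needed to say more.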
First, thank you for your great work on the Italian language for CoreNLP. I'm trying the NER and POS tagger. My simplest configuration for CoreNLP is the following:
Entities like DATE, LOC, PER are being recognized. Part of Speech tags as well. I have seen that there are other annotators like Geoloc, HeidelTime, customized Lemma, etc.
For the given configuration, this is my pipeline output:
so the custom annotators like ita_lemma and ita_toksent are registered, but I'm not sure that they are actually loaded instead of the default ones.
Thank you.