dhfbk / tint

The Italian NLP Tool
http://tint.fbk.eu
GNU General Public License v3.0

Packaging Tint Service in Docker #35

Open Myrmex opened 3 years ago

Myrmex commented 3 years ago

It would be nice to have a ready-to-run Docker image with the Tint server.

If you like the idea, you can follow along: I've made a quick test, though I may be missing something obvious.

The Dockerfile is:

FROM anapsix/alpine-java
EXPOSE 8012
COPY ./tint /tint/
ENTRYPOINT ["/bin/sh", "/tint/tint-server.sh"]

I uncompressed the downloaded Tint package into a tint directory in the same folder as the Dockerfile.

I then build it with docker build . -t tint, and run it with docker container run tint -p 50498:8012 (here I remap container port 8012 to port 50498 on my host).
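One thing worth checking in the run step above is option order. The Docker CLI synopsis is docker run [OPTIONS] IMAGE [COMMAND] [ARG...], so anything written after the image name is handed to the container's entrypoint instead of being interpreted by Docker; a -p placed after tint would therefore not publish the port. Below is a minimal sketch of the reordered command, reusing the image tag and ports from this post. The variable names and the DRY_RUN switch are illustrative assumptions, and by default the script only prints the command:

```shell
#!/bin/sh
# Values taken from the post; the variable names are illustrative.
IMAGE=tint
HOST_PORT=50498
CONTAINER_PORT=8012

# Options such as -p go before the image name.
run_cmd="docker container run -p ${HOST_PORT}:${CONTAINER_PORT} ${IMAGE}"

# DRY_RUN is a hypothetical switch for this sketch: 1 (the default) only prints.
if [ "${DRY_RUN:-1}" = "1" ]; then
    echo "$run_cmd"
else
    $run_cmd
fi
```

Running it with DRY_RUN=0 would start the container with the port published, assuming the image was built as described.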

The server seems to start, even though there are some SLF4J binding warnings at the very beginning of the startup process. Yet, when I try a request like http://localhost:50498/tint?text=prova%20di%20annotazione, I just get a connection refused error. Could anyone help me rule out that this is an error on the Tint side?
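To separate a port-publishing problem from a Tint problem, it can help to look at what Docker actually mapped and then repeat the request from the host. This is a rough sketch, assuming curl is available on the host. The encode_spaces helper is hypothetical and only handles spaces, and the DRY_RUN switch is an illustrative assumption that defaults to printing the commands without running them:

```shell
#!/bin/sh
# Host port and sample text taken from the post.
HOST_PORT=50498
TEXT="prova di annotazione"

# Hypothetical helper: percent-encodes spaces only, enough for this sample text.
encode_spaces() {
    printf '%s' "$1" | sed 's/ /%20/g'
}

url="http://localhost:${HOST_PORT}/tint?text=$(encode_spaces "$TEXT")"

if [ "${DRY_RUN:-1}" = "1" ]; then
    # Print the checks instead of running them.
    echo "docker ps --format '{{.Names}} {{.Ports}}'"
    echo "curl -sS '$url'"
else
    # A published port looks like 0.0.0.0:50498->8012/tcp;
    # a bare 8012/tcp means the port is exposed but not published.
    docker ps --format '{{.Names}} {{.Ports}}'
    curl -sS "$url"
fi
```

If the Ports column shows no host mapping, the connection refused error would come from Docker's port publishing and not from the Tint server itself.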

Here is the server log.

SLF4J: Class path contains multiple SLF4J bindings.
SLF4J: Found binding in [jar:file:/tint/lib/slf4j-simple-1.7.21.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: Found binding in [jar:file:/tint/lib/slf4j-log4j12-1.7.25.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an explanation.
SLF4J: Actual binding is of type [org.slf4j.impl.SimpleLoggerFactory]
[main] INFO eu.fbk.dh.tint.runner.TintServer - starting 0.0.0.0 8012 (Mon Aug 30 19:17:54 GMT 2021)...
[main] INFO edu.stanford.nlp.pipeline.StanfordCoreNLP - Registering annotator keyphrase with class eu.fbk.dh.kd.annotator.DigiKDAnnotator
[main] INFO edu.stanford.nlp.pipeline.StanfordCoreNLP - Registering annotator ita_verb with class eu.fbk.dh.tint.verb.VerbAnnotator
[main] INFO edu.stanford.nlp.pipeline.StanfordCoreNLP - Registering annotator fake_dep with class eu.fbk.dkm.pikes.depparseannotation.StanfordToConllDepsAnnotator
[main] INFO edu.stanford.nlp.pipeline.StanfordCoreNLP - Registering annotator ita_semafor with class eu.fbk.fcw.semafortranslate.SemaforTranslateAnnotator
[main] INFO edu.stanford.nlp.pipeline.StanfordCoreNLP - Registering annotator geoloc with class eu.fbk.dh.tint.geoloc.annotator.GeolocAnnotator
[main] INFO edu.stanford.nlp.pipeline.StanfordCoreNLP - Registering annotator ita_upos with class eu.fbk.fcw.pos.UPosAnnotator
[main] INFO edu.stanford.nlp.pipeline.StanfordCoreNLP - Registering annotator stem with class eu.fbk.fcw.stemmer.corenlp.StemAnnotator
[main] INFO edu.stanford.nlp.pipeline.StanfordCoreNLP - Registering annotator ita_derivation with class eu.fbk.dh.tint.derived.DerivationAnnotator
[main] INFO edu.stanford.nlp.pipeline.StanfordCoreNLP - Registering annotator readability with class eu.fbk.dh.tint.readability.ReadabilityAnnotator
[main] INFO edu.stanford.nlp.pipeline.StanfordCoreNLP - Registering annotator heideltime with class eu.fbk.dh.tint.heideltime.annotator.HeidelTimeAnnotator
[main] INFO edu.stanford.nlp.pipeline.StanfordCoreNLP - Registering annotator dbps with class eu.fbk.dkm.pikes.twm.LinkingAnnotator
[main] INFO edu.stanford.nlp.pipeline.StanfordCoreNLP - Registering annotator ita_morpho with class eu.fbk.dh.tint.digimorph.annotator.DigiMorphAnnotator
[main] INFO edu.stanford.nlp.pipeline.StanfordCoreNLP - Registering annotator ita_splitter with class eu.fbk.dh.tint.splitter.SplitterAnnotator
[main] INFO edu.stanford.nlp.pipeline.StanfordCoreNLP - Registering annotator ml with class eu.fbk.dkm.pikes.twm.LinkingAnnotator
[main] INFO edu.stanford.nlp.pipeline.StanfordCoreNLP - Registering annotator ita_toksent with class eu.fbk.dh.tint.tokenizer.annotators.ItalianTokenizerAnnotator
[main] INFO edu.stanford.nlp.pipeline.StanfordCoreNLP - Registering annotator ita_lemma with class eu.fbk.dh.tint.digimorph.annotator.DigiLemmaAnnotator
[main] INFO edu.stanford.nlp.pipeline.StanfordCoreNLP - Adding annotator ita_toksent
[main] INFO eu.fbk.dh.tint.tokenizer.ItalianTokenizer - Loaded 37 normalization rules
[main] INFO eu.fbk.dh.tint.tokenizer.ItalianTokenizer - Loaded 5 sentence splitting rules
[main] INFO eu.fbk.dh.tint.tokenizer.ItalianTokenizer - Loaded 2 newline chars
[main] INFO eu.fbk.dh.tint.tokenizer.ItalianTokenizer - Loaded 6 token splitting rules
[main] INFO eu.fbk.dh.tint.tokenizer.ItalianTokenizer - Loaded 15 regular expressions
[main] INFO eu.fbk.dh.tint.tokenizer.ItalianTokenizer - Loaded 293 abbreviations
[main] INFO edu.stanford.nlp.pipeline.StanfordCoreNLP - Adding annotator pos
[main] INFO edu.stanford.nlp.tagger.maxent.MaxentTagger - warning: no language set, no open-class tags specified, and no closed-class tags specified; assuming ALL tags are open class tags
[main] INFO edu.stanford.nlp.tagger.maxent.MaxentTagger - Loading POS tagger from models/italian-big.tagger ... done [0.3 sec].
[main] INFO edu.stanford.nlp.pipeline.StanfordCoreNLP - Adding annotator ita_upos
[main] INFO edu.stanford.nlp.pipeline.StanfordCoreNLP - Adding annotator ita_splitter
[main] INFO edu.stanford.nlp.pipeline.StanfordCoreNLP - Adding annotator ita_morpho
[main] INFO edu.stanford.nlp.pipeline.StanfordCoreNLP - Adding annotator ita_lemma
[main] INFO eu.fbk.dh.tint.digimorph.annotator.GuessModelInstance - Loading guess model for lemma
[main] INFO edu.stanford.nlp.pipeline.StanfordCoreNLP - Adding annotator ner
[main] INFO edu.stanford.nlp.sequences.SeqClassifierFlags - sutime.binders=0
[main] INFO edu.stanford.nlp.sequences.SeqClassifierFlags - Unknown property: |sutime.binders|
[main] INFO edu.stanford.nlp.sequences.SeqClassifierFlags - sutime.rules=models/sutime/lists.sutime.italian.rules,models/sutime/sutime.italian.rules,models/sutime/defs.sutime.txt
[main] INFO edu.stanford.nlp.sequences.SeqClassifierFlags - Unknown property: |sutime.rules|
[main] INFO edu.stanford.nlp.sequences.SeqClassifierFlags - Unknown property: |sutime.binders|
[main] INFO edu.stanford.nlp.sequences.SeqClassifierFlags - Unknown property: |sutime.rules|
[main] INFO edu.stanford.nlp.ie.AbstractSequenceClassifier - Loading classifier from models/italian-ner-wikinews.ser.gz ... done [0.8 sec].
[main] INFO edu.stanford.nlp.time.TimeExpressionExtractorImpl - Using following SUTime rules: models/sutime/lists.sutime.italian.rules,models/sutime/sutime.italian.rules,models/sutime/defs.sutime.txt
[main] INFO edu.stanford.nlp.ling.tokensregex.types.Expressions - Unknown variable: tokens
[main] INFO edu.stanford.nlp.ling.tokensregex.types.Expressions - Unknown variable: tokens
[main] INFO edu.stanford.nlp.pipeline.NERCombinerAnnotator - numeric classifiers: true; SUTime: true DocDateAnnotator[presentDate=2021-08-30]; fine grained: false
[main] INFO edu.stanford.nlp.pipeline.StanfordCoreNLP - Adding annotator depparse
[main] INFO edu.stanford.nlp.parser.nndep.DependencyParser - Loaded TreebankLanguagePack: edu.stanford.nlp.trees.PennTreebankLanguagePack
[main] INFO edu.stanford.nlp.parser.nndep.DependencyParser - Loading depparse model: models/italian-depparse-split.txt.gz ... Time elapsed: 5.5 sec
[main] INFO edu.stanford.nlp.parser.nndep.Classifier - PreComputed 20000 vectors, elapsed Time: 0.765 sec
[main] INFO edu.stanford.nlp.parser.nndep.DependencyParser - Initializing dependency parser ... done [6.3 sec].
[main] INFO edu.stanford.nlp.pipeline.StanfordCoreNLP - Adding annotator ita_verb
[main] INFO eu.fbk.dh.tint.runner.TintServer - Pipeline loaded
Aug 30, 2021 7:18:05 PM org.glassfish.grizzly.http.server.NetworkListener start
INFO: Started listener bound to [0.0.0.0:8012]
Aug 30, 2021 7:18:05 PM org.glassfish.grizzly.http.server.HttpServer start
INFO: [HttpServer] Started.