SIMPATICOProject / SimpaticoTAEServer

This is the official repository of the Text Adaptation Engine server of Simpatico.
GNU Lesser General Public License v3.0

Docker Version exits on Timeout Error #8

Closed breningham closed 6 years ago

breningham commented 6 years ago

Hi, I have just noticed that the version of the TAE that we have running for Sheffield periodically exits. According to the logs, it crashes just after loading the Italian simplifier. Please see the log below.

simpatico@simpatico-machine:~/SimpaticoTAEServer$ docker run -v $(pwd)/docker-data:/app/data --name tae_server simpatico_tae
Loading default properties from tagger ../data/spanish.tagger
Loading default properties from tagger ../data/english.tagger
Loading default properties from tagger ../data/galician.tagger
Loading default properties from tagger ../data/italian.tagger
Reading POS tagger model from ../data/spanish.tagger ... Reading POS tagger model from ../data/italian.tagger ... warning: no language set, no open-class tags specified, and no closed-class tags specified; assuming ALL tags are open class tags
Reading POS tagger model from ../data/english.tagger ... Reading POS tagger model from ../data/galician.tagger ... done [0.7 sec].
done [0.7 sec].
done [0.9 sec].
Using TensorFlow backend.
done [1.6 sec].
/usr/local/lib/python2.7/dist-packages/sklearn/ensemble/weight_boosting.py:29: DeprecationWarning: numpy.core.umath_tests is an internal NumPy module and should not be imported. It will be removed in a future NumPy release.
  from numpy.core.umath_tests import inner1d
Loading resources
Loading English simplifier
Loading Spanish simplifier
Loading Italian simplifier
Traceback (most recent call last):
  File "Run_TCP_Syntactic_Simplifier_Server.py", line 72, in <module>
    ss_eng_it = getItalianSyntacticSimplifier(resources)
  File "Run_TCP_Syntactic_Simplifier_Server.py", line 30, in getItalianSyntacticSimplifier
    stfd_parser = Parser_it(resources["corenlp_dir"], resources["prop_it"])
  File "simpatico_ss/simpatico_ss_it/util.py", line 12, in __init__
    self.corenlp = StanfordCoreNLP(corenlp_dir, memory="4g", properties=properties)
  File "simpatico_ss/corenlp/corenlp.py", line 347, in __init__
    self._spawn_corenlp()
  File "simpatico_ss/corenlp/corenlp.py", line 336, in _spawn_corenlp
    self.corenlp.expect("\nNLP> ")
  File "/usr/local/lib/python2.7/dist-packages/pexpect/spawnbase.py", line 341, in expect
    timeout, searchwindowsize, async_)
  File "/usr/local/lib/python2.7/dist-packages/pexpect/spawnbase.py", line 369, in expect_list
    return exp.expect_loop(timeout)
  File "/usr/local/lib/python2.7/dist-packages/pexpect/expect.py", line 119, in expect_loop
    return self.timeout(e)
  File "/usr/local/lib/python2.7/dist-packages/pexpect/expect.py", line 82, in timeout
    raise TIMEOUT(msg)
pexpect.exceptions.TIMEOUT: Timeout exceeded.
<pexpect.pty_spawn.spawn object at 0x7fb4f76c3450>
command: /usr/bin/java
args: ['/usr/bin/java', '-Xmx4g', '-cp', '/app/data/data/stanford-corenlp-full-2016-10-31/stanford-corenlp-3.7.0-models.jar:/app/data/data/stanford-corenlp-full-2016-10-31/javax.json.jar:/app/data/data/stanford-corenlp-full-2016-10-31/joda-time-2.9-sources.jar:/app/data/data/stanford-corenlp-full-2016-10-31/xom.jar:/app/data/data/stanford-corenlp-full-2016-10-31/tint-runner-1.0-SNAPSHOT-jar-with-dependencies.jar:/app/data/data/stanford-corenlp-full-2016-10-31/stanford-spanish-corenlp-2016-10-31-models.jar:/app/data/data/stanford-corenlp-full-2016-10-31/slf4j-simple.jar:/app/data/data/stanford-corenlp-full-2016-10-31/stanford-corenlp-3.7.0-javadoc.jar:/app/data/data/stanford-corenlp-full-2016-10-31/stanford-corenlp-3.7.0.jar:/app/data/data/stanford-corenlp-full-2016-10-31/xom-1.2.10-src.jar:/app/data/data/stanford-corenlp-full-2016-10-31/slf4j-api.jar:/app/data/data/stanford-corenlp-full-2016-10-31/ejml-0.23.jar:/app/data/data/stanford-corenlp-full-2016-10-31/stanford-corenlp-3.7.0-sources.jar:/app/data/data/stanford-corenlp-full-2016-10-31/jollyday-0.4.9-sources.jar:/app/data/data/stanford-corenlp-full-2016-10-31/joda-time.jar:/app/data/data/stanford-corenlp-full-2016-10-31/javax.json-api-1.0-sources.jar:/app/data/data/stanford-corenlp-full-2016-10-31/jollyday.jar:/app/data/data/stanford-corenlp-full-2016-10-31/protobuf.jar', 'edu.stanford.nlp.pipeline.StanfordCoreNLP', '-props', '/app/data/data/italian.myproperties.properties']
buffer (last 100 chars): ' log4j2 configuration file found. Using default configuration: logging only errors to the console.\r\n'
before (last 100 chars): ' log4j2 configuration file found. Using default configuration: logging only errors to the console.\r\n'
after: <class 'pexpect.exceptions.TIMEOUT'>
match: None
match_index: None
exitstatus: None
flag_eof: False
pid: 336
child_fd: 18
closed: False
timeout: 30
delimiter: <class 'pexpect.exceptions.EOF'>
logfile: None
logfile_read: None
logfile_send: None
maxread: 8192
ignorecase: False
searchwindowsize: 80
delaybeforesend: 0.05
delayafterclose: 0.1
delayafterterminate: 0.1
searcher: searcher_re:
    0: re.compile('\nNLP> ')
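The traceback shows a `pexpect.TIMEOUT` raised 30 seconds after spawning CoreNLP (note `timeout: 30` in the dump above), i.e. the JVM simply did not print its `NLP> ` prompt in time. One generic mitigation is to retry the spawn with a pause between attempts. A minimal, dependency-free sketch (the `spawn_fn` callable stands in for whatever starts CoreNLP and waits for the prompt; `TimeoutError` stands in for `pexpect.exceptions.TIMEOUT` — this is not the project's actual API):

```python
import time

def spawn_with_retry(spawn_fn, attempts=3, delay=0.5):
    """Call spawn_fn until it succeeds, retrying on TimeoutError.

    spawn_fn stands in for the code that spawns CoreNLP and waits for
    its 'NLP> ' prompt; pexpect raises pexpect.exceptions.TIMEOUT
    there, modelled here with the built-in TimeoutError.
    """
    last_exc = None
    for _ in range(attempts):
        try:
            return spawn_fn()
        except TimeoutError as exc:
            last_exc = exc
            time.sleep(delay)  # give the JVM more time before retrying
    raise last_exc

# Example: a spawn that only produces the prompt on the third attempt.
calls = {"n": 0}
def flaky_spawn():
    calls["n"] += 1
    if calls["n"] < 3:
        raise TimeoutError("no 'NLP> ' prompt yet")
    return "NLP> "
```

Raising the `timeout=` argument of the underlying `expect()` call would achieve the same thing with less code, if the wrapper exposes it.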
carolscarton commented 6 years ago

Hi Brendan, I will need to check this later: I have not set up the Docker configuration myself, so I don't yet know what is causing the problem.

For now, since in Sheffield you will not use the Italian or Spanish servers, can you comment out lines 68, 69, 71 and 72 of the Run_TCP_Syntactic_Simplifier_Server.py file? This should make the tool load only the English Syntactic Simplifier Server.

Thanks!

breningham commented 6 years ago

That gets it running, but now I have realised that the Dockerfile doesn't start the main TAE server. Is this by design? How can I configure it to work with the SAE?

carolscarton commented 6 years ago

@mirkoperillo you uploaded the docker code for this tool, right? Could you please check Brendan's request? Many thanks!

mirkoperillo commented 6 years ago

Hi,

My mistake, I forgot to add the instructions to run the Run_TAE_Simplification_Server script. You have to add these instructions at the end of docker-entrypoint.sh, here:

https://github.com/SIMPATICOProject/SimpaticoTAEServer/blob/7f9daa298a539875052a00c4aa76905b42813179/docker-configs/docker-entrypoint.sh#L12

cd /app/main_TAE_server;
python -u Run_TAE_Simplification_Server.py &
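For context, the tail of docker-entrypoint.sh would then look roughly like this. Only the `cd` and `python` lines come from the comment above; the surrounding lines, including the final `wait`, are assumptions about the rest of the script:

```shell
#!/bin/sh
# docker-entrypoint.sh (tail) -- sketch; only the main_TAE_server lines
# are from the comment above, the rest is assumed context.

# ...the other simplifier servers are started above...

cd /app/main_TAE_server
python -u Run_TAE_Simplification_Server.py &

# Keep a foreground process alive so the container does not exit
# while the backgrounded servers are still running.
wait
```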

@breningham, I know you have a working configuration of this stuff; could you please try these modifications?

breningham commented 6 years ago

Hi @mirkoperillo

My main problem was that the syntactic server binds only to localhost, and therefore it is not reachable externally (and the ports are not exposed).

Furthermore, to increase reliability, I have replaced the docker configuration with one based on supervisord. This way, when a script stops running for any reason, supervisord automatically restarts it.

I will share my config when I get the chance.

breningham commented 6 years ago
FROM ubuntu:18.04

RUN apt-get update && \
    apt-get install -y openjdk-8-jdk &&\
    apt-get install -y python-pip python3-pip &&\
    apt-get install -y supervisor &&\
    apt-get clean

RUN pip install --upgrade pip

RUN mkdir -p /var/log/supervisor

RUN pip install kenlm &&\
    pip install gensim &&\
    pip install nltk==3.2.5 &&\
    pip install sklearn &&\
    pip install keras &&\
    pip install numpy &&\
    pip install h5py &&\
    pip install tensorflow==1.3.0 &&\
    pip install langdetect &&\
    pip install pexpect &&\
    pip install unidecode &&\
    pip install grammar_check
WORKDIR /app
COPY . /app

# fix sources relative to nltk v3.2.5
RUN sed -i -e 's/from nltk.tokenize/from nltk.tokenize.stanford/' syntactic_simplification_server/simpatico_ss/simpatico_ss/simplify.py
RUN sed -i -e 's/from nltk.tokenize/from nltk.tokenize.stanford/' syntactic_simplification_server/simpatico_ss/simpatico_ss_es/simplify.py
RUN sed -i -e 's/from nltk.tokenize/from nltk.tokenize.stanford/' syntactic_simplification_server/simpatico_ss/simpatico_ss_gl/simplify.py
RUN sed -i -e 's/from nltk.tokenize/from nltk.tokenize.stanford/' syntactic_simplification_server/simpatico_ss/simpatico_ss_it/simplify.py

# copy resource file
COPY ./docker-configs/resources.txt /app/resources.txt
COPY ./docker-configs/supervisord.conf /etc/supervisor/conf.d/supervisord.conf

EXPOSE 8080 2020 3030 4040 5050 1414 1515

CMD ["/usr/bin/supervisord"]
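With this Dockerfile, a build-and-run sequence might look like the following. The image and container names reuse those from the original log; the host-side port mappings are assumptions. Note that `EXPOSE` only documents the ports: to reach the servers from outside the container (the binding problem mentioned above), they must be published with `-p` at run time:

```shell
# Build the image from the repository root.
docker build -t simpatico_tae .

# Run with the data volume and publish the service ports on the host;
# EXPOSE alone does not map them, -p does. Port choices are examples.
docker run -d --name tae_server \
    -v "$(pwd)/docker-data:/app/data" \
    -p 8080:8080 -p 1414:1414 -p 1515:1515 \
    simpatico_tae
```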

And the contents of supervisord.conf:

[supervisord]
;logfile=/var/app/logs/ ; (main log file;default $CWD/supervisord.log)
logfile_maxbytes=50MB        ; (max main logfile bytes b4 rotation;default 50MB)
logfile_backups=5           ; (num of main logfile rotation backups;default 10)
loglevel=debug                ; (log level;default info; others: debug,warn,trace)
pidfile=/tmp/supervisord.pid ; (supervisord pidfile;default supervisord.pid)
nodaemon=true                ; (start in foreground if true;default false)
minfds=1024                  ; (min. avail startup file descriptors;default 1024)
minprocs=200                 ; (min. avail process descriptors;default 200)

[program:englishPostTagger]
priority=5
directory=/app/data/stanford-postagger-full-2015-04-20
command=java -mx2G -cp "*:lib/*:models/*" edu.stanford.nlp.tagger.maxent.MaxentTaggerServer -model ../data/english.tagger -port 2020
user=root
autostart=true
autorestart=true

[program:lexicalServer]
priority=10
directory=/app/lexical_simplification_server
command=python -u Run_TCP_Lexical_Simplifier_Server.py
user=root
autostart=true
autorestart=true

[program:syntacticalServer]
priority=11
directory=/app/syntactic_simplification_server
command=python -u Run_TCP_Syntactic_Simplifier_Server.py
user=root
autostart=true
autorestart=true

[program:mainTaeServer]
priority=12
directory=/app/main_TAE_server
command=python -u Run_TAE_Simplification_Server.py
user=root
autostart=true
autorestart=true

Also, it's worth noting that the timeout error still occurs, but thanks to this modification, if/when it happens the process is automatically restarted and bam! it works again.
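Since supervisord only restarts a program when its process exits, a lightweight way to confirm that a restarted server is actually accepting connections again is to probe its TCP port. A small stdlib-only helper (hypothetical, not part of the repo; the port would be one of those listed under `EXPOSE` above):

```python
import socket
import time

def wait_for_port(host, port, timeout=30.0, interval=0.5):
    """Return True once a TCP connection to host:port succeeds,
    or False if `timeout` seconds pass without a successful connect."""
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        try:
            with socket.create_connection((host, port), timeout=interval):
                return True
        except OSError:
            time.sleep(interval)  # server not up yet; try again shortly
    return False
```

For example, `wait_for_port("localhost", 8080)` could gate a smoke test that runs right after `docker run`.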

mirkoperillo commented 6 years ago

I can take @breningham's configuration (his version of the Dockerfile) and commit it to the master branch of the repo. I don't currently have the machine resources to test it, so I will assume that it works.

mirkoperillo commented 6 years ago

I pushed @breningham's patch to the repo.