adobe / NLP-Cube

Natural Language Processing Pipeline - Sentence Splitting, Tokenization, Lemmatization, Part-of-speech Tagging and Dependency Parsing
http://opensource.adobe.com/NLP-Cube/index.html
Apache License 2.0
550 stars, 93 forks

Getting stuck at "Configuring tzdata" #119

Status: Closed (closed by Spaskich 3 years ago)

Spaskich commented 3 years ago

Describe the bug: I'm trying to build a Docker image and use NLP-Cube as a web service, but the build gets stuck at the "Configuring tzdata" phase, where I have to choose my geographic area. The console is unresponsive, so I can't select any of the given options. I've tried the whole process on multiple Windows machines running Docker for Windows and I always get the same result.

To reproduce:

  1. Clone the repository
  2. Go to 'repo-dir/docker'
  3. Execute docker build --tag nlp-cube:1.0 .
  4. Wait until it reaches the 'Configuring tzdata' phase

Expected behavior: I want to run NLP-Cube in a Docker container and use it as a web API for sentence splitting, tokenization, lemmatization, etc. According to the documentation, I should be able to do this by starting the server and accessing container:port/nlp?lang=en&text=test.

tiberiu44 commented 3 years ago

This seems to be an issue with the Docker image. I'm currently away from my laptop and cannot fix it. However, I think it can be easily solved by adding ENV DEBIAN_FRONTEND noninteractive as the second line of the Dockerfile. Could you please check if this works?
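
A minimal sketch of that change, assuming the Dockerfile starts from the ubuntu base image (as the Dockerfile posted later in this thread does); everything after these lines stays as it is:

```dockerfile
FROM ubuntu
# Tell debconf not to prompt, so packages such as tzdata are configured
# with their defaults instead of waiting for keyboard input during the build.
ENV DEBIAN_FRONTEND=noninteractive
```

ENV keeps the variable set in the running container as well. Declaring it with ARG DEBIAN_FRONTEND=noninteractive instead would limit it to build time, which may be preferable if the container is ever used interactively.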

Spaskich commented 3 years ago

Yes, this helped; however, there is another problem further down the line. At the 'Cloning into dynet' phase the build fails with a 404 error because the eigen repository is no longer on Bitbucket. I tried changing it to the official repository on GitLab, but then it says: Cloning into 'dynet'... abort: HTTP Error 302: The HTTP server returned a redirect error that would lead to an infinite loop. The last 30x error message was: Found

I'm guessing the changeset -r b2e267d needs to be changed too, but I can't seem to find the equivalent on GitLab.

tiberiu44 commented 3 years ago

Sorry for the delayed response. You could try using the pip dynet package: just replace the whole DyNet installation with RUN pip install dynet. I can't guarantee that the package includes support for Intel's MKL, but it will at least allow you to build the Docker image.

Spaskich commented 3 years ago

Thank you very much for your help! I managed to build and run it successfully. I added RUN pip3 install dynet and also had to install Flask and bs4. Here is the final version of my Dockerfile, in case somebody needs it for future reference:

FROM ubuntu
ENV DEBIAN_FRONTEND noninteractive

# Installing build dependencies
RUN apt-get update && apt-get install -y build-essential automake make cmake g++ wget git mercurial python3-pip curl

# Preparing Python build environment
RUN pip3 install cython future scipy nltk requests xmltodict nose2

# Installing MKL library
RUN wget https://apt.repos.intel.com/intel-gpg-keys/GPG-PUB-KEY-INTEL-SW-PRODUCTS-2019.PUB && \
    apt-key add GPG-PUB-KEY-INTEL-SW-PRODUCTS-2019.PUB && \
    wget https://apt.repos.intel.com/setup/intelproducts.list -O /etc/apt/sources.list.d/intelproducts.list && \
    apt-get update && \
    apt-get install -y intel-mkl-64bit-2018.2-046

# Installing DyNET
RUN pip3 install dynet

# Prepare environment UTF-8
RUN apt-get update && DEBIAN_FRONTEND=noninteractive apt-get install -y locales
RUN sed -i -e 's/# en_US.UTF-8 UTF-8/en_US.UTF-8 UTF-8/' /etc/locale.gen && \
    locale-gen
ENV LANG en_US.UTF-8
ENV LANGUAGE en_US:en
ENV LC_ALL en_US.UTF-8

# Prepare cube

RUN mkdir /work && \
    cd /work && \
    git clone https://github.com/adobe/NLP-Cube.git

# Prepare notebook and web server dependencies
RUN pip3 install jupyter
RUN pip3 install Flask
RUN pip3 install bs4

# Start the web server
CMD cd /work/NLP-Cube/cube/ && python3 webserver.py --port 8080 --lang=en --lang=fr --lang=de --lang=sk

tiberiu44 commented 3 years ago

@Spaskich - glad to hear you got it working, and thank you for sharing your solution. If it's not too much trouble, could you open a pull request with the fix? We can do the copy-paste from our side, but it's your contribution, and if we do it instead of you, it will not be reflected in the Git history.

Thanks again, Tibi