microsoft / rat-sql

A relation-aware semantic parsing model from English to SQL
https://arxiv.org/abs/1911.04942
MIT License

Docker error #23

Closed Akshaysharma29 closed 3 years ago

Akshaysharma29 commented 3 years ago

I successfully built the Docker image the first time, but after deleting the image and rebuilding it, the build got stuck at:

Step 7/14 : RUN python -c "from transformers import BertModel; BertModel.from_pretrained('bert-large-uncased-whole-word-masking')"
 ---> Running in deca10b47ffe
To use data.metrics please install scikit-learn. See https://scikit-learn.org/stable/index.html

and after some time it gives the error below:

OSError: Couldn't reach server at 'https://s3.amazonaws.com/models.huggingface.co/bert/bert-large-uncased-whole-word-masking-pytorch_model.bin' to download pretrained weights.

Can someone suggest how to solve this issue?

Edit

To remove the Docker image I used the following command: sudo docker image prune -a
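
If the server cannot be reached during the build, one possible workaround (a sketch, not tested here) is to pre-download the weights on the host and copy the resulting cache into the image; from_pretrained accepts a cache_dir argument for this, and the ./bert-cache directory name below is just a placeholder:

# Sketch: run this on the host to pre-download the BERT weights into a local
# directory, so the Docker build no longer needs to reach s3.amazonaws.com.
# "./bert-cache" is a placeholder path; COPY it into the image and pass the
# same cache_dir when loading the model inside the container.
from transformers import BertModel, BertTokenizer

CACHE_DIR = "./bert-cache"  # placeholder; any writable directory works
BertModel.from_pretrained("bert-large-uncased-whole-word-masking", cache_dir=CACHE_DIR)
BertTokenizer.from_pretrained("bert-large-uncased-whole-word-masking", cache_dir=CACHE_DIR)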

DevanshChoubey commented 3 years ago

Hi @Akshaysharma29

Could you try loading BERT from your local Python interpreter and check whether it downloads?

Akshaysharma29 commented 3 years ago

Hi @DevanshChoubey, yes, I have checked that and it works. The command I used is python -c "from transformers import BertModel; BertModel.from_pretrained('bert-large-uncased-whole-word-masking')"

Command-line output:

Downloading: 100%|█████████████████████████████████████████████████████████████████████████| 434/434 [00:00<00:00, 433kB/s]
Downloading: 100%|████████████████████████████████████████████████████████████████████| 1.35G/1.35G [01:37<00:00, 13.8MB/s]

Akshaysharma29 commented 3 years ago

I don't know how, but a small change worked for me: moving the "# Cache the pretrained BERT model" step after the "# Download & cache StanfordNLP" step.

FROM pytorch/pytorch:1.5-cuda10.1-cudnn7-devel

ENV LC_ALL=C.UTF-8 \
    LANG=C.UTF-8

RUN mkdir -p /usr/share/man/man1 && \
    apt-get update && apt-get install -y \
    build-essential \
    cifs-utils \
    curl \
    default-jdk \
    dialog \
    dos2unix \
    git \
    sudo

# Install app requirements first to avoid invalidating the cache
COPY requirements.txt setup.py /app/
WORKDIR /app
RUN pip install --user -r requirements.txt --no-warn-script-location && \
    pip install --user entmax && \
    python -c "import nltk; nltk.download('stopwords'); nltk.download('punkt')"

# Cache the pretrained BERT model
#RUN python -c "from transformers import BertModel; BertModel.from_pretrained('bert-large-uncased-whole-word-masking')"

# Download & cache StanfordNLP
RUN mkdir -p /app/third_party && \
    cd /app/third_party && \
    curl https://nlp.stanford.edu/software/stanford-corenlp-full-2018-10-05.zip | jar xv

# Cache the pretrained BERT model
RUN python -c "from transformers import BertModel; BertModel.from_pretrained('bert-large-uncased-whole-word-masking')"

# Now copy the rest of the app
COPY . /app/

# Assume that the datasets will be mounted as a volume into /mnt/data on startup.
# Symlink the data subdirectory to that volume.
ENV CACHE_DIR=/mnt/data
RUN mkdir -p /mnt/data && \
    mkdir -p /app/data && \
    cd /app/data && \
    ln -snf /mnt/data/spider spider && \
    ln -snf /mnt/data/wikisql wikisql

# Convert all shell scripts to Unix line endings, if any
RUN /bin/bash -c 'if compgen -G "/app/**/*.sh" > /dev/null; then dos2unix /app/**/*.sh; fi'

# Extend PYTHONPATH to load WikiSQL dependencies
ENV PYTHONPATH="/app/third_party/wikisql/:${PYTHONPATH}" 

ENTRYPOINT bash
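
For reference, a typical build-and-run sequence for this Dockerfile would look like the following (the ratsql tag and the host dataset path are placeholders; /mnt/data is the mount point the Dockerfile itself assumes for the datasets):

docker build -t ratsql .
docker run -it -v /path/to/datasets:/mnt/data ratsql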