JaidedAI / EasyOCR

Ready-to-use OCR with 80+ supported languages and all popular writing scripts including Latin, Chinese, Arabic, Devanagari, Cyrillic, etc.
https://www.jaided.ai
Apache License 2.0
23.96k stars 3.13k forks

What is the recommended Python version for the 1.1.8 version? #250

Closed. rodrigopinto closed this issue 4 years ago

rodrigopinto commented 4 years ago

Hi there, I am writing a Dockerfile to run the project, and installation keeps failing on the library's dependencies: pytorch, torchvision and, most recently, numpy.

I have tried Python 2 and 3 without success. Which Python version do you target for running the library and its dependencies?

Example of Dockerfile:

FROM python:3.8-alpine

# dependencies for pytorch, torchvision, numpy
RUN apk --update-cache add \
    cmake \
    gcc \
    gfortran \
    build-base \
    freetype-dev \
    libpng-dev \
    openblas-dev

RUN pip install easyocr

WORKDIR /srv

# copy the project into the workdir (/srv)
COPY . .

# the alpine base image does not ship bash, so use sh
CMD ["/bin/sh"]
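A note on the Alpine attempt: python:3.8-alpine is built on musl libc rather than glibc, and as far as I know the prebuilt manylinux wheels for packages such as numpy and torch target glibc. That would explain why pip ends up compiling from source there. A minimal sketch for recording what the interpreter inside a container reports, using only the standard library; the function name describe_environment is hypothetical, and the libc fields may come back as "unknown" on some images.

```python
import platform
import sys


def describe_environment() -> dict:
    """Collect the interpreter and platform details that affect wheel selection."""
    # libc_ver() returns empty strings when it cannot identify the C library
    libc_name, libc_version = platform.libc_ver()
    return {
        "python": "{}.{}.{}".format(*sys.version_info[:3]),
        "implementation": platform.python_implementation(),
        "machine": platform.machine(),
        "libc": libc_name or "unknown",
        "libc_version": libc_version or "unknown",
    }


if __name__ == "__main__":
    for key, value in describe_environment().items():
        print(f"{key}: {value}")
```

Running this inside each candidate base image shows at a glance which Python and C library the install is being attempted against.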

[UPDATE]

I also tried the same Dockerfile used in this project, to avoid issues with the dependencies and native libs. The last step, which checks that the import raises no errors, fails; see below:

Step 8/8 : RUN python -c "import easyocr; reader = easyocr.Reader(${language_models}, gpu=False)"
 ---> Running in 02b717f7d641
Traceback (most recent call last):
  File "<string>", line 1, in <module>
  File "/home/EasyOCR/easyocr/__init__.py", line 1, in <module>
    from .easyocr import Reader
  File "/home/EasyOCR/easyocr/easyocr.py", line 3, in <module>
    from .detection import get_detector, get_textbox
  File "/home/EasyOCR/easyocr/detection.py", line 7, in <module>
    import cv2
  File "/opt/conda/lib/python3.7/site-packages/cv2/__init__.py", line 5, in <module>
    from .cv2 import *
ImportError: libGL.so.1: cannot open shared object file: No such file or directory
ERROR: Service 'ocr' failed to build: The command '/bin/sh -c python -c "import easyocr; reader = easyocr.Reader(${language_models}, gpu=False)"' returned a non-zero code: 1

Am I missing anything?
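The ImportError in the traceback says the dynamic loader could not open libGL.so.1 while importing cv2. A minimal sketch for checking that directly, without going through cv2, assuming only the standard library; the helper name can_load is hypothetical.

```python
import ctypes


def can_load(soname: str) -> bool:
    """Return True if the dynamic loader can open the named shared library."""
    try:
        ctypes.CDLL(soname)
    except OSError:
        # raised when the library is missing or cannot be loaded
        return False
    return True


if __name__ == "__main__":
    name = "libGL.so.1"
    print(name, "OK" if can_load(name) else "MISSING")
```

If this prints MISSING inside the container, the fix belongs in the system package list rather than in the Python dependencies.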

rodrigopinto commented 4 years ago

Solved by adding libglu1-mesa-dev to the list of installed dependencies.

Dockerfile looks like this now:

FROM pytorch/pytorch

# if you forked EasyOCR, you can pass in your own GitHub username to use your fork
# i.e. gh_username=myname
ARG gh_username=JaidedAI
ARG language_models="['pt','en']"
ARG service_home="/home/EasyOCR"

# Configure apt and install packages
RUN apt-get update -y && \
    apt-get install -y \
    libglib2.0-0 \
    libsm6 \
    libxext6 \
    libxrender-dev \
    git \
    # this line added
    libglu1-mesa-dev \
    # cleanup
    && apt-get autoremove -y \
    && apt-get clean -y \
    && rm -rf /var/lib/apt/li

# Clone EasyOCR repo
RUN mkdir "$service_home" \
    && git clone "https://github.com/$gh_username/EasyOCR.git" "$service_home" \
    && cd "$service_home" \
    && git remote add upstream "https://github.com/JaidedAI/EasyOCR.git" \
    && git pull upstream master

# Build C extensions and install EasyOCR in editable mode
RUN cd "$service_home" \
    && python setup.py build_ext --inplace -j 4 \
    && python -m pip install -e .

# Download the models into the container; they are stored in the ~/.EasyOCR/model directory
# >> Also implicitly checks no errors on import
RUN python -c "import easyocr; reader = easyocr.Reader(${language_models}, gpu=False)"

rodrigopinto commented 4 years ago

I simplified the Dockerfile to build an image of easyocr based on the stable release of the library. It would make sense to publish it on Docker Hub so anyone can start working without friction. Wdyt? /cc @rkcosmos @ghandic

FROM pytorch/pytorch

# Configure apt and install package dependencies
RUN apt-get update -y && \
    apt-get install -y \
    libglib2.0-0 \
    libsm6 \
    libxext6 \
    libxrender-dev \
    git \
    libglu1-mesa-dev \
    # cleanup
    && apt-get autoremove -y \
    && apt-get clean -y \
    && rm -rf /var/lib/apt/li

# Install the latest stable version of easyocr
RUN pip install easyocr

# Explicitly set the same workdir used by the pytorch image
WORKDIR /workspace

# Check that the import raises no errors, which verifies that everything was installed properly
RUN python -c "import easyocr"

ghandic commented 4 years ago

It is published on Docker Hub under challisa/easyocr. It would be better for the repo owner to publish it, as I cannot set up automatic rebuilds on tags, pushes to master, etc.

CI/CD is a must at some point in the roadmap to ensure stability and stop these kinds of issues from arising.
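One way a CI job could catch this class of failure is an import smoke test, along the same lines as the final RUN step in the Dockerfiles above. A minimal sketch, assuming nothing about the project's actual CI setup; the function name smoke_test is hypothetical.

```python
import importlib


def smoke_test(module_names):
    """Try to import each module; map each name to None or an error message."""
    results = {}
    for name in module_names:
        try:
            importlib.import_module(name)
            results[name] = None
        except ImportError as exc:
            # covers missing modules and missing shared libraries such as libGL.so.1
            results[name] = f"{type(exc).__name__}: {exc}"
    return results


if __name__ == "__main__":
    # json is in the standard library; swap in the modules you care about
    for name, error in smoke_test(["json"]).items():
        print(name, "OK" if error is None else error)
```

Inside the built image, the list would be something like ["cv2", "torch", "easyocr"], and the job would fail if any entry reports an error.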