ocropus-archive / DUP-ocropy

Python-based tools for document analysis and OCR
Apache License 2.0

AssertionError: you must install and use OCRopus with Python version 2.7 or later, but not Python 3.x #334

Open hiyamgh opened 4 years ago

hiyamgh commented 4 years ago

I used a Python 2.7 virtual environment to install requirements.txt, but I got the following:

DEPRECATION: Python 2.7 reached the end of its life on January 1st, 2020. Please upgrade your Python as Python 2.7 is no longer maintained. pip 21.0 will drop support for Python 2.7 in January 2021. More details about Python 2 support in pip, can be found at https://pip.pypa.io/en/latest/development/release-process/#python-2-support
C:\Users\User\venv\ocropus\lib\site-packages\pip\_vendor\urllib3\util\ssl_.py:380: SNIMissingWarning: An HTTPS request has been made, but the SNI (Server Name Indication) extension to TLS is not available on this platform. This may cause the server to present an incorrect TLS certificate, which can cause validation failures. You can upgrade to a newer version of Python to solve this. For more information, see https://urllib3.readthedocs.io/en/latest/advanced-usage.html#ssl-warnings
  SNIMissingWarning,
C:\Users\User\venv\ocropus\lib\site-packages\pip\_vendor\urllib3\util\ssl_.py:139: InsecurePlatformWarning: A true SSLContext object is not available. This prevents urllib3 from configuring SSL appropriately and may cause certain SSL connections to fail. You can upgrade to a newer version of Python to solve this. For more information, see https://urllib3.readthedocs.io/en/latest/advanced-usage.html#ssl-warnings
  InsecurePlatformWarning,
WARNING: Retrying (Retry(total=4, connect=None, read=None, redirect=None, status=None)) after connection broken by 'SSLError(SSLError(1, '_ssl.c:499: error:1407742E:SSL routines:SSL23_GET_SERVER_HELLO:tlsv1 alert protocol version'),)': /simple/numpy/
C:\Users\User\venv\ocropus\lib\site-packages\pip\_vendor\urllib3\util\ssl_.py:139: InsecurePlatformWarning: A true SSLContext object is not available. This prevents urllib3 from configuring SSL appropriately and may cause certain SSL connections to fail. You can upgrade to a newer version of Python to solve this. For more information, see https://urllib3.readthedocs.io/en/latest/advanced-usage.html#ssl-warnings
  InsecurePlatformWarning,
WARNING: Retrying (Retry(total=3, connect=None, read=None, redirect=None, status=None)) after connection broken by 'SSLError(SSLError(1, '_ssl.c:499: error:1407742E:SSL routines:SSL23_GET_SERVER_HELLO:tlsv1 alert protocol version'),)': /simple/numpy/
C:\Users\User\venv\ocropus\lib\site-packages\pip\_vendor\urllib3\util\ssl_.py:139: InsecurePlatformWarning: A true SSLContext object is not available. This prevents urllib3 from configuring SSL appropriately and may cause certain SSL connections to fail. You can upgrade to a newer version of Python to solve this. For more information, see https://urllib3.readthedocs.io/en/latest/advanced-usage.html#ssl-warnings
  InsecurePlatformWarning,
WARNING: Retrying (Retry(total=2, connect=None, read=None, redirect=None, status=None)) after connection broken by 'SSLError(SSLError(1, '_ssl.c:499: error:1407742E:SSL routines:SSL23_GET_SERVER_HELLO:tlsv1 alert protocol version'),)': /simple/numpy/
C:\Users\User\venv\ocropus\lib\site-packages\pip\_vendor\urllib3\util\ssl_.py:139: InsecurePlatformWarning: A true SSLContext object is not available. This prevents urllib3 from configuring SSL appropriately and may cause certain SSL connections to fail. You can upgrade to a newer version of Python to solve this. For more information, see https://urllib3.readthedocs.io/en/latest/advanced-usage.html#ssl-warnings
  InsecurePlatformWarning,
WARNING: Retrying (Retry(total=1, connect=None, read=None, redirect=None, status=None)) after connection broken by 'SSLError(SSLError(1, '_ssl.c:499: error:1407742E:SSL routines:SSL23_GET_SERVER_HELLO:tlsv1 alert protocol version'),)': /simple/numpy/
C:\Users\User\venv\ocropus\lib\site-packages\pip\_vendor\urllib3\util\ssl_.py:139: InsecurePlatformWarning: A true SSLContext object is not available. This prevents urllib3 from configuring SSL appropriately and may cause certain SSL connections to fail. You can upgrade to a newer version of Python to solve this. For more information, see https://urllib3.readthedocs.io/en/latest/advanced-usage.html#ssl-warnings
  InsecurePlatformWarning,
WARNING: Retrying (Retry(total=0, connect=None, read=None, redirect=None, status=None)) after connection broken by 'SSLError(SSLError(1, '_ssl.c:499: error:1407742E:SSL routines:SSL23_GET_SERVER_HELLO:tlsv1 alert protocol version'),)': /simple/numpy/
Could not fetch URL https://pypi.org/simple/numpy/: There was a problem confirming the ssl certificate: HTTPSConnectionPool(host='pypi.org', port=443): Max retries exceeded with url: /simple/numpy/ (Caused by SSLError(SSLError(1, '_ssl.c:499: error:1407742E:SSL routines:SSL23_GET_SERVER_HELLO:tlsv1 alert protocol version'),)) - skipping
C:\Users\User\venv\ocropus\lib\site-packages\pip\_vendor\urllib3\util\ssl_.py:139: InsecurePlatformWarning: A true SSLContext object is not available. This prevents urllib3 from configuring SSL appropriately and may cause certain SSL connections to fail. You can upgrade to a newer version of Python to solve this. For more information, see https://urllib3.readthedocs.io/en/latest/advanced-usage.html#ssl-warnings
  InsecurePlatformWarning,
ERROR: Could not find a version that satisfies the requirement numpy (from versions: none)
ERROR: No matching distribution found for numpy
Could not fetch URL https://pypi.org/simple/pip/: There was a problem confirming the ssl certificate: HTTPSConnectionPool(host='pypi.org', port=443): Max retries exceeded with url: /simple/pip/ (Caused by SSLError(SSLError(1, '_ssl.c:499: error:1407742E:SSL routines:SSL23_GET_SERVER_HELLO:tlsv1 alert protocol version'),)) - skipping
C:\Users\User\venv\ocropus\lib\site-packages\pip\_vendor\urllib3\util\ssl_.py:139: InsecurePlatformWarning: A true SSLContext object is not available. This prevents urllib3 from configuring SSL appropriately and may cause certain SSL connections to fail. You can upgrade to a newer version of Python to solve this. For more information, see https://urllib3.readthedocs.io/en/latest/advanced-usage.html#ssl-warnings
  InsecurePlatformWarning,
(ocropus)

So I ended up using Python 3.6.

With Python 3.6 I was able to install requirements.txt, but when I now try to install via setup.py I get the following error:

AssertionError: you must install and use OCRopus with Python version 2.7 or later, but not Python 3.x

So how can I use Python 2.7 when I'm not able to install requirements.txt for it?
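
For context, an error like the one above is what a version guard produces when setup.py is run under Python 3. The following is only a minimal sketch of such a guard, not a copy of ocropy's actual setup.py: the function name `check_python` and the constant `MESSAGE` are hypothetical, and only the message text is taken from the error quoted above.

```python
import sys

# Message text copied from the error in this issue; everything else is illustrative.
MESSAGE = ("you must install and use OCRopus with Python version 2.7 "
           "or later, but not Python 3.x")


def check_python(version_info=None):
    """Raise AssertionError unless the interpreter is Python 2.7 or a later 2.x."""
    if version_info is None:
        version_info = sys.version_info
    major, minor = version_info[0], version_info[1]
    if not (major == 2 and minor >= 7):
        raise AssertionError(MESSAGE)


if __name__ == "__main__":
    try:
        check_python()
        print("interpreter accepted")
    except AssertionError as exc:
        print("interpreter rejected: %s" % exc)
```

With a guard of this kind, installing the dependencies under Python 3.6 does not help: the check fails as soon as setup.py starts.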

amitdo commented 4 years ago

The problem is that ocropy needs Python 2.7, but the newest versions of numpy and scipy have dropped Python 2.7 support.

@zuphilip,

Maybe you can add the needed changes for 3.6+ support from: https://github.com/kba/ocropy/commits/master

hiyamgh commented 4 years ago

@amitdo So what I need to do now is download older versions of numpy and scipy for it to work?

amitdo commented 4 years ago

I don't know. I don't think it's a good idea to use Python 2.7 anymore.

Let's wait for @zuphilip's response.

hiyamgh commented 4 years ago

For anyone who stumbles upon this in the future: as @amitdo said, ocropy needs Python 2.7, but the newest versions of numpy and scipy have dropped Python 2.7 support. You can still use older versions of them, but the developers will no longer provide bug fixes for those versions.
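
As a rough illustration of "using older versions": the NumPy 1.16.x and SciPy 1.2.x release series are the last ones that support Python 2.7, so an upper bound just above them keeps pip on compatible releases. The sketch below only builds the pip command and does not run it. The helper name `pip_install_command` is hypothetical, and it covers numpy and scipy only; other entries in requirements.txt may need similar bounds.

```python
import sys

# Upper bounds just above the last release series that still support Python 2.7.
# Only numpy and scipy are covered; other requirements may need similar pins.
PY2_PINS = ["numpy<1.17", "scipy<1.3"]
UNPINNED = ["numpy", "scipy"]


def pip_install_command(version_info=None, executable=None):
    """Build (but do not run) a pip command suited to the given interpreter."""
    if version_info is None:
        version_info = sys.version_info
    if executable is None:
        executable = sys.executable
    packages = PY2_PINS if version_info[0] == 2 else UNPINNED
    return [executable, "-m", "pip", "install"] + packages


if __name__ == "__main__":
    # Prints the command for the interpreter that runs this script.
    print(" ".join(pip_install_command()))
```

When typing such a command by hand, put the specifiers in quotes (for example "numpy<1.17"), since the shell would otherwise treat `<` as a redirection. A reasonably recent pip on Python 2.7 should already skip releases that declare themselves Python 3 only, so the explicit bounds are mostly a safeguard.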

My problem was mainly that the global version of Python on my system (Windows) is 3.6.7, while ocropus requires 2.7. I had downloaded a broken executable for Python 2.7 that did not install the Scripts folder, so pip ended up not working at all.

I followed this tutorial, which shows how to install another version of Python (2.7), and the problem was solved when I downloaded the following executable

To use virtualenv properly with Python 2.7, use the following command:
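
A side note on the log in the first comment: SSL errors of that kind are what pip typically prints when the interpreter's ssl module is too old. PyPI only accepts TLS 1.2 or newer, and urllib3 documents the SNIMissingWarning as appearing on Python 2 versions older than 2.7.9. The sketch below is a small diagnostic that can be run with the Python 2.7 interpreter in question; the function name `tls_report` is hypothetical, and it only reports what the ssl module offers without contacting any server.

```python
import ssl
import sys


def tls_report():
    """Collect the ssl features pip relies on for HTTPS connections."""
    return {
        "python": "%d.%d.%d" % sys.version_info[:3],
        "openssl": getattr(ssl, "OPENSSL_VERSION", "unknown"),
        # SSLContext, SNI and TLS 1.2 are missing from Python 2.7 before 2.7.9.
        "has_sslcontext": hasattr(ssl, "SSLContext"),
        "has_sni": bool(getattr(ssl, "HAS_SNI", False)),
        "has_tls12": hasattr(ssl, "PROTOCOL_TLSv1_2"),
    }


if __name__ == "__main__":
    for key, value in sorted(tls_report().items()):
        print("%s: %s" % (key, value))
```

If any of the three `has_` entries is False, the interpreter is probably too old to talk to PyPI, and installing a newer 2.7.x release is likely to be a more reliable fix than retrying pip.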

# don't forget to include python.exe in the interpreter path
virtualenv -p C:\Python27\python.exe venv/ocropus

I will not close this issue for now; I'll wait for @zuphilip to see whether support for newer Python versions will be added, because that would be the better solution.

Good luck.

michaelsjackson commented 4 years ago

Or, on Ubuntu 18.04, install tesseract and then gImageReader, which has a nice GUI. One day, when this works with Python 3.x, I can test it again. Thanks, and have fun reading.

kba commented 4 years ago

As for Python 3.x support: There have been a few efforts, including a PR from last year that wasn't ultimately merged.

The best ocropy variant is currently @bertsky's fork, which is part of @cisocrgroup's ocrd_cis. This fork incorporates not only Python 3 compatibility but also various other improvements. This is the version that I want to merge into upstream as soon as time and the projects' progress permit.

michaelsjackson commented 4 years ago

Thanks for the hint. What would be its advantages compared to tesseract + gImageReader, for example? Do you have a few interesting cases? Thanks in advance.