ocropus-archive / ocropus4-old


Update float type assertion in utils.py #4

Open hamima-kensho opened 3 years ago

hamima-kensho commented 3 years ago

Issue

On my machine, floating-point image arrays have dtype np.float32, not float. This fails the dtype checks in the function autoinvert (ocropus/utils.py, line 194). I'm using NumPy 1.20.3 (more package info below), and adding a type check of image.dtype == np.float32 to the first case seems to yield the proper results. I'm on a Mac with Python 3.8.8, and am otherwise using the packages as specified in ./run venv and pip install -r requirements.txt.
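The mismatch described above can be reproduced in isolation. This is a minimal sketch that does not use the ocropus code itself; the array names are made up for illustration. It relies only on the fact that NumPy interprets the builtin float as float64 when comparing dtypes.

```python
import numpy as np

# Hypothetical stand-ins for images loaded with different float precisions.
image32 = np.zeros((2, 2), dtype=np.float32)
image64 = np.zeros((2, 2), dtype=np.float64)

# The builtin float corresponds to NumPy's float64, so only the
# float64 array passes an equality check against float.
matches_float32 = image32.dtype == float
matches_float64 = image64.dtype == float

print(matches_float32, matches_float64)
```

A check written as image.dtype == float would therefore skip the float32 array and fall through to whatever branch comes next.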

Suggested solution

I don't have an ideal solution here. image.dtype == float might be sufficient on other systems, so we might not want to remove it. Changing the check to if (image.dtype == float) or (image.dtype == np.float32): is clunky but a potentially easy patch. Pinning the package versions for which float is always the expected dtype would be a more robust solution.
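One possible alternative to listing each float dtype by hand is NumPy's np.issubdtype, which accepts any floating-point precision. The helper below is a hypothetical sketch, not the actual autoinvert code; the name is_float_image is an assumption introduced only for illustration.

```python
import numpy as np


def is_float_image(image):
    """Return True if the array holds any NumPy floating-point dtype.

    Hypothetical helper: covers float16, float32, and float64 without
    comparing against each dtype separately.
    """
    return bool(np.issubdtype(image.dtype, np.floating))


# Usage: float arrays of any precision pass, integer arrays do not.
float32_ok = is_float_image(np.zeros((2, 2), dtype=np.float32))
float64_ok = is_float_image(np.zeros((2, 2), dtype=np.float64))
uint8_ok = is_float_image(np.zeros((2, 2), dtype=np.uint8))
```

Whether this fits depends on what the first branch of autoinvert assumes about the value range of float images, which a dtype check alone does not verify.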

Anyway, if I'm using this code wrong (always a possibility), I'm fine with this issue being closed.

My package versions:

(venv) (base) ➜  g1000test git:(main) ✗ pip freeze 
ansiwrap==0.8.4
anyio==3.1.0
appdirs==1.4.4
appnope==0.1.2
argon2-cffi==20.1.0
async-generator==1.10
attrs==21.2.0
autopep8==1.5.7
Babel==2.9.1
backcall==0.2.0
bash-kernel==0.7.2
beautifulsoup4==4.9.3
black==21.5b1
bleach==3.3.0
braceexpand==0.1.7
bs4==0.0.1
certifi==2020.12.5
cffi==1.14.5
chardet==4.0.0
click==7.1.1
coverage==5.5
cycler==0.10.0
decorator==4.4.2
defusedxml==0.7.1
editdistance==0.5.3
entrypoints==0.3
fasteners==0.16
future==0.18.2
greenlet==1.1.0
humanhash3==0.0.6
idna==2.10
imageio==2.9.0
iniconfig==1.1.1
ipykernel==5.5.5
ipython==7.23.1
ipython-genutils==0.2.0
isort==5.8.0
jedi==0.18.0
Jinja2==3.0.1
json5==0.9.5
jsonschema==3.2.0
jupyter-client==6.1.12
jupyter-core==4.7.1
jupyter-server==1.8.0
jupyterlab==3.0.16
jupyterlab-pygments==0.1.2
jupyterlab-server==2.5.2
jupyterlab-sos==0.8.1
kiwisolver==1.3.1
lxml==4.6.3
MarkupSafe==2.0.1
matplotlib==3.4.2
matplotlib-inline==0.1.2
mistune==0.8.4
msgpack==1.0.2
mypy==0.812
mypy-extensions==0.4.3
nbclassic==0.3.1
nbclient==0.5.3
nbconvert==6.0.7
nbformat==5.1.3
neovim==0.3.1
nest-asyncio==1.5.1
networkx==2.5.1
notebook==6.4.0
numpy==1.20.3
-e git://github.com/NVlabs/ocrodeg.git@21109cb4ea0ff90306658e904a3a7b36c1e4f6b7#egg=ocrodeg
packaging==20.9
pandas==1.2.4
pandocfilters==1.4.3
papermill==2.3.3
parso==0.8.2
pathspec==0.8.1
pexpect==4.8.0
pickleshare==0.7.5
Pillow==8.2.0
pluggy==0.13.1
prometheus-client==0.10.1
prompt-toolkit==3.0.18
psutil==5.8.0
ptyprocess==0.7.0
py==1.10.0
pycodestyle==2.7.0
pycparser==2.20
pydocstyle==6.1.1
pydot==1.4.2
pydotplus==2.0.2
Pygments==2.9.0
pynvim==0.4.3
pyparsing==2.4.7
pyrsistent==0.17.3
pytest==6.2.4
pytest-cov==2.12.0
python-dateutil==2.8.1
pytz==2021.1
PyWavelets==1.1.1
PyYAML==5.4.1
pyzmq==22.0.3
regex==2021.4.4
requests==2.25.1
scikit-image==0.18.1
scipy==1.6.3
Send2Trash==1.5.0
simplejson==3.17.2
six==1.16.0
sniffio==1.2.0
snowballstemmer==2.1.0
sos==0.22.5
sos-notebook==0.22.4
sos-papermill==0.2.1
sos-python==0.18.4
soupsieve==2.2.1
tabulate==0.8.9
-e git://github.com/tmbdev/tarproc.git@b233aee5ce654f970ba301bcfdc24147bc4eebc1#egg=tarproc
tenacity==7.0.0
-e git://github.com/NVlabs/tensorcom.git@52fc7c3a4e71e1ab2b6508be4c133cd1ac8a50cd#egg=tensorcom
terminado==0.10.0
testpath==0.5.0
textwrap3==0.9.2
tifffile==2021.4.8
tk==0.1.0
toml==0.10.2
torch==1.8.1
-e git://github.com/tmbdev/torchmore.git@395d9b34a8d4251863fc83f25b9ac4616195d1bc#egg=torchmore
torchvision==0.9.1
tornado==6.1
tqdm==4.61.0
traitlets==5.0.5
-e git://github.com/vatlab/transient-display-data@52047ace8be7f4e073427b023fc40886b932dfef#egg=transient_display_data
typed-ast==1.4.3
typer==0.3.2
typing-extensions==3.10.0.0
urllib3==1.26.4
wcwidth==0.2.5
-e git://github.com/tmbdev/webdataset.git@315977952b74a87848983518c64c9ad43e66c71f#egg=webdataset
webencodings==0.5.1
websocket-client==1.0.1
xxhash==2.0.2

tmbdev commented 3 years ago

Float and np.float32 have been unified in the latest versions of NumPy; maybe you can update?