ocr-d-modul-2-segmentierung / page-segmentation

Pixel classifier for historic prints. Mirror of https://gitlab2.informatik.uni-wuerzburg.de/ls6/ocr4all-pixel-classifier. Report issues here.

TF setup broken #4

Closed. bertsky closed this issue 3 years ago.

bertsky commented 3 years ago

The version currently published on PyPI (0.6.5) pulls in Tensorflow 2.5, which conflicts with the allowed range in the setup:

https://github.com/ocr-d-modul-2-segmentierung/page-segmentation/blob/e60677447d7ebe58c49ddb2e8a5ac242574d7c53/setup.py#L18-L21

Thus, pip install ocr4all-pixel-classifier[tf_cpu] downloads and installs tensorflow-2.5.0-cp36-cp36m-manylinux2010_x86_64.whl.

Is this intended? (We need to know for ocrd_all.)
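For reference, the pins the installed wheel actually declares can be checked straight from the distribution metadata. A quick sketch, assuming the PyPI name `ocr4all-pixel-classifier` (as above) and Python 3.8+ for `importlib.metadata`:

```python
# Quick check of the dependency pins the installed wheel declares
# (run inside the environment where the [tf_cpu] extra was installed).
from importlib.metadata import requires, version

print(version("ocr4all-pixel-classifier"))            # e.g. 0.6.5
for req in requires("ocr4all-pixel-classifier") or []:
    if "tensorflow" in req:
        print(req)                                     # shows the specifier pip resolved, incl. extras markers
```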

crater2150 commented 3 years ago

I know that the code works with current Tensorflow versions; I think that those lines are still there because of https://github.com/ocr-d-modul-2-segmentierung/ocrd-pixelclassifier-segmentation/issues/7. @kba, can you confirm whether that problem still exists? If not, I could remove the whole extras_require block, as starting with TF 2.1 there aren't separate packages for CPU/GPU anymore.

bertsky commented 3 years ago

> I know that the code works with current Tensorflow versions

Oh, I see. So your code and models work irrespective of the many breaking changes throughout TF 2.1 - 2.3 - 2.4 - 2.5? (In that case, we'll group the module alongside ocrd_calamari, which also works with most recent TF to my knowledge...)

> I think that those lines are still there because of ocr-d-modul-2-segmentierung/ocrd-pixelclassifier-segmentation#7

Understood. Thanks for explaining!

> I could remove the whole extras_require block, as starting with TF 2.1 there aren't separate packages for CPU/GPU anymore.

Sounds good. Having it as a regular requirement (without the need for a makefile or feature specifier) would also simplify installation.

crater2150 commented 3 years ago

> Oh, I see. So your code and models work irrespective of the many breaking changes throughout TF 2.1 - 2.3 - 2.4 - 2.5? (In that case, we'll group the module alongside ocrd_calamari, which also works with most recent TF to my knowledge...)

I'll check whether the models actually work across versions; I hadn't tried that yet, but training and running with the same version definitely works without problems. The Tensorflow part doesn't use anything mentioned in TF's release notes as a breaking change.
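A minimal smoke test for that cross-version check could look like the sketch below; the model path and input shape are placeholders, not the project's actual files:

```python
# Cross-version smoke test (sketch): load a model trained under an older TF
# in the current environment and run one dummy prediction.
import numpy as np
import tensorflow as tf

model = tf.keras.models.load_model("model.h5", compile=False)   # placeholder path
dummy = np.zeros((1, 256, 256, 1), dtype=np.float32)            # placeholder input shape
print(model.predict(dummy).shape)                                # loading + inference succeeding is the check
```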

crater2150 commented 3 years ago

The models included in the ocrd frontend work fine with TF 2.5.0, so I changed the requirements to allow Tensorflow from 2.0 up to and including 2.5, removed extras_require, and updated the README accordingly.
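In setup.py terms the change amounts to something like the following sketch (inferred from this thread, not copied from the actual commit):

```python
# Sketch of the setup.py after the described change (inferred from the thread,
# not the actual commit): TensorFlow becomes a regular dependency with an
# upper bound, and the separate tf_cpu / tf_gpu extras are dropped.
from setuptools import setup, find_packages

setup(
    name="ocr4all-pixel-classifier",
    packages=find_packages(),
    install_requires=[
        "tensorflow>=2.0.0,<=2.5.0",
        # ... other dependencies unchanged
    ],
    # no extras_require for CPU/GPU anymore: since TF 2.1 a single
    # 'tensorflow' package covers both.
)
```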

bertsky commented 3 years ago

Thank you very much, @crater2150!

bertsky commented 3 years ago

Uhm, could you please update https://github.com/ocr-d-modul-2-segmentierung/ocrd-pixelclassifier-segmentation accordingly (setup.py and Makefile)?

crater2150 commented 3 years ago

Ah right. Should be fixed now.

bertsky commented 3 years ago

Sorry to bring this up again, but are you quite sure you also tried TF 2.5 when you checked compatibility with recent versions? It's quite recent (no GH release yet but already on PyPI) and pulls in h5py 3.1, which is usually backwards incompatible when deserializing older models (because of changes to the string coding).

crater2150 commented 3 years ago

Yes, the virtualenv I tested in has the following versions:

 % pip list | grep 'tensorflow\|h5py'
h5py                              3.1.0
tensorflow                        2.5.0
tensorflow-estimator              2.5.0

crater2150 commented 3 years ago

Looking at the documentation of breaking changes in h5py 3.0, I think the file encoding did not change, only how the library returns strings in its API (UTF-8 strings are now returned as bytes instead of str). We don't use h5py directly, so I'd assume the necessary changes are already handled inside Tensorflow.
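For illustration only (not code from this project): that API change is the classic bytes-vs-str mismatch when reading strings back from an HDF5 file, and a reader that has to run against both h5py 2.x and 3.x can simply normalize. `model.h5` is a placeholder file name; Keras HDF5 models do store a `keras_version` attribute at the file root.

```python
# Illustration only (not code from this project): depending on the h5py major
# version and how the string was stored, values read back from an HDF5 file
# may arrive as bytes or as str. Normalizing covers both cases.
import h5py


def as_text(value):
    """Return str whether h5py handed back bytes or an already-decoded str."""
    return value.decode("utf-8") if isinstance(value, bytes) else value


with h5py.File("model.h5", "r") as f:                 # placeholder file name
    print(as_text(f.attrs["keras_version"]))          # attribute written by Keras when saving
```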

bertsky commented 3 years ago

> Yes, the virtualenv I tested in has the following versions:

Ok, great. Thanks again @crater2150!

> Looking at the documentation of breaking changes in h5py 3.0, I think the file encoding did not change, only how the library returns strings in its API (UTF-8 strings are now returned as bytes instead of str). We don't use h5py directly, so I'd assume the necessary changes are already handled inside Tensorflow.

Oh, I see. That's good to know. So TF and Keras could provide a workaround for this, if they were willing to offer basic backwards compatibility (which, judging by their frequency of breaking changes and the extremely narrow Python / Numpy / CUDA version ranges their releases and prebuilds support, they clearly are not).