Closed bertsky closed 3 years ago
I know that the code works with current Tensorflow versions. I think that those lines are still there because of https://github.com/ocr-d-modul-2-segmentierung/ocrd-pixelclassifier-segmentation/issues/7. @kba can you confirm if that problem still exists?
If not, I could remove the whole extras_require block, as starting with TF 2.1 there aren't separate packages for CPU/GPU anymore.
I know that the code works with current Tensorflow versions
Oh, I see. So your code and models work irrespective of the many breaking changes throughout TF 2.1 - 2.3 - 2.4 - 2.5? (In that case, we'll group the module alongside ocrd_calamari, which also works with most recent TF to my knowledge...)
I think that those lines are still there because of ocr-d-modul-2-segmentierung/ocrd-pixelclassifier-segmentation#7
Understood. Thanks for explaining!
I could remove the whole extras_require block, as starting with TF 2.1 there aren't separate packages for CPU/GPU anymore.
Sounds good. Having it as a regular requirement (without the need for a makefile or feature specifier) would also simplify installation.
Oh, I see. So your code and models work irrespective of the many breaking changes throughout TF 2.1 - 2.3 - 2.4 - 2.5? (In that case, we'll group the module alongside ocrd_calamari, which also works with most recent TF to my knowledge...)
I'll check whether models actually work across versions. I haven't tried that yet, but training and running with the same version definitely works without problems. The Tensorflow part doesn't use anything listed as a breaking change in TF's release notes.
The models included in the ocrd frontend work fine with TF 2.5.0, so I changed the requirements to allow Tensorflow >= 2.0 and <= 2.5, removed extras_require and updated the README accordingly.
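For illustration, here is a minimal sketch of what such a range amounts to. The requirement string and the helper names (`parse_version`, `in_range`) are assumptions made up for this example and are not copied from the project's setup.py. The comparison is a simplified stand-in for pip's real version handling and only understands plain dotted numeric versions.

```python
# Hypothetical requirement string; the actual setup.py may spell it differently.
TENSORFLOW_REQUIREMENT = "tensorflow>=2.0,<=2.5"


def parse_version(text):
    """Turn a plain dotted version like '2.5.0' into a tuple of ints."""
    return tuple(int(part) for part in text.split("."))


def in_range(version, lower="2.0", upper="2.5"):
    """Check lower <= version <= upper, padding with zeros so 2.5 == 2.5.0."""
    parts = [parse_version(v) for v in (lower, version, upper)]
    width = max(len(p) for p in parts)
    low, ver, high = (p + (0,) * (width - len(p)) for p in parts)
    return low <= ver <= high


if __name__ == "__main__":
    for candidate in ("1.15.0", "2.0.0", "2.5.0", "2.5.1"):
        print(candidate, in_range(candidate))
```

Under this simplified comparison, a patch release such as 2.5.1 would fall outside an upper bound written as 2.5, which may be worth keeping in mind when choosing the bound.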
Thank you very much, @crater2150!
Uhm, could you please update https://github.com/ocr-d-modul-2-segmentierung/ocrd-pixelclassifier-segmentation accordingly (setup.py and Makefile)?
Ah, right. Should be fixed now.
Sorry to bring this up again, but are you quite sure you also tried TF 2.5 when you checked compatibility with recent versions? It's quite recent (no GH release yet but already on PyPI) and pulls in h5py 3.1, which is usually backwards incompatible when deserializing older models (because of changes to the string coding).
Yes, the virtualenv I tested in has the following versions:
% pip list | grep 'tensorflow\|h5py'
h5py 3.1.0
tensorflow 2.5.0
tensorflow-estimator 2.5.0
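The same check can be sketched from within Python, which may be handy when pip's output is not available. This is only an illustration: the function name `installed_versions` is hypothetical, and it relies on `importlib.metadata` from the standard library (Python 3.8 or later).

```python
from importlib import metadata


def installed_versions(names):
    """Map each distribution name to its installed version, or None if absent."""
    versions = {}
    for name in names:
        try:
            versions[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            versions[name] = None
    return versions


if __name__ == "__main__":
    for name, version in installed_versions(["tensorflow", "h5py"]).items():
        print(name, version or "not installed")
```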
Looking at the documentation of breaking changes in h5py 3.0, I think that the file encoding did not change, but only how the library returns it in its API (UTF-8 is now returned as bytes instead of str). We don't use h5py directly, so I'd assume the necessary changes are included in Tensorflow.
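To make the bytes versus str point concrete, here is a minimal sketch of the kind of normalization a caller could apply when a stored string may come back as either type. This is not TensorFlow's or h5py's actual code; `as_text` is a hypothetical helper, and the UTF-8 default is an assumption based on the description above.

```python
def as_text(value, encoding="utf-8"):
    """Return a str whether the caller received bytes or str."""
    if isinstance(value, bytes):
        # Newer readers may hand back raw bytes; decode them explicitly.
        return value.decode(encoding)
    # Older readers already returned str; pass it through unchanged.
    return value


if __name__ == "__main__":
    print(as_text(b"model_config"))
    print(as_text("model_config"))
```

Code that passes every string it reads through a helper like this would behave the same regardless of which type it receives.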
Yes, the virtualenv I tested in has the following versions:
Ok, great. Thanks again @crater2150!
Looking at the documentation of breaking changes in h5py 3.0, I think that the file encoding did not change, but only how the library returns it in its API (UTF-8 is now returned as bytes instead of str). We don't use h5py directly, so I'd assume the necessary changes are included in Tensorflow.
Oh, I see. That's good to know. So TF and Keras could make a workaround for this – if they were willing to provide basic backwards compatibility (which unfortunately they are clearly not, judging by their frequency of breaking changes and the extremely narrow Python / Numpy / CUDA version range compatibility of their releases/prebuilds).
The version currently published on PyPI (0.6.5) pulls in Tensorflow 2.5, which conflicts with the allowed range in the setup:
https://github.com/ocr-d-modul-2-segmentierung/page-segmentation/blob/e60677447d7ebe58c49ddb2e8a5ac242574d7c53/setup.py#L18-L21
Thus, pip install ocr4all-pixel-classifier[tf_cpu] downloads and installs tensorflow-2.5.0-cp36-cp36m-manylinux2010_x86_64.whl. Is this intended? (We need to know for ocrd_all.)