mikegerber closed this issue 1 year ago
Ah, I was testing the branch transformer_model_integration,
which has different requirements than master:
--- a/requirements.txt
+++ b/requirements.txt
@@ -2,4 +2,4 @@ numpy
setuptools >= 41
opencv-python-headless
ocrd >= 2.22.3
-tensorflow >= 2.4.0
+tensorflow == 2.4.*
Something to keep in mind when merging?
Using Python 3.10 I get:
% sbb_binarize --model-dir ~/devel/qurator-data/sbb_binarization/2022-08-16/ --patches OCR-D-IMG_00000024.tif OCR-D-IMG_00000024.out.tif
Traceback (most recent call last):
File "/home/mike/.virtualenvs/sbb_binarization_transformer_model_integration/bin/sbb_binarize", line 33, in <module>
sys.exit(load_entry_point('sbb-binarization', 'console_scripts', 'sbb_binarize')())
File "/home/mike/.virtualenvs/sbb_binarization_transformer_model_integration/lib/python3.10/site-packages/click/core.py", line 1130, in __call__
return self.main(*args, **kwargs)
File "/home/mike/.virtualenvs/sbb_binarization_transformer_model_integration/lib/python3.10/site-packages/click/core.py", line 1055, in main
rv = self.invoke(ctx)
File "/home/mike/.virtualenvs/sbb_binarization_transformer_model_integration/lib/python3.10/site-packages/click/core.py", line 1404, in invoke
return ctx.invoke(self.callback, **ctx.params)
File "/home/mike/.virtualenvs/sbb_binarization_transformer_model_integration/lib/python3.10/site-packages/click/core.py", line 760, in invoke
return __callback(*args, **kwargs)
File "/home/mike/devel/2022-08 eval sbb_binarization_transformer/sbb_binarization/sbb_binarize/cli.py", line 15, in main
SbbBinarizer(model_dir).run(image_path=input_image, use_patches=patches, save=output_image)
File "/home/mike/devel/2022-08 eval sbb_binarization_transformer/sbb_binarization/sbb_binarize/sbb_binarize.py", line 98, in __init__
self.models.append(self.load_model(model_file))
File "/home/mike/devel/2022-08 eval sbb_binarization_transformer/sbb_binarization/sbb_binarize/sbb_binarize.py", line 117, in load_model
model = load_model(join(self.model_dir, model_name) , compile=False,custom_objects = {"PatchEncoder": PatchEncoder, "Patches": Patches})
File "/home/mike/.virtualenvs/sbb_binarization_transformer_model_integration/lib/python3.10/site-packages/keras/utils/traceback_utils.py", line 67, in error_handler
raise e.with_traceback(filtered_tb) from None
File "/home/mike/.virtualenvs/sbb_binarization_transformer_model_integration/lib/python3.10/site-packages/keras/utils/generic_utils.py", line 793, in func_load
code = marshal.loads(raw_code)
ValueError: bad marshal data (unknown type code)
I noticed another problem: using Python 3.6 I get no console output (which is good), but with Python 3.7 I get lots of seemingly useless progress bars like this:
> Using Python 3.10 I get:
> ... ValueError: bad marshal data (unknown type code)
IIRC that's due to Keras model format using some native Python serialization which is version dependent. If models were converted to TensorFlow SavedModel format – which can be as simple as loading (on the right Python/Keras version) and saving (with the right extension) – then this should be much more interoperable.
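To illustrate the version dependence, here is a minimal sketch using only the standard library. It assumes, as the `func_load` frame in the traceback suggests, that the H5 model stores Python bytecode serialized with `marshal`; the function `double` is purely hypothetical. The round trip works within one interpreter, but the Python documentation makes no compatibility guarantee for `marshal` data across Python versions, which is why loading the same bytes on another version can fail.

```python
import marshal
import sys
import types


def double(x):
    """Hypothetical stand-in for a function embedded in a saved model."""
    return 2 * x


# Serialize the function's bytecode, similar in spirit to what the
# H5-based Keras format appears to do for embedded Python functions.
raw_code = marshal.dumps(double.__code__)

# Loading succeeds here because the same interpreter produced the bytes.
# On a different Python version the same bytes may be rejected.
code = marshal.loads(raw_code)
restored = types.FunctionType(code, globals(), "double")
result = restored(21)

print("Python", sys.version_info[:2], "round trip result:", result)
```

The SavedModel format avoids this by not relying on interpreter-specific bytecode for the model graph.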
I just verified that for this package.
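from tensorflow.keras.models import load_model
# Load the HDF5 model on a Python/Keras version that can still read it
m = load_model('/path/to/model_bin_sbb_ens.h5', compile=False)
# With TF 2.x Keras, saving without the .h5 extension writes the SavedModel format
m.save('/path/to/model_bin_sbb_ens')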
patch the H5 loader … https://github.com/qurator-spk/sbb_binarization/blob/f11d0b0bf741253c55930c34e58e7e10718cb652/sbb_binarize/sbb_binarize.py#L36-L38 … roughly like so:
@@ -35,6 +35,8 @@
self.model_files = glob('%s/*.h5' % self.model_dir)
if not self.model_files:
+ self.model_files = glob('%s/*/' % self.model_dir)
+ if not self.model_files:
raise ValueError(f"No models found in {self.model_dir}")
self.models = []
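The fallback in the diff above can be sketched as a standalone function. This is only an illustration under assumptions: `find_model_files` is a hypothetical name (the package does this inline in `__init__`), and sorting the result is an addition for reproducibility, not part of the proposed patch.

```python
import os
from glob import glob


def find_model_files(model_dir):
    """Return *.h5 files in model_dir, else SavedModel directories.

    Hypothetical helper mirroring the proposed patch: prefer HDF5 files
    and fall back to subdirectories only if no HDF5 file is present.
    """
    model_files = glob(os.path.join(model_dir, "*.h5"))
    if not model_files:
        # A trailing path separator makes glob match directories only.
        model_files = glob(os.path.join(model_dir, "*", ""))
    if not model_files:
        raise ValueError(f"No models found in {model_dir}")
    return sorted(model_files)
```

Since `load_model` accepts both an H5 file path and a SavedModel directory, the rest of the loader would not need to distinguish the two cases.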
Yes, the same was also reported, and @apacha kindly already did the model conversion here.
I have also published the saved_model to the Hugging Face Hub: https://huggingface.co/SBB/sbb_binarization, but we still have to update the resmgr accordingly.
So, regarding the OP: with https://github.com/qurator-spk/sbb_binarization/pull/59 we currently support Python 3.7-3.10. Will update the README accordingly.
sbb_binarization currently needs TensorFlow 2.4, which is not available* for Python 3.10, the default on my Linux installation. Which versions are supported?