This is very annoying... As far as I know, this is a problem with the HDF5 files, which are not backwards-compatible: opening a Python 3.7 model in Python 3.8 is possible, but not vice versa. I don't think there is a solution.
We (@andbue, @chreul) should probably take care to create the shared models in Python 3.7.
I created the model in 3.7 and was not able to load it in 3.8, unfortunately...
I'm unfortunately running into a similar problem. Using Calamari 2.2.2 and Python 3.9, I seem to be unable to load pretrained models. Both in a CPU environment (clean install from scratch via "pip install calamari_ocr==2.2.2") and in a GPU environment (using TensorFlow 2.6), I get the following error when trying to load the idiotikon model:
File "/u/liebl/miniconda3/envs/origami_cpu/lib/python3.9/site-packages/calamari_ocr/ocr/predict/predictor.py", line 53, in from_paths
multi_predictor = super(MultiPredictor, cls).from_paths(
File "/u/liebl/miniconda3/envs/origami_cpu/lib/python3.9/site-packages/tfaip/predict/multimodelpredictor.py", line 107, in from_paths
models = [
File "/u/liebl/miniconda3/envs/origami_cpu/lib/python3.9/site-packages/tfaip/predict/multimodelpredictor.py", line 108, in <listcomp>
keras.models.load_model(model, compile=False, custom_objects=scenario.model_cls().all_custom_objects())
File "/u/liebl/miniconda3/envs/origami_cpu/lib/python3.9/site-packages/keras/saving/save.py", line 200, in load_model
return hdf5_format.load_model_from_hdf5(filepath, custom_objects,
File "/u/liebl/miniconda3/envs/origami_cpu/lib/python3.9/site-packages/keras/saving/hdf5_format.py", line 180, in load_model_from_hdf5
model = model_config_lib.model_from_config(model_config,
File "/u/liebl/miniconda3/envs/origami_cpu/lib/python3.9/site-packages/keras/saving/model_config.py", line 52, in model_from_config
return deserialize(config, custom_objects=custom_objects)
File "/u/liebl/miniconda3/envs/origami_cpu/lib/python3.9/site-packages/keras/layers/serialization.py", line 208, in deserialize
return generic_utils.deserialize_keras_object(
File "/u/liebl/miniconda3/envs/origami_cpu/lib/python3.9/site-packages/keras/utils/generic_utils.py", line 674, in deserialize_keras_object
deserialized_obj = cls.from_config(
File "/u/liebl/miniconda3/envs/origami_cpu/lib/python3.9/site-packages/keras/engine/functional.py", line 662, in from_config
input_tensors, output_tensors, created_layers = reconstruct_from_config(
File "/u/liebl/miniconda3/envs/origami_cpu/lib/python3.9/site-packages/keras/engine/functional.py", line 1273, in reconstruct_from_config
process_layer(layer_data)
File "/u/liebl/miniconda3/envs/origami_cpu/lib/python3.9/site-packages/keras/engine/functional.py", line 1255, in process_layer
layer = deserialize_layer(layer_data, custom_objects=custom_objects)
File "/u/liebl/miniconda3/envs/origami_cpu/lib/python3.9/site-packages/keras/layers/serialization.py", line 208, in deserialize
return generic_utils.deserialize_keras_object(
File "/u/liebl/miniconda3/envs/origami_cpu/lib/python3.9/site-packages/keras/utils/generic_utils.py", line 674, in deserialize_keras_object
deserialized_obj = cls.from_config(
File "/u/liebl/miniconda3/envs/origami_cpu/lib/python3.9/site-packages/keras/layers/core.py", line 1005, in from_config
function = cls._parse_function_from_config(
File "/u/liebl/miniconda3/envs/origami_cpu/lib/python3.9/site-packages/keras/layers/core.py", line 1057, in _parse_function_from_config
function = generic_utils.func_load(
File "/u/liebl/miniconda3/envs/origami_cpu/lib/python3.9/site-packages/keras/utils/generic_utils.py", line 789, in func_load
code = marshal.loads(raw_code)
ValueError: bad marshal data (unknown type code)
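If I read the traceback correctly, the actual failure is in rebuilding a Lambda layer: Keras stores the layer's Python function as marshalled CPython bytecode inside the h5 file, and the marshal format of code objects is not stable across minor Python versions, so bytes written under 3.7 cannot be decoded under 3.9. A minimal illustration of what func_dump/func_load roughly do internally (plain Python, not Calamari code):

import marshal, types

def scale(x):
    # stand-in for a Lambda layer function; the real models store something similar
    return x / 255.0

# roughly what gets written into the .h5 file:
raw_code = marshal.dumps(scale.__code__)

# reading it back only works on the same minor Python version; on a different one,
# marshal.loads raises exactly the "bad marshal data (unknown type code)" above
restored = types.FunctionType(marshal.loads(raw_code), globals())
print(restored(51.0))  # 0.2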
Hi Bernhard! With https://github.com/Calamari-OCR/calamari/commit/2fa93d880fd306bdb2171bf4ed5e4538cc0dc79f I implemented the SavedModel format instead of the h5 files. It's a folder of different files and takes up more disk space, but I hope it will solve the compatibility problems between different Python versions. It's only in the tempscale branch at the moment and not really tested, but if you have the time you could give it a try (loading the models with py37, waiting for them to be converted to version 6, then using them in py39). If it works, I could merge it into master and later update the models in this repo.
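In plain Keras terms, the conversion is roughly the following. This is only a sketch, not the actual Calamari code; the file names are made up, and for real Calamari models the custom_objects from the scenario have to be passed to load_model, as in the traceback above:

from tensorflow import keras

# on Python 3.7, where the old h5 file can still be deserialized:
model = keras.models.load_model("model_py37.h5", compile=False)

# SavedModel writes a directory (protobuf graph plus variables) and stores the
# traced computation, so, as far as I understand, loading does not have to
# re-create the lambda functions from marshalled bytecode:
model.save("model_savedmodel", save_format="tf")

# later, e.g. on Python 3.9:
reloaded = keras.models.load_model("model_savedmodel", compile=False)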
Note: I have converted all models here and in calamari_models_experimental to v6 (SavedModel format) and created releases with tarballs as assets. Closing here, but note that for the time being you'll have to install Calamari from git instead of PyPI, because we have not released 2.3 with the new feature yet.
When loading Keras models, the Python version needs to be the same on the system the model was trained on and on the system loading the file (cf. https://github.com/keras-team/keras/issues/7440). I stumbled upon this when transferring models for inference to another machine running 3.8 instead of 3.7. Wouldn't it be helpful to include this version in the json and provide a more useful error message based on that information? Is there a way to load and save the models in a way that updates them to another Python version?
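Something like the following is what I have in mind, just as a rough sketch (none of this exists in Calamari; the function names and file layout are made up):

import json, sys

def write_version_info(path):
    # store the interpreter version next to the model when it is saved
    with open(path, "w") as f:
        json.dump({"python_version": "%d.%d" % sys.version_info[:2]}, f)

def check_version_info(path):
    # called before keras.models.load_model to fail with a readable message
    with open(path) as f:
        saved = json.load(f)["python_version"]
    current = "%d.%d" % sys.version_info[:2]
    if saved != current:
        raise RuntimeError(
            "Model was written with Python %s but this is Python %s; "
            "h5 models are not portable across Python versions." % (saved, current)
        )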