qurator-spk / sbb_binarization

Document Image Binarization
Apache License 2.0

Model won't load on Python 3.9 #41

Closed: LudovA closed this issue 1 year ago

LudovA commented 1 year ago

Hey,

After using this model for a while and getting quite remarkable results compared to standard binarization techniques, I would like to move to a newer version of Python: 3.9.

Unfortunately, the model then fails to load with ValueError: bad marshal data (unknown type code). To fix this, I need the raw SBB model definition so that I can load the weights into it and save the model again under the newer Python version.

Is anyone aware of what the exact model is or where I can find it?

Thanks! LudovA
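
For context, here is a minimal sketch of why this error depends on the Python version. It assumes the .h5 file stores a Python function (for example from a Lambda layer) as marshalled bytecode, which is how Keras serializes such functions as far as I know. The `scale` function and the helper names are hypothetical and only serve as illustration. The marshal format is specific to the interpreter version, so bytes written by one version may be unreadable in another.

```python
import marshal


def dump_function_code(func):
    """Serialize a function's code object with marshal (version-specific bytes)."""
    return marshal.dumps(func.__code__)


def can_unmarshal(payload):
    """Return True if the running interpreter can decode the marshalled payload."""
    try:
        marshal.loads(payload)
        return True
    except (ValueError, EOFError, TypeError):
        # ValueError is what surfaces as "bad marshal data (unknown type code)".
        return False


def scale(x):
    # Hypothetical stand-in for a function embedded in a saved model.
    return x / 255.0


payload = dump_function_code(scale)

# Bytes written and read by the same interpreter decode fine.
print(can_unmarshal(payload))

# Replacing the leading type code with one the reader does not know mimics
# what happens when another Python version wrote the payload.
print(can_unmarshal(b"\x01" + payload[1:]))
```

Saving only the weights, or using a format that does not embed bytecode, avoids this dependency on the interpreter version.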

apacha commented 1 year ago

I've found a working solution for this, but it needs a bit of help from the maintainers. I've loaded the models with Python 3.6 and saved them again in TF format, which also works in Python 3.9 (as this format doesn't use the marshal library). The converted models can be found in my fork as a release: https://github.com/apacha/sbb_binarization/releases/tag/pre-trained-models
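
The conversion described above can be sketched roughly as follows. This is not the exact script that was used. It assumes that "TF format" means the SavedModel directory written by Keras with `save_format="tf"`, and that TensorFlow 2.x with Keras 2 is installed (newer Keras versions changed the saving API). The function name and both paths are hypothetical.

```python
def convert_h5_to_tf(h5_path, out_dir):
    """Load a legacy .h5 Keras model and save it again in TF (SavedModel) format.

    Run this under the Python version that created the .h5 file
    (Python 3.6 in this thread); the result can then be loaded elsewhere.
    """
    # Imported lazily so this module can be imported without TensorFlow installed.
    import tensorflow as tf

    # compile=False skips restoring optimizer and loss, which inference does not need.
    model = tf.keras.models.load_model(h5_path, compile=False)

    # Assumption: TF 2.x with Keras 2, where save_format="tf" is accepted.
    model.save(out_dir, save_format="tf")
    return out_dir
```

Under Python 3.9 the converted directory would then be loaded with `tf.keras.models.load_model(out_dir, compile=False)`, again assuming the same TensorFlow generation.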

If @kba or @cneud could download the files from there and create a similar release in this repository, we could kill three birds with one stone:

vahidrezanezhad commented 1 year ago

> Hey,
>
> After using this model for a while and having quite remarkable results as compared to standard binarization techniques, I would like to move to a newer version of python: 3.9.
>
> Unfortunately, the model won't load then as I get a ValueError: bad marshal data (unknown type code). To fix this I need the raw SBB model and load the weights there and save again in the newer python version.
>
> Is anyone aware of what the exact model is or where I can find it?
>
> Thanks! LudovA

Dear LudovA,

Under https://github.com/qurator-spk/sbb_pixelwise_segmentation, just run the build_model_load_pretrained_weights_and_save.py script with modified inputs. This may resolve your issue :)

mikegerber commented 1 year ago

> * Have a reliable source for the pre-trained models that is less likely to disappear than the current location for the models
> * Have a reliable source (Github) that can be used for dynamically downloading the libraries on the fly

Have you had any ongoing issues with the current location of the model files?

apacha commented 1 year ago

> Have you had any ongoing issues with the current location of the model files?

Not with this specific model, but with many datasets and models in the past. Putting models on GitHub is a good practice as it's one of the locations that are very likely to remain available for years, even if the projects that have created the data have ended.

mikegerber commented 1 year ago

That may be, but it is unrelated to the Python version problem.

cneud commented 1 year ago

> Putting models on GitHub is a good practice as it's one of the locations that are very likely to remain available for years, even if the projects that have created the data have ended.

We aim to move the models to a new and sustainable SBB server location soon, plus there is also a version on the Hugging Face Hub ;)

LudovA commented 1 year ago

Thanks everyone for looking into this topic.

@vahidrezanezhad this is indeed what I was looking for, more specifically the resnet50_unet function in models.py. I will now initialise the model from that function and load the weights into it, instead of loading the full model from the .h5 file.
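
A rough sketch of that approach, for anyone landing here later. The helper name `rebuild_and_resave` and the paths are hypothetical, and the arguments that resnet50_unet expects are not shown in this thread, so they are passed through as keyword arguments and should be checked against models.py. It relies on the Keras `load_weights` and `save` methods. The point is that only the weights are read from the old file, so the serialized model definition in the .h5 is never decoded.

```python
def rebuild_and_resave(build_model, h5_path, out_path, **build_kwargs):
    """Rebuild a model from code, load weights from an old .h5, and save it again.

    build_model: callable returning a fresh Keras model, e.g. resnet50_unet
                 from models.py (its arguments are an assumption; check the source).
    h5_path:     path to the old .h5 file; only its weights are read.
    out_path:    where the freshly saved model should go.
    """
    # Construct the architecture in the current Python version.
    model = build_model(**build_kwargs)

    # Read only the weights; the model config stored in the .h5 is not decoded.
    model.load_weights(h5_path)

    # Save again so the new file is written by the current interpreter.
    model.save(out_path)
    return model
```

A call might look like `rebuild_and_resave(resnet50_unet, "old_model.h5", "new_model", n_classes=2)`, where the keyword argument is a guess at the signature and not taken from the repository.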