qurator-spk / eynollah

Document Layout Analysis
Apache License 2.0
328 stars 26 forks source link

try loading as TF SavedModel instead of HDF5 #91

Closed bertsky closed 1 year ago

bertsky commented 1 year ago

In trying to solve #87, I have converted the models and changed the loader by trying to omit the .h5 suffix.

Unfortunately, I now get a new error:

00:49:28.543 INFO eynollah - INPUT FILE PHYS_0005 (1/19) 
00:49:28.824 INFO eynollah - Resizing and enhancing image...
00:49:28.825 INFO eynollah - Detected 300 DPI
Traceback (most recent call last):
  File "/local/ocr-d/ocrd_all/venv/bin/ocrd-eynollah-segment", line 8, in <module>
    sys.exit(main())
  File "/local/ocr-d/ocrd_all/venv/lib/python3.8/site-packages/click/core.py", line 1130, in __call__
    return self.main(*args, **kwargs)
  File "/local/ocr-d/ocrd_all/venv/lib/python3.8/site-packages/click/core.py", line 1055, in main
    rv = self.invoke(ctx)
  File "/local/ocr-d/ocrd_all/venv/lib/python3.8/site-packages/click/core.py", line 1404, in invoke
    return ctx.invoke(self.callback, **ctx.params)
  File "/local/ocr-d/ocrd_all/venv/lib/python3.8/site-packages/click/core.py", line 760, in invoke
    return __callback(*args, **kwargs)
  File "/local/ocr-d/ocrd_all/venv/lib/python3.8/site-packages/qurator/eynollah/ocrd_cli.py", line 8, in main
    return ocrd_cli_wrap_processor(EynollahProcessor, *args, **kwargs)
  File "/local/ocr-d/ocrd_all/venv/lib/python3.8/site-packages/ocrd/decorators/__init__.py", line 117, in ocrd_cli_wrap_processor
    run_processor(processorClass, ocrd_tool, mets, workspace=workspace, **kwargs)
  File "/local/ocr-d/ocrd_all/venv/lib/python3.8/site-packages/ocrd/processor/helpers.py", line 107, in run_processor
    processor.process()
  File "/local/ocr-d/ocrd_all/venv/lib/python3.8/site-packages/qurator/eynollah/processor.py", line 58, in process
    Eynollah(**eynollah_kwargs).run()
  File "/local/ocr-d/ocrd_all/venv/lib/python3.8/site-packages/qurator/eynollah/eynollah.py", line 2310, in run
    img_res, is_image_enhanced, num_col_classifier, num_column_is_classified = self.run_enhancement()
  File "/local/ocr-d/ocrd_all/venv/lib/python3.8/site-packages/qurator/eynollah/eynollah.py", line 1993, in run_enhancement
    is_image_enhanced, img_org, img_res, num_col_classifier, num_column_is_classified, img_bin = self.resize_and_enhance_image_with_column_classifier()
  File "/local/ocr-d/ocrd_all/venv/lib/python3.8/site-packages/qurator/eynollah/eynollah.py", line 408, in resize_and_enhance_image_with_column_classifier
    _, page_coord = self.early_page_for_num_of_column_classification(img_bin)
  File "/local/ocr-d/ocrd_all/venv/lib/python3.8/site-packages/qurator/eynollah/eynollah.py", line 654, in early_page_for_num_of_column_classification
    img_page_prediction = self.do_prediction(False, img, model_page)
  File "/local/ocr-d/ocrd_all/venv/lib/python3.8/site-packages/qurator/eynollah/eynollah.py", line 528, in do_prediction
    img_height_model = model.layers[len(model.layers) - 1].output_shape[1]
  File "/local/ocr-d/ocrd_all/venv/lib/python3.8/site-packages/keras/engine/base_layer.py", line 2132, in output_shape
    raise AttributeError(
AttributeError: The layer "activation_56" has never been called and thus has no defined output shape.

Given that this seems to be the output layer, I am at a loss here. Creating as draft only.

Ideas anyone?

bertsky commented 1 year ago

@vahidrezanezhad could it be that the way the layers get defined in sbb_pixelwise_segmentation should use input_shape as kwarg instead of data_format?

EDIT: sorry, that was a bullshit question

bertsky commented 1 year ago

Ah, here's the reason:

WARNING:tensorflow:SavedModel saved prior to TF 2.5 detected when loading Keras model. Please ensure that you are saving the model with model.save() or tf.keras.models.save_model(), *NOT* tf.saved_model.save(). To confirm, there should be a file named "keras_metadata.pb" in the SavedModel directory.

So I must do the model conversion on Python <= 3.7 but TF >= 2.5 ...

bertsky commented 1 year ago

Ok, it does work.

I can also update ocrd-tool.json here to include the changed resource URL. But first I need some place to upload it permanently. How about qurator-data.de?

(@kba knows where to find the new archive...)

cneud commented 1 year ago

@bertsky Thank you so much. If you can upload the saved model somewhere shareable, we can publish it to qurator-data. (edit: will ask @kba for the location)

btw sorry for being unresponsive here, we are currently fighting the ICDAR2023 submission deadline for a paper on...Eynollah ;)

bertsky commented 1 year ago

BTW, CI failure is because there is no TF2 for Py 3.6 anymore. Config in GH Actions should be changed (back) to 3.7 or 3.8.

cneud commented 1 year ago

BTW, CI failure is because there is no TF2 for Py 3.6 anymore. Config in GH Actions should be changed (back) to 3.7 or 3.8.

This is what I tried in https://github.com/qurator-spk/eynollah/pull/90.

mikegerber commented 1 year ago

@kba asked me to upload the updated model, but 1. I'd prefer @vahidrezanezhad to do it 2. that he produces the updated models himself. Is this just about - skims the issue comments - being able to load the files with a newer Python/TF version? Then I could maybe be persuaded to do it.

Not trying to be a pedant, it's just professional diligence before I upload some variant of some model I don't fully understand myself.

mikegerber commented 1 year ago

(Alternatively, I offered @kba access to the relevant resources (our internal git annex + qurator-data.de), then the diligence is up to him.)

mikegerber commented 1 year ago

Skimming the PR, it's not only about loading the model, but some other stuff as well? (Cleaning up and a change of a larger code part)

mikegerber commented 1 year ago

Skimming the PR, it's not only about loading the model, but some other stuff as well? (Cleaning up and a change of a larger code part)

Alright, I had a look at the individual commits.

a. I would upload the data if necessary, b. reviewing, merging this PR must be done by someone else, there is too much going on

bertsky commented 1 year ago

@mikegerber

Is this just about - skims the issue comments - being able to load the files with a newer Python/TF version?

Initially, yes. But then it turned out there are more incompatibilities (newer Numpy). In order to be able to do the Numpy fixes, I had to understand some of the code, which in turn led to the cosmetic simplifications. The last commit returned to TF2 migration – there is no session management anymore, and no benefit in keeping this IMO. Especially reloading the models for every page seemed unnecessarily inefficient. (And yes, I did test, of course.)

Not trying to be a pedant, it's just professional diligence before I upload some variant of some model I don't fully understand myself.

Sure – I would also recommend to first allow @vahidrezanezhad to take a look, even if the changes are more or less cosmetic.

Regarding model upload: if you could do that first, then I could update this PR accordingly – adding the new URL to the local ocrd-tool.json, so it would immediately become available as the current model location (independent of old URLs registered in core).

(I really just loaded as HDF5 and stored as SavedFormat, nothing fancy.)

mikegerber commented 1 year ago

(And yes, I did test, of course.)

;-) As I will only be uploading the files, that's not my concern. (I would try to test myself if it was - had the tiniest change break something unexpected too often.)

Regarding model upload: if you could do that first, then I could update this PR accordingly – adding the new URL to the local ocrd-tool.json, so it would immediately become available as the current model location (independent of old URLs registered in core). (I really just loaded as HDF5 and stored as SavedFormat, nothing fancy.)

Yeah, I think @kba will give me the files and I'll upload them (or I give him access.)

mikegerber commented 1 year ago

Sure – I would also recommend to first allow @vahidrezanezhad to take a look, even if the changes are more or less cosmetic.

He's on vacation btw.

mikegerber commented 1 year ago

@cneud wants to discuss this with @kba (and me, i think) at our regular meeting, which is understandable.

cneud commented 1 year ago

It would be great if we can put the TF saved model versions of the models on qurator-data already (and possibly HF later as well), for testing. About the other changes and the PR, I would like to have some discussion to see if we can review/merge while Vahid is away.

So

a. I would upload the data if necessary,

Yes please.

b. reviewing, merging this PR must be done by someone else, there is too much going on

To be discussed.

mikegerber commented 1 year ago

a. I would upload the data if necessary, Yes please.

Still waiting for the data :) @kba, @bertsky?

b. reviewing, merging this PR must be done by someone else, there is too much going on To be discussed.

Yes, that's 100% "in meinem Sinne"/my opinion. I'd rather be not involved, as I am currently not familiar with the code base.

kba commented 1 year ago

Still waiting for the data :) @kba, @bertsky?

https://ocr-d.kba.cloud/2021-04-25.SavedModel.tar.gz

bertsky commented 1 year ago

Fixes #92 Fixes #87

mikegerber commented 1 year ago

https://qurator-data.de/eynollah/2021-04-25/SavedModel.tar.gz

I've created a short README, to document how this file was created, see https://qurator-data.de/eynollah/2021-04-25/. @bertsky Could you post a short code snippet here, how you converted this? Could be useful in the future.

@joergleh Relevant to your efforts, probably.

cneud commented 1 year ago

Many thanks @mikegerber!

@bertsky This should be the location for the model(s), rather than https://ocr-d.kba.cloud/2021-04-25.SavedModel.tar.gz, which was just for temporary file transfer, so https://github.com/qurator-spk/eynollah/pull/91/commits/00f431a5d6152aba96675b4f26d89f44bfdef211 would need to be adapted again.

bertsky commented 1 year ago

@mikegerber it boils down to:

import os
from tensorflow.keras.models import load_model
os.chdir('default')
for path in os.listdir():
    if not path.endswith('.h5'):
        continue
    model = load_model(path, compile=False)
    model.save(path[:-3], save_format="tf")
bertsky commented 1 year ago

@cneud done!

bertsky commented 1 year ago

Wait! I just found out my "cosmetic" changes are not so cosmetic after all … investigating.

mikegerber commented 1 year ago

@bertsky Thanks!

Where does the default directory name come from? I was wondering the same while examining @kba's tar. Is it OCR-D convention?

mikegerber commented 1 year ago

@bertsky Sorry for the laughing emoji :) Couldn't resist!

bertsky commented 1 year ago

@mikegerber

Where does the default directory name come from? I was wondering the same while examining @kba's tar. Is it OCR-D convention?

That's just the model name (i.e. last directory inside the archive = directory to unpack). Put your absolute or relative path there.

Sorry for the laughing emoji :) Couldn't resist!

No worries. I shouldn't have been so bold to begin with!

cneud commented 1 year ago

I just realized that the conversion .h5 -> TF SavedModel was done using the models from https://qurator-data.de/eynollah/2021-04-25/ where rather the models from https://qurator-data.de/eynollah/2022-04-05/ should have been used.

Other than having more comprehensible model names, the 2022-04-05 dir also contains an additional model eynollah-main-regions_20220314.h5 which is required for full functionality.

bertsky commented 1 year ago

@cneud,

I just realized that the conversion .h5 -> TF SavedModel was done using the models from https://qurator-data.de/eynollah/2021-04-25/ where rather the models from https://qurator-data.de/eynollah/2022-04-05/ should have been used.

Other than having more comprehensible model names, the 2022-04-05 dir also contains an additional model eynollah-main-regions_20220314.h5 which is required for full functionality.

you should know better than me, but as of Vahid's earlier response I think this only belongs to the (now public) eynollah_light branch.

I can convert these models as well, if you like, but this must be done similarly against that branch.

cneud commented 1 year ago

I see (the mess...).

But since that eynollah_light branch is "approved" despite merge conflicts, that is sth we likely have to keep waiting until Vahid is back. :confused:

bertsky commented 1 year ago

Wait! I just found out my "cosmetic" changes are not so cosmetic after all … investigating.

I removed these parts, rebased and force-pushed. (Sorry for the noise!)

bertsky commented 1 year ago

But since that eynollah_light branch is "approved" despite merge conflicts, that is sth we likely have to keep waiting until Vahid is back.

I have converted these models as well and placed them in the same location for @kba to upload. I suggest merging master into eynollah_light after this has been merged into master (as the easiest path AFAICS).

kba commented 1 year ago

But since that eynollah_light branch is "approved" despite merge conflicts, that is sth we likely have to keep waiting until Vahid is back.

I have converted these models as well and placed them in the same location for @kba to upload. I suggest merging master into eynollah_light after this has been merged into master (as the easiest path AFAICS).

Available from https://ocr-d.kba.cloud/2022-04-05.SavedModel.tar.gz

mikegerber commented 1 year ago

I refuse to upload 2022-04-05.SavedModel.tar.gz as the original is not fully in our internal data repository, sorry. This must be corrected by Vahid first.

Here, I am pedantic because it is not the first time this kind of situation comes up and while I am happy to support, I don't want to clean up all the time.

bertsky commented 1 year ago

Take your time! This PR can wait, and even more so any further action on eynollah_light. (I'll shut up now.)

mikegerber commented 1 year ago

No worries. I shouldn't have been so bold to begin with!

I appreciate the boldness, I just think you should have been bold in another PR ;) I know it's more work but for me, while reviewing a PR like this (even if it's just to understand the upload situation), it's more work if the PR contains more than just the changes necessary.

However, this is not my project, so it's not up to me, actually.

mikegerber commented 1 year ago

Side note: This PR seems to fix the CircleCI tests, it that correct? Great stuff if this is true!

bertsky commented 1 year ago

Side note: This PR seems to fix the CircleCI tests, it that correct? Great stuff if this is true!

No. This PR only fixes Python 3.8 onwards. The CI check comes with @cneud's config changes (throwing out broken 3.6, activating 3.7 onwards).

bertsky commented 1 year ago

Note, since we have moved towards Python 3.8 as base version, this is becoming rather urgent. See ocrd_all – where I am now considering switching to this PR version.