OCR-D / ocrd_all

Master repository which includes most other OCR-D repositories as submodules
MIT License

update ocrd_pc_segmentation, move from tf21 to tf2 #264

Closed. bertsky closed this 3 years ago

bertsky commented 3 years ago

This prevents ocrd_pc_segmentation from dragging in TF 2.5 and thus invalidating the shared venv for other modules.

bertsky commented 3 years ago

See https://github.com/ocr-d-modul-2-segmentierung/page-segmentation/issues/4 for details.

bertsky commented 3 years ago

Damn. Placing ocrd_pc_segmentation in sub-venv/headless-tf2 creates another conflict around h5py (ocrd_calamari wants <3, but ocr4all-pixel-classifier drags in 3.1).
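To make the h5py conflict concrete, here is a minimal sketch of why no single h5py version can serve both modules in one venv. It is an illustration only: the helper names (`parse_version`, `satisfies`, `compatible_versions`), the candidate version list, and the `>=3.1` reading of "drags in 3.1" are assumptions, and this is not how pip or the ocrd_all Makefile resolve dependencies.

```python
import operator
import re

# Comparison operators supported by this simplified constraint syntax.
OPS = {
    "<": operator.lt,
    "<=": operator.le,
    ">": operator.gt,
    ">=": operator.ge,
    "==": operator.eq,
}


def parse_version(text, width=3):
    """Turn '3.1' into (3, 1, 0): numeric parts only, zero-padded to `width`."""
    parts = [int(p) for p in re.findall(r"\d+", text)][:width]
    return tuple(parts + [0] * (width - len(parts)))


def satisfies(version, constraint):
    """Check one version string against one constraint such as '<3'."""
    match = re.fullmatch(r"\s*(<=|>=|==|<|>)\s*([\d.]+)\s*", constraint)
    if match is None:
        raise ValueError(f"unsupported constraint: {constraint!r}")
    op, bound = match.groups()
    return OPS[op](parse_version(version), parse_version(bound))


def compatible_versions(candidates, constraints):
    """Return the candidates that satisfy every module's constraint."""
    return [
        v for v in candidates
        if all(satisfies(v, c) for c in constraints.values())
    ]


# Hypothetical candidate list, for illustration only.
candidates = ["2.10.0", "3.1.0", "3.2.1"]

# Constraints as described in the comment above; '>=3.1' is an assumed
# reading of "drags in 3.1".
constraints = {
    "ocrd_calamari": "<3",
    "ocr4all-pixel-classifier": ">=3.1",
}

shared = compatible_versions(candidates, constraints)
calamari_only = compatible_versions(
    candidates, {"ocrd_calamari": constraints["ocrd_calamari"]}
)
print("shared venv:", shared)
print("ocrd_calamari alone:", calamari_only)
```

Under these assumptions the shared list is empty, so the two modules either need separate venvs or one of the constraints has to be relaxed.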

bertsky commented 3 years ago

It's not ocrd_pc_segmentation's or ocr4all-pixel-classifier's fault: they simply allow running the most recent TF, which happens to be incompatible with current ocrd_calamari. Unfortunately, the latter does not advertise this dependency. (See fix for the latter here and analysis of the former here.)

I see no way around this other than using different venvs for ocrd_calamari (headless-tf24) and ocrd_pc_segmentation (headless-tf2) – which will cost us another 2GB in the maximum image.

Unless anyone has a better idea?

kba commented 3 years ago

To summarize:

So, we need a new release of ocrd_calamari and the only remaining tf21 processor would be ocrd_anybaseocr?

bertsky commented 3 years ago

To summarize:

yes

yes

  • ocrd_calamari can be moved to tf2

no, ocrd_calamari is already in headless-tf2, but it can stay there along with ocrd_pc_segmentation if ...

and the h5py downgrade is redundant

yes, the combination h5py<3 and TF<2.5 is overly restrictive, and prevents sharing the venv with ocrd_pc_segmentation. (We would need a tf24 venv for no gain.)

So, we need a new release of ocrd_calamari and the only remaining tf21 processor would be ocrd_anybaseocr?

yes

bertsky commented 3 years ago

Ok, I have removed the h5py hack for headless-tf2 and switched to https://github.com/OCR-D/ocrd_calamari/commit/76b34c50cb1dfa10d60e6acf67dc21ad9ed0468a now – let's see if it works.

bertsky commented 3 years ago

No. Seems like ocrd_pc_segmentation needs another fix. Let's see...