use predict_generator to better utilize GPU

bertsky commented 3 years ago

When the model is applied in patch mode (the default), a loop runs over the windows (on CPU / in NumPy), and each window is passed to model.predict() as a single image (on GPU / in Keras).

https://github.com/qurator-spk/sbb_binarization/blob/8dd05064b2dbdc7d4bdfb8896251302e8ec5ecb3/sbb_binarize/sbb_binarize.py#L152

This does not fully utilize the GPU, for two reasons:

the effective batch size of 1 might be too low for the number of shaders and the size of the GPU RAM
the GPU kernel can only run briefly and has to wait for the CPU each time (patch cropping and memory paging)
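
To illustrate the first point, here is a minimal sketch. It is not the code from sbb_binarize.py: `fake_predict`, the window size and the image size are hypothetical stand-ins, chosen only to show how many separate predict calls a per-window loop issues compared with stacking windows into batches.

```python
import numpy as np

calls = []  # records the batch size of every predict call


def fake_predict(batch):
    """Hypothetical stand-in for model.predict(): logs the batch size."""
    calls.append(batch.shape[0])
    return batch.mean(axis=-1, keepdims=True)


def windows(image, size):
    """Yield non-overlapping size x size crops (image dims assumed divisible)."""
    height, width = image.shape[:2]
    for y in range(0, height, size):
        for x in range(0, width, size):
            yield image[y:y + size, x:x + size]


image = np.zeros((512, 768, 3), dtype=np.float32)
crops = list(windows(image, 256))  # 2 rows x 3 columns = 6 windows

# Per-window loop: one predict call per crop, i.e. batch size 1.
for crop in crops:
    fake_predict(crop[np.newaxis])
per_window_calls = len(calls)

# Batched: stack crops so each call carries up to batch_size windows.
calls.clear()
batch_size = 4
for i in range(0, len(crops), batch_size):
    fake_predict(np.stack(crops[i:i + batch_size]))
batched_calls = len(calls)
```

In this toy setup the per-window loop issues one call per window, while the batched variant needs only as many calls as there are batches. Real speedups depend on the model and hardware.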

I suggest changing the following:

Define a generator function doing the patching/cropping. It should be a thread-safe formulation, e.g. a keras.utils.Sequence.
Pass that to predict_generator instead of predict to get concurrent CPU / GPU computation.
Allow parameterizing the number of workers and batch size to allow optimal adaptation to the concrete hardware and crop/model sizes.
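
A rough sketch of what such a Sequence could look like follows. It is only an illustration under stated assumptions, not the repository's implementation: `PatchSequence`, `patch_size` and `batch_size` are hypothetical names, windows are assumed not to overlap, and the image dimensions are assumed to be divisible by the patch size. Reassembling the predicted patches into a full image is left out.

```python
import math

import numpy as np

try:
    from tensorflow.keras.utils import Sequence
except ImportError:  # lets the sketch run without TensorFlow installed
    Sequence = object


class PatchSequence(Sequence):
    """Hypothetical Sequence that crops windows and serves them in batches."""

    def __init__(self, image, patch_size, batch_size):
        super().__init__()
        self.image = image
        self.patch_size = patch_size
        self.batch_size = batch_size
        height, width = image.shape[:2]
        # Top-left corners of non-overlapping windows (dims assumed divisible).
        self.corners = [(y, x)
                        for y in range(0, height, patch_size)
                        for x in range(0, width, patch_size)]

    def __len__(self):
        # Number of batches; the last one may be smaller.
        return math.ceil(len(self.corners) / self.batch_size)

    def __getitem__(self, index):
        # Cropping happens here, so worker threads can do it concurrently.
        start = index * self.batch_size
        chunk = self.corners[start:start + self.batch_size]
        size = self.patch_size
        batch = np.stack([self.image[y:y + size, x:x + size] for y, x in chunk])
        # One-element tuple: inputs only, no targets.
        return (batch,)
```

With a loaded model under TensorFlow 2.x, this could be used roughly as `model.predict(PatchSequence(image, 256, 8), workers=4)`. `predict_generator` takes the same `workers` argument, but later TensorFlow 2.x releases deprecate it in favor of passing the Sequence to `predict` directly. The patch size, batch size and worker count here are placeholders to be tuned per hardware.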

bertsky commented 3 years ago

Spoiler: I know how to do this. Would you care for a PR?

vahidrezanezhad commented 3 years ago

> Spoiler: I know how to do this. Would you care for a PR?

@bertsky I would appreciate it if you did that :)

apacha commented 1 year ago

@bertsky Did you ever complete this improvement? Maybe on a fork? I would like to run this binarization on a large dataset, and with the current procedure it is simply too slow (10-20 images per minute).

use predict_generator to better utilize GPU #32