faustomorales / keras-ocr

A packaged and flexible version of the CRAFT text detector and Keras CRNN recognition model.
https://keras-ocr.readthedocs.io/
MIT License

Fine-tuning the recogniser on a bigger image. #176

Open sharonibejih opened 2 years ago

sharonibejih commented 2 years ago

Hey!

I'm trying to fine-tune the recogniser, but with a different input shape of (500, 100). I cropped all my images to this shape for uniformity, because some contain a very long line of text of up to 70 letters, while others are much shorter, with about 4-15 letters.

If I do exactly what is in the docs, changing only the alphabet, like this:

recognizer = keras_ocr.recognition.Recognizer(alphabet=alphabet)
recognizer.compile()

Then I get this error when I start training:

UnknownError: 2 root error(s) found.
  (0) Unknown:  AssertionError: A sentence is longer than this model can predict.
Traceback (most recent call last):

  File "/usr/local/lib/python3.7/dist-packages/tensorflow/python/ops/script_ops.py", line 249, in __call__
    ret = func(*args)

  File "/usr/local/lib/python3.7/dist-packages/tensorflow/python/autograph/impl/api.py", line 645, in wrapper
    return func(*args, **kwargs)

  File "/usr/local/lib/python3.7/dist-packages/tensorflow/python/data/ops/dataset_ops.py", line 892, in generator_py_func
    values = next(generator_state.get_iterator(iterator_id))

  File "/usr/local/lib/python3.7/dist-packages/keras/engine/data_adapter.py", line 822, in wrapped_generator
    for data in generator_fn():

  File "/usr/local/lib/python3.7/dist-packages/keras_ocr/recognition.py", line 378, in get_batch_generator
    for sentence in sentences), 'A sentence is longer than this model can predict.'

AssertionError: A sentence is longer than this model can predict.
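
For anyone hitting the same assertion, here is a rough way to find the offending labels before training starts. It is only a sketch: the formula in `max_label_length` (one time step per `pool_size**2` pixels of width, minus `rnn_steps_to_discard`) and the default values of 2 are assumptions about how the recognizer derives its limit, not something confirmed in this thread. `max_label_length` and `find_too_long` are hypothetical helper names, not keras-ocr functions.

```python
def max_label_length(width, pool_size=2, rnn_steps_to_discard=2):
    """Estimate the longest label the model can predict for a given input width.

    ASSUMPTION: one output step per pool_size**2 pixels of width, with the
    first rnn_steps_to_discard steps dropped. Check this against your model.
    """
    return width // pool_size ** 2 - rnn_steps_to_discard


def find_too_long(labels, limit):
    """Return (index, label) pairs whose label has more than `limit` characters."""
    return [(i, text) for i, text in enumerate(labels) if len(text) > limit]


if __name__ == "__main__":
    limit = max_label_length(200)      # 200 is the assumed default width
    labels = ["short", "x" * 70]       # hypothetical labels, one with 70 letters
    for index, text in find_too_long(labels, limit):
        print(f"label {index} has {len(text)} characters, limit is {limit}")
```

Under those assumptions a width of 200 gives a limit of 48 characters, so the 70-letter lines would trip the assertion. If the real limit differs, pass it to `find_too_long` directly.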

I'm assuming it's because the length of my images/labels differs from what the pretrained weights expect. So, I tried this:

recognizer = keras_ocr.recognition.Recognizer(
    alphabet=alphabet,
    build_params={**keras_ocr.recognition.DEFAULT_BUILD_PARAMS, 'width':500, 'height':100, 'stn': False},
)

recognizer.compile() 

Then I get this error message, right after running the cell:

Provided alphabet does not match pretrained alphabet. Using backbone weights only.
Looking for /root/.keras-ocr/crnn_kurapan_notop.h5
Downloading /root/.keras-ocr/crnn_kurapan_notop.h5
---------------------------------------------------------------------------
ValueError                                Traceback (most recent call last)
<ipython-input-6-3137b7304f8c> in <module>()
      2 recognizer = keras_ocr.recognition.Recognizer(
      3     alphabet=alphabet,
----> 4     build_params={**keras_ocr.recognition.DEFAULT_BUILD_PARAMS, 'width':500, 'height':100, 'stn': False},
      5 )
      6 

2 frames
/usr/local/lib/python3.7/dist-packages/keras_ocr/recognition.py in __init__(self, alphabet, weights, build_params)
    337                     tools.download_and_verify(url=weights_dict['weights']['notop']['url'],
    338                                               filename=weights_dict['weights']['notop']['filename'],
--> 339                                               sha256=weights_dict['weights']['notop']['sha256']))
    340 
    341     def get_batch_generator(self, image_generator, batch_size=8, lowercase=False):

/usr/local/lib/python3.7/dist-packages/keras/engine/training.py in load_weights(self, filepath, by_name, skip_mismatch, options)
   2359               f, self.layers, skip_mismatch=skip_mismatch)
   2360         else:
-> 2361           hdf5_format.load_weights_from_hdf5_group(f, self.layers)
   2362 
   2363     # Perform any layer defined finalization of the layer state.

/usr/local/lib/python3.7/dist-packages/keras/saving/hdf5_format.py in load_weights_from_hdf5_group(f, layers)
    689                      'containing ' + str(len(layer_names)) +
    690                      ' layers into a model with ' + str(len(filtered_layers)) +
--> 691                      ' layers.')
    692 
    693   # We batch weight value assignments in a single backend call

ValueError: You are trying to load a weight file containing 16 layers into a model with 15 layers.

Is there a way I can tweak the model architecture to accept my input shape? Or is there a way to resize to (200, 31) right within the architecture?
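
On the resize question, the arithmetic of an aspect-preserving fit into (200, 31) can be sketched as below. This is not a keras-ocr API, just plain integer math: `letterbox_size` is a hypothetical helper, and padding on the right and bottom is an arbitrary choice made for illustration.

```python
def letterbox_size(src_w, src_h, dst_w=200, dst_h=31):
    """Fit (src_w, src_h) inside (dst_w, dst_h) while keeping the aspect ratio.

    Returns (new_w, new_h, pad_right, pad_bottom) in pixels.
    Integer math is used so the result does not depend on float rounding.
    """
    if dst_w * src_h <= dst_h * src_w:
        # The source is relatively wider than the target: width is the limit.
        new_w = dst_w
        new_h = max(1, src_h * dst_w // src_w)
    else:
        # The source is relatively taller than the target: height is the limit.
        new_h = dst_h
        new_w = max(1, src_w * dst_h // src_h)
    return new_w, new_h, dst_w - new_w, dst_h - new_h


if __name__ == "__main__":
    # A 500x100 crop scaled to fit 200x31.
    print(letterbox_size(500, 100))
```

A 500x100 crop ends up 155 pixels wide and 31 high, with 45 pixels of padding on the right. Shrinking the image does nothing about the label length, though, so if the pretrained shape really caps out below 70 characters (which the assertion suggests, but I have not verified the exact limit), the longest lines would still need to be split or trained at a wider input.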

I would rather not train with the weights set to None, even though that seems to be the only thing that works for me.

Thanks!

bilalltf commented 2 years ago

Hey @sharonibejih, it doesn't work for me, even with the weights set to None. Did you find a solution?

bhattarai333 commented 1 year ago

@bilalltf @sharonibejih Have either of you found a solution to this issue?