A few things to note - Githubissues

sayakpaul / Handwriting-Recognizer-in-Keras

This project shows how to build a simple handwriting recognizer in Keras with the IAM dataset.

Apache License 2.0

13 stars, 2 forks

A few things to note #1

Closed by AakashKumarNain 3 years ago

AakashKumarNain commented 3 years ago

1. Let's drop all the punctuation marks for now. Punctuation marks like `,` or `.` are hard for the model to differentiate. We need to come up with a better solution for them.
2. We don't need `cv2` or `imgaug`, since everything they are used for in the code can be implemented easily with TF image ops. That removes the external dependency on two libraries.
3. Instead of a tokenizer, we can use the `TextVectorization` layer.
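
To make the last point concrete, here is a minimal sketch of character-level label encoding with `TextVectorization`. The label strings and `max_len` are made up for illustration (the real labels would come from the IAM dataset), and `split="character"` assumes a reasonably recent TensorFlow/Keras release.

```python
import tensorflow as tf

# Hypothetical transcription labels, used only for this example.
labels = ["cab", "abba"]
max_len = 6  # assumed maximum label length

vectorizer = tf.keras.layers.TextVectorization(
    standardize=None,                # keep case and characters untouched
    split="character",               # one token per character
    output_mode="int",
    output_sequence_length=max_len,  # pad with 0 (or truncate) to a fixed length
)
vectorizer.adapt(tf.constant(labels))  # build the character vocabulary

encoded = vectorizer(tf.constant(labels))
print(vectorizer.get_vocabulary())  # index 0 is padding, index 1 is the OOV token
print(encoded.shape)                # (2, 6)
```

Because index 0 is reserved for padding and index 1 for out-of-vocabulary characters, the actual characters start at index 2.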

sayakpaul commented 3 years ago

> Let's drop all the punctuation marks for now. Punctuation marks like `,` or `.` are hard for the model to differentiate. We need to come up with a better solution for them.

Agreed. But given how rarely such characters occur in the current dataset, I didn't see any problems. If weird problems start cropping up, we can always drop them.
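
If dropping them does become necessary, one possible way to do it inside a TF pipeline is `tf.strings.regex_replace()`. This is only a sketch: `drop_punctuation` is a hypothetical helper name, and the sample strings are not from the dataset.

```python
import tensorflow as tf

def drop_punctuation(label):
    # The POSIX class [[:punct:]] matches ASCII punctuation characters.
    return tf.strings.regex_replace(label, r"[[:punct:]]", "")

cleaned = drop_punctuation(tf.constant(["Hello, world.", "it's"]))
print(cleaned.numpy())  # [b'Hello world' b'its']
```

Note that this also strips apostrophes, so the pattern would need narrowing if only `,` and `.` should go.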

I can start working on the second one. Would you like to take the third one?

AakashKumarNain commented 3 years ago

Sure, I will take up the third one.

sayakpaul commented 3 years ago

Cool. I will drop a PR when I am done and ask for your review.

sayakpaul commented 3 years ago

@AakashKumarNain I tried porting the non-TF ops to pure TF, but I ran into a problem in `distortion_free_resize()` since item assignment on tensors is not supported. For now, I have ported most of the ops apart from that one (which is wrapped in `tf.py_function()` for now).
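
For readers unfamiliar with the workaround, the sketch below shows the general pattern of wrapping NumPy code that relies on slice assignment in `tf.py_function()`. It is not the code from the repository: `pad_to_canvas_np`, `pad_to_canvas_tf`, and the 32x128 target size are assumptions made for illustration.

```python
import numpy as np
import tensorflow as tf

def pad_to_canvas_np(image, target_h, target_w):
    """NumPy helper: copy `image` into the top-left corner of a zero canvas."""
    image = np.asarray(image)
    h, w, c = image.shape
    canvas = np.zeros((target_h, target_w, c), dtype=image.dtype)
    canvas[:h, :w, :] = image  # slice assignment works on NumPy arrays...
    return canvas

def pad_to_canvas_tf(image, target_h=32, target_w=128):
    # ...but not on tf.Tensor objects, so the NumPy code is wrapped instead.
    padded = tf.py_function(
        func=lambda img: pad_to_canvas_np(img.numpy(), target_h, target_w),
        inp=[image],
        Tout=image.dtype,
    )
    # py_function drops static shape information; restore it for batching.
    padded.set_shape([target_h, target_w, image.shape[-1]])
    return padded

small = tf.ones([20, 100, 1], dtype=tf.float32)
padded = pad_to_canvas_tf(small)
print(padded.shape)  # (32, 128, 1)
```

The downside of this pattern is that the wrapped function runs as ordinary Python, which is the motivation for replacing it with pure TF ops.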

Here's the Colab Gist of how the preprocessing pipeline is looking currently.

Here is what the distortion-free resizing looks like:

With vanilla `tf.image.resize()` (followed by a `tf.transpose()`):

Let me know your thoughts.

AakashKumarNain commented 3 years ago

Cool. Let me try to do that without tf.py_function()

AakashKumarNain commented 3 years ago

@sayakpaul one question for you: This is just resizing while keeping the aspect ratio the same. Can't you use the resize op directly?

sayakpaul commented 3 years ago

If we set `preserve_aspect_ratio=True` there, we no longer get uniform image sizes, and mini-batching needs uniform sizes. So `distortion_free_resize()` first resizes the image while maintaining the aspect ratio with respect to its largest dimension, and the result is then copied into the desired shape.
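
One way to get both properties with pure TF ops is to resize with `preserve_aspect_ratio=True` and then pad to the target size with `tf.pad()`, which avoids tensor assignment altogether. The sketch below assumes a `(width, height)` target of `(128, 32)` and centered zero padding; it is an illustration, not necessarily what either notebook in this thread does.

```python
import tensorflow as tf

def distortion_free_resize(image, img_size=(128, 32)):
    """Resize without distortion, then zero-pad to a fixed (h, w)."""
    w, h = img_size
    # Scale so the image fits inside (h, w) with its aspect ratio intact.
    image = tf.image.resize(image, size=(h, w), preserve_aspect_ratio=True)

    # Work out how much padding each side needs to reach (h, w).
    pad_h = h - tf.shape(image)[0]
    pad_w = w - tf.shape(image)[1]
    top = pad_h // 2
    left = pad_w // 2
    paddings = [[top, pad_h - top], [left, pad_w - left], [0, 0]]
    return tf.pad(image, paddings)

wide = distortion_free_resize(tf.ones([50, 400, 1]))
square = distortion_free_resize(tf.ones([100, 100, 1]))
print(wide.shape, square.shape)  # (32, 128, 1) (32, 128, 1)
```

Since every output has the same shape, the results can be batched directly. `tf.image.resize_with_pad()` offers similar resize-then-pad behavior out of the box.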

AakashKumarNain commented 3 years ago

This should suffice for your use case: https://colab.research.google.com/drive/1XVmtHNoY4__v786ynvPHmexj3-D-C6Bb?usp=sharing

sayakpaul commented 3 years ago

Thanks @AakashKumarNain. Looks much better than what I was doing. Do you want to pull this in while incorporating `TextVectorization`?

AakashKumarNain commented 3 years ago

@sayakpaul here is the complete input pipeline. Lemme know if this is good enough for your use case

https://colab.research.google.com/drive/1PV33aw5fI1CTZEJ6-jyYqv3JAKtYTEAY?usp=sharing

sayakpaul commented 3 years ago

@AakashKumarNain https://github.com/sayakpaul/Handwriting-Recognizer-in-Keras/pull/2
