AakashKumarNain closed this issue 3 years ago
Let's drop all the punctuation marks for now. Punctuation marks like "," or "." are hard for the model to differentiate. We need to come up with a better solution for them.
Agreed. But given the number of occurrences of such characters in the current dataset, I didn't see any problems. If weird problems start cropping up, we can always drop them.
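If the punctuation marks do end up being dropped, a minimal sketch of that label cleanup in plain Python could look like the following. The function name drop_punctuation is hypothetical and not taken from the repository, and it assumes the labels are plain Python strings and that ASCII punctuation (string.punctuation) is the set to remove.

```python
import string

# Translation table that deletes every ASCII punctuation character.
# Assumption: the labels only contain ASCII punctuation.
_PUNCT_TABLE = str.maketrans("", "", string.punctuation)


def drop_punctuation(label: str) -> str:
    """Return the label with all ASCII punctuation characters removed."""
    return label.translate(_PUNCT_TABLE)


if __name__ == "__main__":
    for raw in ["Hello, world.", "it's", ","]:
        print(repr(raw), "->", repr(drop_punctuation(raw)))
```

Note that a label consisting only of punctuation becomes an empty string, so such samples would need to be filtered out together with their images.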
I can start working on the second one. Would you like to take the third one?
Sure, I will take up the third one.
Cool. I will drop a PR when I am done and ask for your review.
@AakashKumarNain I tried porting the non-TF ops to pure TF, but I ran into a problem in distortion_free_resize() since Tensor assignment is not supported. For now, I have ported most of the things apart from that one, which is wrapped in tf.py_function() for the time being.
Here's the Colab Gist of how the preprocessing pipeline is looking currently.
tf.image.resize() (followed by a tf.transpose()):
Let me know your thoughts.
Cool. Let me try to do that without tf.py_function()
@sayakpaul one question for you: this is just resizing while keeping the aspect ratio the same. Can't you use the resize op directly?
If we set preserve_aspect_ratio=True there, then we no longer have uniform image sizes, which mini-batching does not support. So distortion_free_resize() first resizes the input image while maintaining its aspect ratio along its largest dimension, and the result is then copied into an array of the desired shape.
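The arithmetic behind this kind of distortion-free resize can be sketched in plain Python, without TensorFlow. The helper below is hypothetical (resize_and_pad_plan is not a function from the repository): it only computes the aspect-preserving size and the padding needed to reach a fixed target shape. In a TF pipeline these numbers would correspond to what tf.image.resize() with preserve_aspect_ratio=True produces and what a subsequent tf.pad() would need, but that mapping is an assumption, not the notebook's actual code.

```python
def resize_and_pad_plan(height, width, target_h, target_w):
    """Plan an aspect-preserving resize followed by padding.

    Returns ((new_h, new_w), ((pad_top, pad_bottom), (pad_left, pad_right))).
    """
    # Use the smaller scale so the resized image fits inside the target.
    scale = min(target_h / height, target_w / width)
    new_h = max(1, min(target_h, round(height * scale)))
    new_w = max(1, min(target_w, round(width * scale)))

    # Whatever is left over is filled with padding, split as evenly as possible.
    pad_h = target_h - new_h
    pad_w = target_w - new_w
    pad_top = pad_h // 2
    pad_bottom = pad_h - pad_top
    pad_left = pad_w // 2
    pad_right = pad_w - pad_left
    return (new_h, new_w), ((pad_top, pad_bottom), (pad_left, pad_right))


if __name__ == "__main__":
    # A wide 50x400 line image squeezed into a 32x128 canvas.
    print(resize_and_pad_plan(50, 400, 32, 128))
    # A square 100x100 image in the same canvas.
    print(resize_and_pad_plan(100, 100, 32, 128))
```

Because every image ends up with the same target shape after padding, the results can be stacked into a mini-batch, and padding avoids the in-place Tensor assignment mentioned earlier.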
This should suffice for your use case: https://colab.research.google.com/drive/1XVmtHNoY4__v786ynvPHmexj3-D-C6Bb?usp=sharing
Thanks @AakashKumarNain. Looks much better than what I was doing. Do you want to pull this in while incorporating TextVectorization?
@sayakpaul here is the complete input pipeline. Lemme know if this is good enough for your use case
https://colab.research.google.com/drive/1PV33aw5fI1CTZEJ6-jyYqv3JAKtYTEAY?usp=sharing
@AakashKumarNain https://github.com/sayakpaul/Handwriting-Recognizer-in-Keras/pull/2
"," or "." are hard to differentiate for the model. We need to come up with a better solution for them.
cv2 or imgaug, as the things that are implemented in the code can be implemented using TF image ops easily. It removes the external dependency on two libraries.
TextVectorization layer.
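For the TextVectorization item, the underlying idea is to map each label's characters to integer ids and pad every label to a fixed length. The sketch below shows that idea in plain Python. It is a stand-in for illustration, not the Keras TextVectorization layer, and the names build_vocab, vectorize, PAD_ID and OOV_ID are hypothetical.

```python
PAD_ID = 0  # assumed id for padding positions
OOV_ID = 1  # assumed id for characters not seen when building the vocabulary


def build_vocab(labels):
    """Map every character seen in the labels to an integer id (starting at 2)."""
    chars = sorted(set("".join(labels)))
    return {ch: idx + 2 for idx, ch in enumerate(chars)}


def vectorize(label, vocab, max_len):
    """Convert a label to a fixed-length list of ids, truncating or padding."""
    ids = [vocab.get(ch, OOV_ID) for ch in label[:max_len]]
    return ids + [PAD_ID] * (max_len - len(ids))


if __name__ == "__main__":
    vocab = build_vocab(["ab", "ba"])
    print(vocab)
    print(vectorize("abz", vocab, max_len=5))
```

Fixed-length integer sequences like these can be batched alongside the padded images.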