emedvedev / attention-ocr

A Tensorflow model for text recognition (CNN + seq2seq with visual attention) available as a Python package and compatible with Google Cloud ML Engine.
MIT License

Sliding window's width on image? #144

Open githubpiyush opened 5 years ago

githubpiyush commented 5 years ago

Can we change the width of the sliding windows in the GIFs you have included as output?

emedvedev commented 5 years ago

You’d have to change the model for that, as the windows represent the attention part of the model. You could also probably adjust the drawing method, but then the representation wouldn’t be accurate anymore, so I’m not really sure why you’d want to do that.
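
For context, here is a minimal sketch of what such a drawing step typically looks like. This is a hypothetical helper for illustration, not the package's actual visualization code: it assumes the model produces a 1-D attention distribution over horizontal encoder positions for each decoded character, and that the drawing simply stretches that vector to the image width. Under that assumption, the apparent "window" width comes from the CNN feature-map resolution, not from any drawing parameter.

```python
import numpy as np
from PIL import Image

def draw_attention_overlay(image_path, attention_weights, out_path):
    """Overlay a 1-D attention distribution on a text-line image.

    Hypothetical helper for illustration only: `attention_weights` is
    assumed to hold one weight per horizontal encoder position. The
    vector is stretched to the image width, so the visible window width
    is a consequence of the encoder stride, not a tunable option here.
    """
    img = np.asarray(Image.open(image_path).convert("L"), dtype=np.float32)
    h, w = img.shape

    # Map each image column to its nearest encoder position.
    attn = np.asarray(attention_weights, dtype=np.float32)
    idx = np.minimum(np.arange(w) * len(attn) // w, len(attn) - 1)
    mask = attn[idx]
    mask /= mask.max() + 1e-8

    # Dim unattended columns, keep attended columns bright.
    overlay = img * (0.3 + 0.7 * mask[np.newaxis, :])
    Image.fromarray(np.clip(overlay, 0, 255).astype(np.uint8)).save(out_path)
```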

githubpiyush commented 5 years ago

How is the width decided? Which part of the code determines it? Is it learned from the previous input's attention, or something else? Thanks for the response.

githubpiyush commented 5 years ago

Is the drawing method `visualize_attention` in the model.py file?

I want to increase the width because then the characters in my input image could be recognized properly.

Edit: how can I increase the attention mask size?
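
If the goal is only to make the drawn window look wider, one possible approach is to smooth the attention vector before drawing it. As noted above, this makes the picture less faithful to what the model actually attends to, and it does not change which characters can be recognized. A rough sketch with a hypothetical helper:

```python
import numpy as np

def widen_attention_mask(attention_weights, kernel_width=5):
    """Spread each attention peak over `kernel_width` neighbouring positions.

    Illustration only: this changes nothing about what the model attends
    to or which characters it can recognize; it only widens the drawn
    window. Improving recognition itself would require changing the
    model (e.g. its receptive field), not the drawing method.
    """
    attn = np.asarray(attention_weights, dtype=np.float32)
    kernel = np.ones(kernel_width, dtype=np.float32) / kernel_width
    widened = np.convolve(attn, kernel, mode="same")
    return widened / (widened.max() + 1e-8)
```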