waldo-seg / waldo

image-segmentation and text-localization
Apache License 2.0

Madcat_arabic: Adding function to view ground truth and predicted BBox on original image #74

Closed ChunChiehChang closed 6 years ago

ChunChiehChang commented 6 years ago

An example of the output is shown below. The one drawback is that the scripts now have to save the original image and its dimensions, which makes ./local/process_data.py slower. @aarora8

[image: aaw_arb_20070104 0028_1_ldc0088_orig] https://user-images.githubusercontent.com/28868330/41126852-f0746614-6a76-11e8-8570-191cf2606d12.png

danpovey commented 6 years ago

That looks great! Can you make it optional? Or just run for a few images by default, maybe, and option to do them all?
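One way the "few images by default, option to do them all" behaviour could look is sketched below. The flag name `--num-vis-images`, its default of 5, and the helper `select_images_to_visualize` are all hypothetical; nothing here is taken from the actual ./local/process_data.py.

```python
import argparse


def build_parser():
    # Hypothetical option: how many images get a bounding-box visualization.
    parser = argparse.ArgumentParser(description="Sketch of a visualization switch")
    parser.add_argument(
        "--num-vis-images",
        type=int,
        default=5,
        help="number of images to visualize; 0 disables, -1 means all images",
    )
    return parser


def select_images_to_visualize(image_ids, num_vis_images):
    """Return the subset of image ids for which the overlay should be drawn."""
    image_ids = list(image_ids)
    if num_vis_images < 0:
        # A negative value requests a visualization for every image.
        return image_ids
    # Otherwise take only the first few, which keeps the default run fast.
    return image_ids[:num_vis_images]
```

With this layout the original image and its dimensions would only need to be saved for the selected ids, so the slowdown stays small in the default case.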

Incidentally (and it's great that you did this, or I would not have noticed the issue), I notice that some of the little horizontal lines that mark certain letters (I think these are a shorthand for pairs of dots) fall just outside the bounding boxes of their text lines. This will likely cause misrecognition. I think we should set up the data-processing script to enlarge the bounding boxes slightly before extracting the features. For instance (and this could be an option at the script level), it could add a margin of 0.1 times the height of the text line to the top, bottom and sides of the text line.
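A minimal sketch of that margin, assuming each text line is described by an axis-aligned box in pixel coordinates (the real annotations may be rotated or polygonal). The function name `expand_box` and the parameter `margin_ratio` are made up for illustration.

```python
def expand_box(x_min, y_min, x_max, y_max, margin_ratio=0.1):
    """Grow a box on all four sides by margin_ratio times its height.

    Assumes an axis-aligned box in integer pixel coordinates. The result
    may extend past the image borders; that case is handled separately.
    """
    height = y_max - y_min
    # Round to whole pixels so the result can be used directly for cropping.
    margin = int(round(margin_ratio * height))
    return (x_min - margin, y_min - margin, x_max + margin, y_max + margin)
```

For a line that is 50 pixels tall, the default ratio adds 5 pixels on every side.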

If the resulting boxes were partly outside the image, we'd deal with it by padding or reflection or something.

Dan
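For the case where an enlarged box is partly outside the image, reflection could be sketched as below. This is an assumption-laden illustration in plain Python: the image is a list of pixel rows, the box is axis-aligned with an exclusive upper bound, and `reflect_index` and `crop_with_reflection` are hypothetical helpers, not existing Waldo or Kaldi functions.

```python
def reflect_index(i, size):
    """Map any integer index into [0, size) by mirroring at the borders.

    The edge pixel is not repeated, so with size 4 the indices
    -2, -1, 0, 1, 2, 3, 4, 5 map to 2, 1, 0, 1, 2, 3, 2, 1.
    """
    if size == 1:
        return 0
    period = 2 * (size - 1)
    i = i % period  # Python's modulo is non-negative for a positive period
    return i if i < size else period - i


def crop_with_reflection(image, x_min, y_min, x_max, y_max):
    """Crop rows y_min..y_max-1 and columns x_min..x_max-1 from image.

    Coordinates outside the image are filled with mirrored pixels.
    """
    height = len(image)
    width = len(image[0])
    return [
        [image[reflect_index(y, height)][reflect_index(x, width)]
         for x in range(x_min, x_max)]
        for y in range(y_min, y_max)
    ]
```

Constant padding would be the simpler alternative; reflection avoids introducing a hard edge next to the text.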

On Thu, Jun 7, 2018 at 5:22 PM, ChunChiehChang notifications@github.com wrote:

https://github.com/waldo-seg/waldo/pull/74

Commit Summary

  • saving original dim
  • adding visualization of bbox on original image
  • Merge remote-tracking branch 'upstream/master' into madcat_ar
  • fixed minor bugs

danpovey commented 6 years ago

... although that other stuff, obviously, is a Kaldi issue, not a Waldo issue.

aarora8 commented 6 years ago

It looks very good and informative.

hhadian commented 6 years ago

This looks very nice! I opened the image in a new window and it's much bigger than my screen. It would be better to scale it down to a more reasonable size (e.g. height=1000).

ChunChiehChang commented 6 years ago

The image is currently saved at its original size. I can scale it down to something smaller.
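A rough sketch of the arithmetic involved, assuming the suggested target of height=1000 and axis-aligned boxes in pixel coordinates. The helpers `scale_to_height` and `scale_box` are hypothetical names. The point it illustrates is that the ground-truth and predicted boxes need the same scale factor as the image, otherwise the overlay no longer lines up.

```python
def scale_to_height(width, height, target_height=1000):
    """Return (new_width, new_height, scale) preserving the aspect ratio.

    Images that are already at most target_height tall are left unchanged.
    """
    if height <= target_height:
        return width, height, 1.0
    scale = target_height / height
    new_width = max(1, round(width * scale))
    return new_width, target_height, scale


def scale_box(box, scale):
    """Scale (x_min, y_min, x_max, y_max) by the same factor as the image."""
    return tuple(round(v * scale) for v in box)
```

The returned `scale` can be passed to whatever image library does the actual resizing and then reused for every box drawn on that image.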

danpovey commented 6 years ago

Let us know if you think it's ready to merge.