googlesamples / mlkit

A collection of sample apps to demonstrate how to use Google's ML Kit APIs on Android and iOS
Apache License 2.0

[Bug report] corner points of text elements are erroneous because they lie far away from the text element's bounding box #671

Open KnollFrank opened 1 year ago

KnollFrank commented 1 year ago

Describe the bug

com.google.mlkit.vision.text.Text.Element.getCornerPoints() returns corner points that bear no relation to the bounding box returned by com.google.mlkit.vision.text.Text.Element.getBoundingBox().

To Reproduce

Steps to reproduce the behavior in the sample app:

  1. Open the MLKit-Vision app.
  2. Click on "Run the ML Kit quickstart written in Java"
  3. Select "CameraXLivePreviewActivity"
  4. Select "Text Recognition Latin"
  5. Scan some text
  6. Look at the Logcat output:
    Element text is: Der
    TextGraphic com.google.mlkit.vision.demo D Element boundingbox is: Rect(56, 28 - 123, 68)  
    TextGraphic com.google.mlkit.vision.demo D Element cornerpoint is: [Point(28, 421), Point(33, 357), Point(68, 360), Point(63, 424)]

Expected behavior

The corner points should lie on or near the bounding box. Instead, the corner points [Point(28, 421), Point(33, 357), Point(68, 360), Point(63, 424)] lie far outside the bounding box Rect(56, 28 - 123, 68). Swapping the x and y coordinates of the corner points seems to be part of the fix, but swapping alone would not solve the problem.
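The mismatch can be checked mechanically. The following is a minimal sketch in plain Java, assuming int arrays as stand-ins for android.graphics.Rect and android.graphics.Point so that it runs off-device. The class name CornerPointCheck, its helper methods and the 5-pixel tolerance are hypothetical and not part of ML Kit.

```java
import java.util.Arrays;

/** Plain-Java stand-ins: a box is {left, top, right, bottom}, a point is {x, y}. */
class CornerPointCheck {

    /** Returns the smallest axis-aligned box {left, top, right, bottom} around the corners. */
    static int[] enclosingBox(int[][] corners) {
        int left = Integer.MAX_VALUE, top = Integer.MAX_VALUE;
        int right = Integer.MIN_VALUE, bottom = Integer.MIN_VALUE;
        for (int[] p : corners) {
            left = Math.min(left, p[0]);
            top = Math.min(top, p[1]);
            right = Math.max(right, p[0]);
            bottom = Math.max(bottom, p[1]);
        }
        return new int[] {left, top, right, bottom};
    }

    /** True if every corner lies within `tolerance` pixels of the box. */
    static boolean cornersNearBox(int[][] corners, int[] box, int tolerance) {
        for (int[] p : corners) {
            boolean xOk = p[0] >= box[0] - tolerance && p[0] <= box[2] + tolerance;
            boolean yOk = p[1] >= box[1] - tolerance && p[1] <= box[3] + tolerance;
            if (!xOk || !yOk) {
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        // Values copied from the Logcat output for the element "Der".
        int[] box = {56, 28, 123, 68};
        int[][] corners = {{28, 421}, {33, 357}, {68, 360}, {63, 424}};

        System.out.println("Box around corners: " + Arrays.toString(enclosingBox(corners)));
        System.out.println("Corners near bounding box: " + cornersNearBox(corners, box, 5));
    }
}
```

For the values above, the box around the corner points spans x 28 to 68 and y 357 to 424, so the check reports that the corners are not near the bounding box.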

SDK Info:

vKapilCellid commented 11 months ago

Facing the same issue, any updates on this?

stunningdimension22 commented 11 months ago

Can you try with the latest version and also share an image that can reproduce?

KnollFrank commented 11 months ago

The problem remains with com.google.android.gms:play-services-mlkit-text-recognition:19.0.0. You may use this image: all_insects

vKapilCellid commented 11 months ago

I can also confirm that the problem remains in com.google.android.gms:play-services-mlkit-text-recognition:19.0.0. For example, for this image:

image

the output is:

Shift
Bounding Box - 
352 178 370 227
Corner Points - 
[Point(178, 110), Point(227, 110), Point(227, 128), Point(178, 128)]
PgDn
Bounding Box - 
236 173 256 222
Corner Points - 
[Point(173, 224), Point(222, 224), Point(222, 244), Point(173, 244)]
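
One observation from the numbers posted so far, offered as a hypothesis and not a confirmed cause: for "Der", "Shift" and "PgDn", mapping each corner point (x, y) to (480 - y, x) and taking the axis-aligned box around the result gives exactly the reported bounding box. That would be consistent with the corner points being expressed in a frame rotated by 90 degrees relative to the bounding box. The value 480 is inferred purely from the arithmetic (for example 56 + 424 = 480 and 352 + 128 = 480) and does not appear anywhere in the thread. The sketch also assumes that the four numbers printed under "Bounding Box" are left, top, right, bottom. The class RotationHypothesis and its methods are hypothetical names.

```java
import java.util.Arrays;

class RotationHypothesis {

    /** Maps a corner (x, y) to (extent - y, x): an assumed 90-degree rotation. */
    static int[] mapCorner(int[] corner, int extent) {
        return new int[] {extent - corner[1], corner[0]};
    }

    /** Axis-aligned box {left, top, right, bottom} around the mapped corners. */
    static int[] mappedBox(int[][] corners, int extent) {
        int left = Integer.MAX_VALUE, top = Integer.MAX_VALUE;
        int right = Integer.MIN_VALUE, bottom = Integer.MIN_VALUE;
        for (int[] corner : corners) {
            int[] p = mapCorner(corner, extent);
            left = Math.min(left, p[0]);
            top = Math.min(top, p[1]);
            right = Math.max(right, p[0]);
            bottom = Math.max(bottom, p[1]);
        }
        return new int[] {left, top, right, bottom};
    }

    public static void main(String[] args) {
        int extent = 480; // Assumption: inferred from the posted numbers only.

        int[][] der = {{28, 421}, {33, 357}, {68, 360}, {63, 424}};
        int[][] shift = {{178, 110}, {227, 110}, {227, 128}, {178, 128}};
        int[][] pgDn = {{173, 224}, {222, 224}, {222, 244}, {173, 244}};

        // Reported bounding boxes: 56 28 123 68, 352 178 370 227, 236 173 256 222.
        System.out.println("Der:   " + Arrays.toString(mappedBox(der, extent)));
        System.out.println("Shift: " + Arrays.toString(mappedBox(shift, extent)));
        System.out.println("PgDn:  " + Arrays.toString(mappedBox(pgDn, extent)));
    }
}
```

If this pattern holds for other samples, it would also fit the original remark that swapping x and y is needed but is not sufficient on its own, since one axis is additionally mirrored.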

ArchDevil commented 9 months ago

This also applies to Symbol and its getBoundingBox() and getCornerPoints() methods. The reported bounding box and corner points occasionally lie outside the bounds of their encompassing Line, Block or even InputImage. (Visual inspection shows that the results of getBoundingBox() and getCornerPoints() for Line are correct.)

Sauvio commented 6 months ago

ATEST Resolution: 220x112

The issue also exists in com.google.mlkit:text-recognition-chinese:16.0.0, where the bounding boxes and corner points may exceed the bounds of their encompassing Line, Block, or even InputImage, especially when testing with smaller images such as the example above. Logs:

TextBlock text is: 10:35.
TextBlock boundingbox is: Rect(31, 19 - 226, 95) 
TextBlock cornerpoint is: [Point(31, 22), Point(226, 19), Point(226, 92), Point(31, 95)]
Line boundingbox is: Rect(31, 19 - 226, 95)
Line cornerpoint is: [Point(31, 22), Point(226, 19), Point(226, 92), Point(31, 95)]
Element boundingbox is: Rect(31, 19 - 226, 95)
Element cornerpoint is: [Point(31, 22), Point(226, 19), Point(226, 92), Point(31, 95)]
Symbol text is: 1
Symbol boundingbox is: Rect(31, 22 - 70, 95)
Symbol cornerpoint is: [Point(31, 23), Point(70, 22), Point(70, 94), Point(31, 95)]
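
The out-of-bounds case in these logs can be verified with a small check. This is a minimal sketch in plain Java, assuming that "Resolution: 220x112" means an image 220 pixels wide and 112 pixels high with the origin at the top left, and using an int array {left, top, right, bottom} in place of android.graphics.Rect. The class ImageBoundsCheck and its methods are hypothetical names.

```java
import java.util.Arrays;

class ImageBoundsCheck {

    /**
     * Returns how many pixels the box {left, top, right, bottom} extends past
     * the left, top, right and bottom image edges (0 where it stays inside).
     */
    static int[] overflow(int[] box, int imageWidth, int imageHeight) {
        return new int[] {
            Math.max(0, -box[0]),
            Math.max(0, -box[1]),
            Math.max(0, box[2] - imageWidth),
            Math.max(0, box[3] - imageHeight)
        };
    }

    /** True if the box lies entirely inside the image. */
    static boolean insideImage(int[] box, int imageWidth, int imageHeight) {
        for (int excess : overflow(box, imageWidth, imageHeight)) {
            if (excess > 0) {
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        int width = 220, height = 112; // Assumed reading of "Resolution: 220x112".

        int[] blockBox = {31, 19, 226, 95};  // TextBlock, Line and Element share this box.
        int[] symbolBox = {31, 22, 70, 95};  // Symbol "1".

        System.out.println("Block overflow:  " + Arrays.toString(overflow(blockBox, width, height)));
        System.out.println("Symbol inside:   " + insideImage(symbolBox, width, height));
    }
}
```

Under that reading, the shared TextBlock, Line and Element box extends 6 pixels past the right edge of the image (226 versus a width of 220), while the box of the symbol "1" stays inside.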