mrousavy / react-native-vision-camera

📸 A powerful, high-performance React Native Camera library.
https://react-native-vision-camera.com
MIT License

🐛 Issues with MlKit TextRecognition on iOS #1090

Closed sumi-svmx closed 11 months ago

sumi-svmx commented 2 years ago

What were you trying to do?

I tried this with vision-camera-ocr using GoogleMLKit/TextRecognition 2.2.0.

It does not seem to recognize text when given the frame buffer directly. I made the following changes as a workaround (not ideal). I'm wondering whether there could be a proper fix, since the issue seems to come from the format of the frame buffer this library produces.

(iOS version 14.6)

Reproducible Code

let imageBuffer = CMSampleBufferGetImageBuffer(frame.buffer)!
let ciimage = CIImage(cvPixelBuffer: imageBuffer)
let context = CIContext(options: nil)
let cgImage = context.createCGImage(ciimage, from: ciimage.extent)!
let image = UIImage(cgImage: cgImage)
let visionImage = VisionImage(image: image)
visionImage.orientation = .up
...
result = try TextRecognizer.textRecognizer()
  .results(in: visionImage)

What happened instead?

Does not recognize text, returns empty.

Relevant log output

VisionCameraOcrExample[1351:9748767] Initialized TensorFlow Lite runtime.
INFO: Initialized TensorFlow Lite runtime.
VisionCameraOcrExample[1351:9748771] [native] VisionCamera.captureOutput(_:didOutput:from:): The Frame Processor took so long to execute that a frame was dropped.
VisionCameraOcrExample[1351:9748765] [native] VisionCamera.invokeOnFrameProcessorPerformanceSuggestionAvailable(currentFps:suggestedFps:): suggestedFps 0.0...
VisionCameraOcrExample[1351:9748765] [native] VisionCamera.invokeOnFrameProcessorPerformanceSuggestionAvailable(currentFps:suggestedFps:): currentFps 5.0...
VisionCameraOcrExample[1351:9748765] [native] VisionCamera.invokeOnFrameProcessorPerformanceSuggestionAvailable(currentFps:suggestedFps:): Frame Processor Performance Suggestion available!
VisionCameraOcrExample[1351:9749054] [native] VisionCamera.captureOutput(_:didOutput:from:): The Frame Processor took so long to execute that a frame was dropped.
VisionCameraOcrExample[1351:9749054] [native] VisionCamera.captureOutput(_:didOutput:from:): The Frame Processor took so long to execute that a frame was dropped.
VisionCameraOcrExample[1351:9749054] [native] VisionCamera.captureOutput(_:didOutput:from:): The Frame Processor took so long to execute that a frame was dropped.
VisionCameraOcrExample[1351:9749054] [native] VisionCamera.captureOutput(_:didOutput:from:): The Frame Processor took so long to execute that a frame was dropped.
VisionCameraOcrExample[1351:9749054]

Device

iPhone11 (iOS 14.6)

VisionCamera Version

2.13.3

Additional information

mrousavy commented 2 years ago
  1. Does it work if you rotate the phone in a different direction? If yes, it's a bug from me.
  2. Does it work if you choose a format that has a different pixel format? Remember that the TextRecognizer you're using might not be compatible with all formats (see https://github.com/mrousavy/react-native-vision-camera/blob/fb2156ec397f573fc7beac56de6b8975f5831500/src/CameraDevice.ts#L175-L180). If yes, it's not VisionCamera's fault.
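
To check point 2, it can help to log which pixel format the frame buffer actually carries. The following is a minimal sketch, not part of VisionCamera or ML Kit: `fourCCString` and `pixelFormatName(of:)` are hypothetical helper names, and it assumes the plugin has access to `frame.buffer` as in the snippets in this thread.

```swift
#if canImport(CoreMedia)
import CoreMedia
import CoreVideo
#endif

/// Converts a FourCC code (such as a CoreVideo pixel format type) into a readable string.
/// Codes that are not four printable ASCII characters are returned as their decimal value.
func fourCCString(_ code: UInt32) -> String {
  let bytes = [24, 16, 8, 0].map { UInt8((code >> UInt32($0)) & 0xFF) }
  guard bytes.allSatisfy({ $0 >= 0x20 && $0 < 0x7F }) else { return String(code) }
  return String(decoding: bytes, as: UTF8.self)
}

#if canImport(CoreMedia)
/// Returns the pixel format of a sample buffer, or nil if it has no image buffer attached.
func pixelFormatName(of sampleBuffer: CMSampleBuffer) -> String? {
  guard let imageBuffer = CMSampleBufferGetImageBuffer(sampleBuffer) else { return nil }
  return fourCCString(CVPixelBufferGetPixelFormatType(imageBuffer))
}
#endif
```

Inside a frame processor plugin this could be called as `print(pixelFormatName(of: frame.buffer) ?? "no image buffer")`. A result such as `420v` or `420f` would indicate a bi-planar YUV buffer, while `BGRA` would indicate a 32-bit BGRA buffer.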
sumi-svmx commented 2 years ago

@mrousavy Thanks for the response

  1. No, it doesn't work when rotated in a different direction either.
  2. I tried the vga-640x480 preset with MLKit/TextRecognition 2.2.0 and it doesn't work. It seems to work with the vga-640x480 preset in MLKit/TextRecognition 3.0.0, but the camera quality is quite low and a lot of text gets recognized incorrectly. So as a workaround I used this instead, which isn't ideal:
let imageBuffer = CMSampleBufferGetImageBuffer(frame.buffer)!
let ciimage = CIImage(cvPixelBuffer: imageBuffer)
let context = CIContext(options: nil)
let cgImage = context.createCGImage(ciimage, from: ciimage.extent)!
let image = UIImage(cgImage: cgImage)
let visionImage = VisionImage(image: image)
visionImage.orientation = .up
...
result = try TextRecognizer.textRecognizer()
  .results(in: visionImage)
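
The workaround above creates a new `CIContext` and force-unwraps on every frame. Given the dropped-frame warnings in the log, one possible tweak is to reuse a single context and return early when a conversion step fails. This is only a sketch of that idea, not a confirmed fix for the recognition problem; `sharedContext` and `makeUIImage(from:)` are hypothetical names introduced here.

```swift
import CoreImage
import CoreMedia
import UIKit

// Created once and reused for all frames instead of building a new context per frame.
private let sharedContext = CIContext(options: nil)

/// Renders the sample buffer's pixel buffer into a UIImage.
/// Returns nil if the buffer has no image data or the render fails.
func makeUIImage(from sampleBuffer: CMSampleBuffer) -> UIImage? {
  guard let imageBuffer = CMSampleBufferGetImageBuffer(sampleBuffer) else { return nil }
  let ciImage = CIImage(cvPixelBuffer: imageBuffer)
  guard let cgImage = sharedContext.createCGImage(ciImage, from: ciImage.extent) else { return nil }
  return UIImage(cgImage: cgImage)
}
```

The resulting image can then be passed to `VisionImage(image:)` exactly as in the snippet above. The per-frame render to a `CGImage` still costs time, so this may reduce the overhead but is unlikely to remove it.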
tadjik1 commented 2 years ago

@sumi-svmx hey, I have the same issue (but with the object detector). Did you solve yours? My code looks very similar, but I create the image object slightly differently. What is the reason for using CIImage as a proxy?

@objc
public static func callback(_ frame: Frame!, withArgs _: [Any]!) -> Any! {
  let image = VisionImage(buffer: frame.buffer)
  image.orientation = .upMirrored

  do {
    let objects : [Object] = try detector.results(in: image)
    if (objects.isEmpty) {
      print("can not find object")
      return nil
    }

    ...
jbuijgers commented 2 years ago

Can confirm this solved my problem with getting MLKit/TextRecognition 3.1.0 to work. Thanks @sumi-svmx!

mrousavy commented 1 year ago

See https://github.com/tensorflow/tfjs/issues/7773 :)

mrousavy commented 11 months ago

Hey - I created a new fast JSI library for this that is written in C++ and uses GPU acceleration: https://github.com/mrousavy/react-native-fast-tflite 🥳

nqam1904 commented 5 months ago

Has this issue been resolved yet? :-?