Closed by dcarpenter31 3 years ago
TensorFlow has an existing OCR model that is documented on this page. It does mention that it isn't the best model for OCR in the wild/in low light but maybe we can use this as a springboard for a different model.
We might also want to look into the OpenCV text detection showcased here.
OpenCV’s EAST text detector is a deep learning model, based on a novel architecture and training pattern. It is capable of (1) running at near real-time at 13 FPS on 720p images and (2) obtains state-of-the-art text detection accuracy.
This seems like it could be a good solution to this issue, but more research is needed. This is a very complex topic, after all.
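For anyone picking this up, here is a minimal sketch of how the EAST detector might be called through OpenCV's `dnn` module in Python. It assumes the commonly distributed frozen EAST graph; `model_path`, `image_path`, and both function names are hypothetical, and the two output layer names are the ones that graph is usually published with, so check them against whatever model file we end up using. It only returns the raw score and geometry maps; decoding them into boxes and applying non-maximum suppression is left out.

```python
def east_input_size(width: int, height: int) -> tuple[int, int]:
    """Round dimensions down to a multiple of 32 (minimum 32), as EAST expects."""
    return max(32, width // 32 * 32), max(32, height // 32 * 32)


def detect_text_maps(image_path: str, model_path: str):
    """Run one EAST forward pass and return the raw (scores, geometry) maps."""
    # Imported lazily so the size helper above works without OpenCV installed.
    import cv2

    image = cv2.imread(image_path)
    if image is None:
        raise FileNotFoundError(image_path)

    height, width = image.shape[:2]
    new_w, new_h = east_input_size(width, height)

    # Assumption: model_path points at a frozen EAST TensorFlow graph (.pb).
    net = cv2.dnn.readNet(model_path)

    # blobFromImage resizes to (new_w, new_h) and subtracts the channel means.
    blob = cv2.dnn.blobFromImage(
        image, 1.0, (new_w, new_h), (123.68, 116.78, 103.94),
        swapRB=True, crop=False,
    )
    net.setInput(blob)

    # Assumption: these layer names match the frozen graph being loaded.
    scores, geometry = net.forward(
        ["feature_fusion/Conv_7/Sigmoid", "feature_fusion/concat_3"]
    )
    return scores, geometry
```

A 720p frame would be fed in at `east_input_size(1280, 720)`, which works out to 1280x704, since EAST needs both dimensions to be multiples of 32.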
I found that page too. There's a catch:
The models are not general enough for OCR in the wild (say, random images taken by a smartphone camera in a low lighting condition).
This might be usable if we gathered some training data and retrained it, but then we might have to label the training data, and that's tedious.
It does mention that it isn't the best model for OCR in the wild/in low light but maybe we can use this as a springboard for a different model.
I did see that too, but we would face the same issue if we were to create our own model anyway. I was more intrigued by the OpenCV text detection because it can run in near real time. Maybe we can use that instead of TensorFlow.
EDIT: This article (it is on medium so there might be a paywall) has some datasets that contain data for text recognition.
I have a feeling that it's gonna be tough to get OpenCV into React Native. I'm gonna go look for potential datasets; there's probably something freely available.
I just edited my comment above to add 2 datasets (>1800 images) that we may be able to use.
That first one with Cornell looks very promising.
Yeah, I'm downloading the COCO dataset right now. mscoco.org seems to be down, but I found a mirror. Gonna look into hosting the dataset somewhere easier to access/more reliable.
Edit: Actually, looks like they moved their website to https://cocodataset.org/
Some new resources: https://stackoverflow.com/questions/50344844/tensorflow-js-for-ocr https://github.com/tensorflow/models/tree/master/research/attention_ocr
Attention OCR looks customizable so we can use the COCO dataset that was referenced above.
Since we decided to abandon doing this on device, can we close this?
Closed since we have moved ML processing to a remote server
A TensorFlow model for identifying text in photos will need to be created to be utilized by the app.