konstantint / PassportEye

Extraction of machine-readable zone information from passports, visas and id-cards via OCR
MIT License
372 stars 109 forks

Realtime Video Passport recognition #7

Closed: isantolin closed this issue 5 years ago

isantolin commented 6 years ago

@konstantint do you have a code example for realtime video?

konstantint commented 6 years ago

This particular implementation is not incremental and is too slow for realtime video. As mentioned in the readme, there are examples where it takes tens of seconds to parse the document from the picture, because it tries several rescalings and rotations, searching for OCR-able text in a somewhat brute-force manner.
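To make the cost of that kind of search concrete, here is a rough sketch of a brute-force loop over rescalings and rotations. It is not PassportEye's actual pipeline: `brute_force_search`, `fake_ocr`, the placeholder text and the scale and angle lists are all hypothetical names and values chosen for illustration. The only point is that the worst-case latency is the number of combinations multiplied by the cost of a single OCR attempt.

```python
import itertools
from typing import Callable, Iterable, Optional, Tuple


def brute_force_search(
    try_ocr: Callable[[float, float], Optional[str]],
    scales: Iterable[float],
    angles: Iterable[float],
) -> Tuple[Optional[str], int]:
    """Try every (scale, angle) pair until try_ocr returns some text.

    Returns the recognized text (or None if nothing worked) together
    with the number of OCR attempts that were made.
    """
    attempts = 0
    for scale, angle in itertools.product(scales, angles):
        attempts += 1
        text = try_ocr(scale, angle)
        if text is not None:
            return text, attempts
    return None, attempts


# Hypothetical stand-in for "rescale, rotate, run OCR".
# In this toy setup only one combination yields readable text.
def fake_ocr(scale: float, angle: float) -> Optional[str]:
    return "PLACEHOLDER_MRZ" if (scale, angle) == (2.0, 180.0) else None


text, attempts = brute_force_search(
    fake_ocr,
    scales=[0.5, 1.0, 2.0],
    angles=[0.0, 90.0, 180.0, 270.0],
)
```

In this toy run the search needs 11 of the 12 possible attempts before it succeeds. If each attempt involves a real OCR call, a handful of scales and angles is already enough to push the total well past the time available for one video frame.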

For realtime you'd need to get rid of this brute-force step and find a method to recover a well-aligned MRZ from the image in one pass. One possibility is to train an object tracking model (e.g. something convnet-based); however, for that you need training data (and ample time for experimentation).

Even then it might not be perfectly realtime, because a tesseract invocation probably takes more than 40 ms, which is the whole per-frame budget at 25 frames per second. To overcome that, you'd either need to somehow cache previously computed values or come up with an "incremental OCR", which is a bit of a research project of its own.
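As a minimal sketch of the caching idea (not something PassportEye provides), one could skip the OCR call whenever the current frame is nearly identical to the last frame that was actually recognized. Everything here is an assumption made for illustration: the `CachedRecognizer` class, the `max_diff` threshold, frames represented as flat sequences of grayscale pixel values, and the `ocr` callable standing in for the real MRZ extraction.

```python
from typing import Callable, List, Optional, Sequence


class CachedRecognizer:
    """Reuse the last OCR result while frames stay nearly identical."""

    def __init__(
        self,
        ocr: Callable[[Sequence[int]], Optional[str]],
        max_diff: float = 4.0,
    ) -> None:
        self.ocr = ocr                # hypothetical slow OCR callable
        self.max_diff = max_diff      # assumed threshold, needs tuning
        self.last_frame: Optional[List[int]] = None
        self.last_result: Optional[str] = None
        self.ocr_calls = 0

    @staticmethod
    def _mean_abs_diff(a: Sequence[int], b: Sequence[int]) -> float:
        return sum(abs(x - y) for x, y in zip(a, b)) / len(a)

    def process(self, frame: Sequence[int]) -> Optional[str]:
        # Compare against the last frame that went through OCR (not the
        # last frame seen), so slow drift cannot accumulate unnoticed.
        if (
            self.last_frame is not None
            and len(frame) > 0
            and len(frame) == len(self.last_frame)
            and self._mean_abs_diff(frame, self.last_frame) <= self.max_diff
        ):
            return self.last_result
        self.ocr_calls += 1
        self.last_result = self.ocr(frame)
        self.last_frame = list(frame)
        return self.last_result


# Toy usage: the "OCR" just summarizes the frame so results are easy to check.
recognizer = CachedRecognizer(ocr=lambda frame: f"text-{sum(frame)}")
frames = [[10, 10, 10, 10], [11, 10, 9, 10], [200, 200, 200, 200]]
results = [recognizer.process(f) for f in frames]
```

Here the second frame differs only slightly from the first, so the cached result is returned and the slow call runs just twice for three frames. This only helps while the document is held still; it does nothing for the latency of the first recognition.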