UB-Mannheim / ocr-fileformat

Validate and transform various OCR file formats (hOCR, ALTO, PAGE, FineReader)
https://digi.bib.uni-mannheim.de/ocr-fileformat/
MIT License

Microsoft Computer Vision API #58

Open stweil opened 7 years ago

stweil commented 7 years ago

Microsoft offers OCR services in its Office products and online; see https://azure.microsoft.com/en-us/services/cognitive-services/computer-vision/. The online service supports text or JSON output. If needed, we could add support for the JSON format, which looks quite simple.
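To illustrate how simple such a conversion might be, here is a minimal sketch that maps a JSON response to an hOCR fragment. It assumes the response follows the layout I believe the legacy OCR endpoint returns: nested `regions`, `lines` and `words`, each with a `boundingBox` string of the form `left,top,width,height`. Those field names are an assumption and should be checked against real output. The function names `bbox_to_hocr` and `ms_ocr_to_hocr` and the sample data are made up for illustration, and the result is only a page fragment, not a complete hOCR document with header and metadata.

```python
from html import escape


def bbox_to_hocr(bbox: str) -> str:
    """Turn 'left,top,width,height' into an hOCR 'bbox x0 y0 x1 y1' property."""
    left, top, width, height = (int(v) for v in bbox.split(","))
    return f"bbox {left} {top} {left + width} {top + height}"


def ms_ocr_to_hocr(response: dict) -> str:
    """Build a simplified hOCR page fragment from the assumed JSON layout."""
    parts = ['<div class="ocr_page">']
    for region in response.get("regions", []):
        region_title = bbox_to_hocr(region["boundingBox"])
        parts.append(f'<div class="ocr_carea" title="{region_title}">')
        for line in region.get("lines", []):
            line_title = bbox_to_hocr(line["boundingBox"])
            parts.append(f'<span class="ocr_line" title="{line_title}">')
            for word in line.get("words", []):
                word_title = bbox_to_hocr(word["boundingBox"])
                text = escape(word["text"])  # escape &, <, > for HTML
                parts.append(
                    f'<span class="ocrx_word" title="{word_title}">{text}</span>'
                )
            parts.append("</span>")
        parts.append("</div>")
    parts.append("</div>")
    return "\n".join(parts)


# Hypothetical sample in the assumed layout (not real service output).
sample = {
    "regions": [
        {
            "boundingBox": "10,20,200,50",
            "lines": [
                {
                    "boundingBox": "10,20,200,25",
                    "words": [
                        {"boundingBox": "10,20,90,25", "text": "Hello"},
                        {"boundingBox": "110,20,100,25", "text": "R&D"},
                    ],
                }
            ],
        }
    ]
}

hocr = ms_ocr_to_hocr(sample)
print(hocr)
```

The only real work is converting the width/height boxes into the corner coordinates that hOCR expects; everything else is a one-to-one mapping of regions, lines and words onto `ocr_carea`, `ocr_line` and `ocrx_word` elements.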

prashantguleria commented 4 years ago

Hi, is there any way to convert Computer Vision OCR output to hOCR?

zuphilip commented 4 years ago

Do you have any test data as JSON from Computer Vision OCR you can share here?

prashantguleria commented 4 years ago

> Do you have any test data as JSON from Computer Vision OCR you can share here?

test

ocr_response.txt

Attached are the image and its OCR response.

kba commented 4 years ago

Maybe @dinosauria123 has a take on this, since he has been developing a converter for Google Cloud Vision API responses at https://github.com/dinosauria123/gcv2hocr