Closed joestump closed 3 months ago
The OCR being returned on any photos of a physical receipt is pretty bad:
oie Py aa oe e Lily Cla ‘ yt Ne ’ od a YAS rh ’ 25. Layey a ‘ Doe SS aOR ee RSE SRC ie ea Ce ges 02 sabe Bhat 2 7 Nigh ANA ope nets Oe. eles SEES uaa tat rwury fa sae . RA Pa Ss ee é ¢ J WS RITA Bea Bg nore Tea re Swe Mea Se Se ok 8 ae REE GR ee Aaa ae
When I try the EasyOCR I get: Receipt data from AI: signal: illegal instruction (core dumped)
Would prefer to just let OpenAI read the text in the image as well as extracting information from the image.
I have been experimenting with the viability of using vision models/multi modal models for Receipt Wrangler recently. It is definitely a lot easier than using OCR, and the whole process is much faster too.
I'll mark this as a feature request, assuming it works well enough it will get implemented in Receipt Wrangler. I'll also check out what's going on with EasyOCR.
For using OCR in general, the photos need to be as clear as they can possibly be for good results. Tesseract in particular works the best on black and white images, with minimal receipt wrinkles and all of that. Receipt Wrangler does pre process every image before performing OCR, such as de-skewing (straightening image), converting to black and white and removing noise, so the input image doesn't have to be absolutely perfect. But better quality still helps.
Support for OpenAI Vision has been added, make sure to update your container(s) first. Check out https://receiptwrangler.io/docs/concepts/system-settings/receipt-processing-settings#managing-open-aigemini-receipt-processing-settings to learn more.
Just need to pick the model you would like to use in the Receipt Processing Settings, either gpt-4o or gpt-4o-mini, then check the "Use Vision?" Checkbox and you'll be good to go.
Is it possible to skip OCR entirely and just use the file uploaded directly with OpenAI?