Open mblennegard opened 2 weeks ago
The mapping to the recipe schema is done inside Recipya, by parsing what the Azure API returns: https://github.com/reaper47/recipya/blob/main/internal%2Fmodels%2Focr.go#L87
Thank you for linking this package. I will think about how to approach this once v1.2.0 stable is released.
https://github.com/tiagomelo/go-ocr is simple, maybe too simple.
It calls Tesseract by executing the binary, so no cgo bindings are needed.
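To illustrate the "exe call" approach, here is a minimal sketch of shelling out to Tesseract with `os/exec`. It is not taken from go-ocr or Recipya; the function names `tesseractArgs` and `runTesseract` are made up for this example, and it assumes a `tesseract` binary is installed. Passing `stdout` as the output base makes Tesseract print the recognized text instead of writing a file, and `-l` selects the language.

```go
package main

import (
	"bytes"
	"context"
	"fmt"
	"os/exec"
	"strings"
)

// tesseractArgs builds the command-line arguments for one image.
// The "stdout" output base tells tesseract to print the text
// rather than write it to a .txt file.
func tesseractArgs(imagePath, lang string) []string {
	args := []string{imagePath, "stdout"}
	if lang != "" {
		args = append(args, "-l", lang)
	}
	return args
}

// runTesseract executes the given binary (e.g. "tesseract") and returns
// the recognized text. No cgo is involved; the binary only has to be
// present on the machine.
func runTesseract(ctx context.Context, binary, imagePath, lang string) (string, error) {
	cmd := exec.CommandContext(ctx, binary, tesseractArgs(imagePath, lang)...)

	var stdout, stderr bytes.Buffer
	cmd.Stdout = &stdout
	cmd.Stderr = &stderr

	if err := cmd.Run(); err != nil {
		return "", fmt.Errorf("running %s: %w (stderr: %s)",
			binary, err, strings.TrimSpace(stderr.String()))
	}
	return strings.TrimSpace(stdout.String()), nil
}
```

Calling `runTesseract(context.Background(), "tesseract", "scan.png", "eng")` would return the plain text of the scan, which would still need to be parsed into a recipe afterwards.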
Is your feature request related to a problem? Please describe.
Recipya currently relies on a cloud service, Azure AI Document Intelligence, for OCR. It would be very neat to be able to provide other or additional OCR engines, for instance to keep everything hosted locally (https://github.com/ocrmypdf/OCRmyPDF seems like a nice option), or to try a different engine for a particular recipe upload when the results from the first one are bad.
Describe the solution you'd like
Ideally, it would be possible to add multiple OCR engines and choose between them upon upload. After the upload completes, the image could be re-sent to a different OCR engine if the results from the first attempt are not good.
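As a rough sketch of what this could look like in Go: a small engine interface plus a registry that tries the chosen engine first and then the others. None of these names (`Engine`, `FuncEngine`, `Registry`, `NonEmpty`) exist in Recipya; they are assumptions for illustration only. The automatic "is this result acceptable" check is also an assumption, and in practice the user might simply pick another engine by hand.

```go
package main

import (
	"errors"
	"fmt"
	"strings"
)

// Engine is a hypothetical abstraction over one OCR backend
// (a cloud API, a local binary, ...).
type Engine interface {
	Name() string
	Recognize(image []byte) (string, error)
}

// FuncEngine adapts a plain function to the Engine interface.
type FuncEngine struct {
	EngineName string
	Fn         func(image []byte) (string, error)
}

func (e FuncEngine) Name() string                           { return e.EngineName }
func (e FuncEngine) Recognize(image []byte) (string, error) { return e.Fn(image) }

// Registry holds the configured engines in registration order.
type Registry struct {
	engines map[string]Engine
	order   []string
}

func NewRegistry(engines ...Engine) *Registry {
	r := &Registry{engines: make(map[string]Engine)}
	for _, e := range engines {
		r.engines[e.Name()] = e
		r.order = append(r.order, e.Name())
	}
	return r
}

// NonEmpty is a deliberately naive acceptance check.
func NonEmpty(text string) bool { return strings.TrimSpace(text) != "" }

// Recognize tries the preferred engine, then every other engine in
// registration order, and returns the first acceptable text together
// with the name of the engine that produced it.
func (r *Registry) Recognize(preferred string, image []byte, acceptable func(string) bool) (string, string, error) {
	if _, ok := r.engines[preferred]; !ok {
		return "", "", fmt.Errorf("unknown OCR engine %q", preferred)
	}
	candidates := []string{preferred}
	for _, name := range r.order {
		if name != preferred {
			candidates = append(candidates, name)
		}
	}

	var errs []error
	for _, name := range candidates {
		text, err := r.engines[name].Recognize(image)
		if err != nil {
			errs = append(errs, fmt.Errorf("%s: %w", name, err))
			continue
		}
		if acceptable(text) {
			return text, name, nil
		}
		errs = append(errs, fmt.Errorf("%s: result rejected", name))
	}
	return "", "", errors.Join(errs...)
}
```

With something like this, the upload form would only need to pass the selected engine name, and a "retry with another engine" button could call `Recognize` again with a different `preferred` value.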
Describe alternatives you've considered
How does the Scan feature actually work, e.g. if I have already OCRed an image myself (I have done this a lot over the years, scanning entire cookbooks and running OCR on them to make them searchable)? Is the interpretation into a recipe schema happening inside Recipya or within Azure AI Document Intelligence? If it is happening inside Recipya, then this is of course much easier to implement.