Open FL550 opened 2 years ago
OCR has long been part of our roadmap, but hasn't floated to the top yet. If you have the skills to help it would be amazing!
If the cooking magazine has a web version, it likely has JSON-LD embedded in the web version of a recipe. Images from there are imported. This doesn't fix the issue, but it may be a workaround for the immediate need.
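For anyone curious what that embedded data looks like, here is a minimal sketch of pulling schema.org `Recipe` objects out of a page's JSON-LD. It uses only the Python standard library. `JsonLdCollector` and `find_recipes` are made-up names for illustration and are not Tandoor's actual importer code.

```python
import json
from html.parser import HTMLParser


class JsonLdCollector(HTMLParser):
    """Collects the text of every <script type="application/ld+json"> tag."""

    def __init__(self):
        super().__init__()
        self._in_jsonld = False
        self._buffer = []
        self.blocks = []

    def handle_starttag(self, tag, attrs):
        if tag == "script" and dict(attrs).get("type") == "application/ld+json":
            self._in_jsonld = True
            self._buffer = []

    def handle_data(self, data):
        if self._in_jsonld:
            self._buffer.append(data)

    def handle_endtag(self, tag):
        if tag == "script" and self._in_jsonld:
            self.blocks.append("".join(self._buffer))
            self._in_jsonld = False


def find_recipes(html):
    """Return all schema.org Recipe objects embedded as JSON-LD in the page."""
    collector = JsonLdCollector()
    collector.feed(html)
    recipes = []
    for raw in collector.blocks:
        try:
            data = json.loads(raw)
        except json.JSONDecodeError:
            continue  # skip malformed blocks
        # A block may hold one object, a list, or an @graph wrapper.
        if isinstance(data, dict) and "@graph" in data:
            candidates = data["@graph"]
        elif isinstance(data, list):
            candidates = data
        else:
            candidates = [data]
        for item in candidates:
            if not isinstance(item, dict):
                continue
            types = item.get("@type", [])
            if isinstance(types, str):
                types = [types]
            if "Recipe" in types:
                recipes.append(item)
    return recipes
```

Calling `find_recipes(page_html)` on a recipe page would typically give back a dict with keys such as `name`, `recipeIngredient` and `recipeInstructions`, assuming the site publishes them.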
Thanks for the hint, but this is not always a possible alternative. And manually copying the recipe is not something I like to do often.
I'll have a deeper look into this idea in the coming weeks. Have you also considered PDF parsing? I think I could reuse some parts of the code, as parsing to a recipe should be a similar task in both cases.
I've attempted it a couple of times, as both are desired capabilities, but they were hard, so I worked on something easier.
Ok, I'll try my luck :)
Like the idea a lot. Not really skilled in the art, but happy to help, e.g. with testing.
Hi, I just started looking at this project and am interested in OCR capability too. I just did a demo for myself with pytesseract and an image from a cookbook I have, using the PyTorch libraries that use the CPU instead of GPU/CUDA.
I got pretty good results from my cookbook, but some things did not come across well (fractions, numbers for ingredients separated from ingredient names, etc.). I think we'd need to present the text to the user so they can validate and make any needed corrections.
I'd be interested in contributing, but wanted to make sure the project maintainers are OK with adding OCR libraries and models to the project footprint.
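On the fractions problem mentioned above: one cheap post-processing step, before showing the text to the user for review, could be to turn Unicode fraction characters into plain ASCII. This is only a sketch under the assumption that the OCR engine emits characters like `½`. `normalize_fractions` is a made-up helper name, not part of pytesseract or Tandoor.

```python
import unicodedata


def normalize_fractions(text):
    """Replace Unicode vulgar fractions (e.g. U+00BD) with ASCII like 1/2."""
    out = []
    for ch in text:
        decomposed = unicodedata.normalize("NFKD", ch)
        if decomposed != ch and "\u2044" in decomposed:
            # NFKD expands a vulgar fraction to digit, U+2044 FRACTION SLASH, digit
            ascii_frac = decomposed.replace("\u2044", "/")
            # Keep a preceding whole number separate: "1½" becomes "1 1/2"
            if out and out[-1].isdigit():
                out.append(" ")
            out.append(ascii_frac)
        else:
            out.append(ch)
    return "".join(out)
```

This does nothing for fractions that the OCR misreads outright (say, `1/2` recognized as `12`), which is one more reason a manual validation step would still be needed.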
Feel free to give it a shot. But I'm a little skeptical it will ever be usable. The OCR capabilities in smart phone cameras are far superior. It may make more sense to make the pipeline from camera to Tandoor easier to use than implementing crappy open source OCR.
Oh, interesting. I'd never seen that before but just checked it out and worked really well :) Glad you mentioned it.
Something to parse the ingredient/amount out of the raw text would be nice too, I think.
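To make the idea concrete, here is a rough sketch of splitting one line of raw text into amount, unit and name. The unit list and the names `UNITS`, `parse_amount` and `parse_ingredient` are assumptions for illustration only. This is not Tandoor's existing ingredient parser, and real recipe text will need far more cases.

```python
import re
from fractions import Fraction

# Hypothetical, deliberately tiny unit list for illustration
UNITS = {"g", "kg", "ml", "l", "tsp", "tbsp", "cup", "cups"}

# Try mixed numbers first ("1 1/2"), then fractions, decimals, plain integers
LINE_RE = re.compile(
    r"^\s*(?P<amount>\d+\s+\d+/\d+|\d+/\d+|\d+[.,]\d+|\d+)?\s*(?P<rest>.*)$",
    re.DOTALL,
)


def parse_amount(text):
    """Convert '1 1/2', '3/4', '1,5' or '200' into an exact Fraction."""
    total = Fraction(0)
    for part in text.replace(",", ".").split():
        total += Fraction(part)
    return total


def parse_ingredient(line):
    """Split a single ingredient line into amount, unit and name."""
    match = LINE_RE.match(line)
    amount_text = match.group("amount")
    rest = match.group("rest").strip()
    amount = parse_amount(amount_text) if amount_text else None
    unit = None
    words = rest.split(maxsplit=1)
    # Only look for a unit when the line actually started with an amount
    if amount is not None and words and words[0].lower().rstrip(".") in UNITS:
        unit = words[0].rstrip(".")
        rest = words[1] if len(words) > 1 else ""
    return {"amount": amount, "unit": unit, "name": rest}
```

For example, `parse_ingredient("1 1/2 cups flour")` yields an amount of `Fraction(3, 2)`, the unit `cups` and the name `flour`, while a line without a leading number, such as `salt`, comes back with no amount and no unit.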
There is already a somewhat stable pipeline to throw data at Tandoor and run some parsers against it to make a recipe out of it. So the main things that need to be solved are, first, OCR on the document so the text is recognized properly, and second, basic labeling of the data (e.g. what are ingredients, instructions, images, ...). The first is likely the smartphone camera; the second is likely a step in the importer workflow.
As smilerz said, feel free to play around with it. I think this would be a very powerful feature; we just did not have time for it yet.
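As a starting point for the labeling step described above, a very naive approach would be to look for section headings in the recognized text and sort the lines under them. The sketch below assumes the OCR output contains English headings such as "Ingredients" or "Instructions" on their own lines. `SECTION_HEADERS` and `label_sections` are hypothetical names, and nothing here reflects the existing importer workflow.

```python
# Hypothetical mapping from heading text to a section label
SECTION_HEADERS = {
    "ingredients": "ingredients",
    "instructions": "instructions",
    "directions": "instructions",
    "method": "instructions",
    "preparation": "instructions",
}


def label_sections(raw_text):
    """Group OCR'd lines into title, ingredients and instructions."""
    sections = {"title": [], "ingredients": [], "instructions": []}
    current = "title"  # anything before the first heading counts as title
    for line in raw_text.splitlines():
        stripped = line.strip()
        if not stripped:
            continue
        key = stripped.lower().rstrip(":")
        if key in SECTION_HEADERS:
            # A heading switches the current section and is not stored itself
            current = SECTION_HEADERS[key]
            continue
        sections[current].append(stripped)
    return sections
```

Magazine layouts without explicit headings would defeat this, so the result would probably still have to be shown to the user for correction before import.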
I discovered Tandoor this evening, and I think it would be the definitive alternative to Cookbook on Google Play if it had the OCR feature to import recipe from images. Is there a chance we get it in the future?
Open source OCR is pretty bad. There is a related project that utilizes OpenAI to generate a JSON doc from a picture. It's possible that or something similar will be integrated, but it's not on the near term roadmap.
Would it be feasible to integrate the OCR function of Microsoft PowerToys (it is not perfect but quite good)? So e.g. you could upload an image or a screenshot and select areas for the main image and step images, and areas for ingredients and the texts of the individual steps. It could be something like a cursor that jumps (similar to the tab key) from a selected item of the editing page to the next one, so you only need to mark that area on the image.
No, we are not likely to implement platform-specific features.
First of all, thanks for the wonderful project!
I would love to have automatic parsing of images of recipes, for example from cooking magazines.
Most of the recipes follow a more or less similar layout, so I think it could be possible to parse them via OCR.
I haven't found anything indicating whether you have already considered this in the past. If not, I could assist with the implementation.