TandoorRecipes / recipes

Application for managing recipes, planning meals, building shopping lists and much much more!
https://docs.tandoor.dev

Image import / parsing #1308

Open FL550 opened 2 years ago

FL550 commented 2 years ago

First of all, thanks for the wonderful project!

I would love to have automatic parsing of images of recipes, for example from cooking magazines.

Most of the recipes follow a more or less similar layout, so I think it could be possible to parse them via OCR.

I haven't found anything indicating whether you have already considered this in the past. If not, I could assist with the implementation.

smilerz commented 2 years ago

OCR has long been part of our roadmap, but hasn't floated to the top yet. If you have the skills to help it would be amazing!

MarcusWolschon commented 2 years ago

If the cooking magazine has a web version, then the web version of a recipe is likely to have JSON-LD embedded. Images from there are imported. This doesn't fix the issue, but it may be a workaround for the immediate need.
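For readers unfamiliar with the JSON-LD workaround mentioned above, here is a minimal sketch of how embedded recipe data can be pulled out of a page using only the Python standard library. This is an illustration, not Tandoor's actual importer; the names `JsonLdParser` and `find_recipes` are hypothetical, and it assumes the page marks up the recipe as a schema.org `Recipe` inside a `<script type="application/ld+json">` tag.

```python
import json
from html.parser import HTMLParser


class JsonLdParser(HTMLParser):
    """Collects the parsed contents of every application/ld+json script tag."""

    def __init__(self):
        super().__init__()
        self._in_jsonld = False
        self._buffer = []
        self.documents = []

    def handle_starttag(self, tag, attrs):
        if tag == "script" and dict(attrs).get("type") == "application/ld+json":
            self._in_jsonld = True
            self._buffer = []

    def handle_data(self, data):
        if self._in_jsonld:
            self._buffer.append(data)

    def handle_endtag(self, tag):
        if tag == "script" and self._in_jsonld:
            self._in_jsonld = False
            try:
                self.documents.append(json.loads("".join(self._buffer)))
            except json.JSONDecodeError:
                pass  # skip malformed blocks instead of failing the whole page


def _is_recipe(item):
    """True if item is a JSON-LD node whose @type includes 'Recipe'."""
    if not isinstance(item, dict):
        return False
    types = item.get("@type", [])
    if isinstance(types, str):
        types = [types]
    return "Recipe" in types


def find_recipes(html):
    """Return all schema.org Recipe nodes embedded in the given HTML string."""
    parser = JsonLdParser()
    parser.feed(html)
    parser.close()
    recipes = []
    for doc in parser.documents:
        if isinstance(doc, dict):
            items = doc.get("@graph", [doc])  # some sites wrap nodes in @graph
        elif isinstance(doc, list):
            items = doc
        else:
            items = []
        recipes.extend(item for item in items if _is_recipe(item))
    return recipes
```

Fields such as `name`, `image` and `recipeIngredient` can then be read from each returned dictionary, if the site provides them.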

FL550 commented 2 years ago

Thanks for the hint, but this is not always a possible alternative. And manually copying the recipe is not something I like to do often 😅

I'll take a deeper look at this idea over the next few weeks. Have you also considered PDF parsing? I think I could reuse some parts of the code, as parsing to a recipe should be a similar task in both cases.

smilerz commented 2 years ago

I've attempted it a couple of times, as both are desired capabilities, but they were hard so I worked on something easier. 😅

FL550 commented 2 years ago

Ok, I'll try my luck :)

Vohwinkelh commented 2 years ago

I like the idea a lot. I'm not really skilled in the art, but happy to help, e.g. with testing.

pmeyerson commented 2 years ago

Hi, I just started looking at this project and am interested in OCR capability too. I just did a demo for myself with pytesseract and an image from a cookbook I have, using the PyTorch libraries that use CPU instead of GPU/CUDA.

I got pretty good results from my cookbook, but some things did not come across well (fractions, numbers for ingredients separated from ingredient names, etc.). I think we'd need to present the text to the user so they can validate and make any needed corrections.
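As one possible cleanup step for the fraction problem described above, the following sketch rewrites Unicode fraction characters such as ½ into plain ASCII like `1/2` before the text is shown to the user. It is only a sketch and the function name `normalize_fractions` is hypothetical; it assumes the OCR engine actually emitted a vulgar fraction character, and it does nothing for fractions that were misread as other symbols.

```python
import re
import unicodedata

# Optional whole number, optional whitespace, then one vulgar fraction
# character (U+00BC..U+00BE and U+2150..U+215E, e.g. ¼ ½ ¾ ⅓ ⅔ ⅛).
_VULGAR = re.compile(r"(?:(\d+)\s*)?([\u00bc-\u00be\u2150-\u215e])")


def _expand(match):
    whole, frac = match.group(1), match.group(2)
    # NFKC turns '½' into '1', FRACTION SLASH (U+2044), '2'.
    ascii_frac = unicodedata.normalize("NFKC", frac).replace("\u2044", "/")
    # Keep a space between the whole part and the fraction ("1½" -> "1 1/2").
    return f"{whole} {ascii_frac}" if whole else ascii_frac


def normalize_fractions(text):
    """Replace Unicode vulgar fractions with ASCII 'a/b' notation."""
    return _VULGAR.sub(_expand, text)
```

The user would still need to review the result, as the comment above suggests, since this only normalizes notation and cannot tell whether the OCR read the right number.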

I'd be interested in contributing, but wanted to make sure the project maintainers are OK with adding OCR libraries and models to the project footprint.

smilerz commented 2 years ago

Feel free to give it a shot. But I'm a little skeptical it will ever be usable. The OCR capabilities in smartphone cameras are far superior. It may make more sense to make the pipeline from camera to Tandoor easier to use than implementing crappy open source OCR.

pmeyerson commented 2 years ago

Oh, interesting. I'd never seen that before, but I just checked it out and it worked really well :) Glad you mentioned it.

pmeyerson commented 2 years ago

Something to parse out the ingredient/amount from the raw text would be nice too, I think.
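To make the idea of splitting raw text into amount and ingredient concrete, here is a rough sketch. It is not Tandoor's own ingredient parser; the names `Ingredient`, `parse_ingredient` and the tiny `UNITS` set are hypothetical and chosen only for illustration. It assumes one ingredient per line, with an optional leading amount written as an integer, decimal, fraction, or mixed number.

```python
import re
from fractions import Fraction
from typing import NamedTuple, Optional

# Hypothetical, deliberately small unit list; a real parser needs far more.
UNITS = {"g", "kg", "ml", "l", "tsp", "tbsp", "cup", "cups"}


class Ingredient(NamedTuple):
    amount: Optional[Fraction]
    unit: Optional[str]
    name: str


# Leading amount: mixed number ("1 1/2"), fraction ("3/4"), or number ("200", "1,5").
_AMOUNT = re.compile(r"^\s*(\d+\s+\d+/\d+|\d+/\d+|\d+(?:[.,]\d+)?)\s*(.*)$")


def parse_ingredient(line):
    """Split one raw text line into amount, unit and ingredient name."""
    match = _AMOUNT.match(line)
    if not match:
        return Ingredient(None, None, line.strip())
    # "1 1/2" -> 1 + 1/2; a decimal comma is treated as a decimal point.
    amount = sum(Fraction(part) for part in match.group(1).replace(",", ".").split())
    rest = match.group(2).strip()
    parts = rest.split(maxsplit=1)
    if len(parts) == 2 and parts[0].lower().rstrip(".") in UNITS:
        return Ingredient(amount, parts[0].lower().rstrip("."), parts[1])
    return Ingredient(amount, None, rest)
```

For example, `parse_ingredient("200g flour")` yields an amount of 200 with unit `g`, while a line without a leading number, such as "salt to taste", is returned with no amount or unit.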

vabene1111 commented 2 years ago

There is already a somewhat stable pipeline to throw data at Tandoor and run some parsers against it to make a recipe out of it. So the main things that need to be solved are, first, OCR on the document so the text is recognized properly and, second, the basic labeling of the data (e.g. what are ingredients, instructions, images, ...). The first one is likely the smartphone camera; the second one is likely a step in the importer workflow.

As smilerz said, feel free to play around with it. I think this would be a very powerful feature; we just did not have time for it yet.
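The labeling step mentioned in this comment could start from something as simple as the heuristic below. This is a sketch under strong assumptions, not part of Tandoor's importer: the function name `label_lines` is hypothetical, and it assumes ingredient lines begin with a quantity while instructions are either numbered steps ("1. ...") or start with a word. A real importer step would still let the user correct the labels.

```python
import re

# A line that starts with a digit or a Unicode vulgar fraction looks like a quantity.
_QUANTITY = re.compile(r"^(\d|[\u00bc-\u00be\u2150-\u215e])")
# A numbered step such as "1. Mix ..." or "2) Bake ..." is an instruction.
_STEP = re.compile(r"^\d+[.)]\s")


def label_lines(ocr_text):
    """Sort recognized text lines into 'ingredients' and 'instructions'."""
    sections = {"ingredients": [], "instructions": []}
    for raw in ocr_text.splitlines():
        line = raw.strip()
        if not line:
            continue  # ignore blank lines between blocks
        if _STEP.match(line):
            sections["instructions"].append(line)
        elif _QUANTITY.match(line):
            sections["ingredients"].append(line)
        else:
            sections["instructions"].append(line)
    return sections
```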

bibi2k commented 11 months ago

I discovered Tandoor this evening, and I think it would be the definitive alternative to Cookbook on Google Play if it had an OCR feature to import recipes from images. Is there a chance we get it in the future?

smilerz commented 11 months ago

Open source OCR is pretty bad. There is a related project that utilizes OpenAI to generate a JSON doc from a picture. It's possible that or something similar will be integrated, but it's not on the near term roadmap.

Felix2M commented 6 months ago

Would it be feasible to integrate the OCR function of Microsoft PowerToys (it is not perfect, but quite good)? For example, you could upload an image or a screenshot and then select areas for the main image and step images, and areas for the ingredients and the texts of the individual steps. It could work like a cursor that jumps (similar to the tab key) from a selected item on the editing page to the next one, so you only need to mark the matching area on the image.

smilerz commented 6 months ago

No, we are not likely to implement platform-specific features.