open-source-ideas / ideas

💡 Looking for inspiration for your next open source project? Or perhaps you've got a brilliant idea you can't wait to share with others? Open Source Ideas is a community built specifically for this! 👋

Online recipes considered harmful #229

Open ItsNickBarry opened 4 years ago

ItsNickBarry commented 4 years ago

Project description

Online recipes tend to be badly formatted. An aspiring chef, struggling through a recipe's process, is frequently forced to return to the ingredient list for quantity information, wasting valuable time, burning countless meals, increasing food insecurity by raising staple crop prices, and exacerbating runaway climate change.

Recipe publishers are entrenched in their ways, and must be stopped. The solution is to reformat recipes using a browser extension, such that ingredient quantities are included inline along with the directions.

For example, consider this abomination:

Ingredients:
3 ounces unsalted butter (6 tablespoons; 85g)
3/4 ounce sugar (4 teaspoons; 20g)
1 teaspoon (4g) Diamond Crystal kosher salt (for table salt, use half as much by volume or use the same weight)

Directions: Melt butter in a 3-quart stainless steel saucier or saucepan, stirring and scraping with a heat-resistant spatula as it bubbles, and cook until golden brown. Remove from heat and immediately stir in sugar, salt.

Reformatted, the direction section would look like this:

Directions: Melt butter (3 oz) in a 3-quart stainless steel saucier or saucepan, stirring and scraping with a heat-resistant spatula as it bubbles, and cook until golden brown. Remove from heat and immediately stir in sugar (3/4 oz), salt (1 tsp).
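
As a rough illustration of the simpler heuristic route, the reformatting above can be sketched as a whole-word search that appends a quantity after the first mention of each ingredient. The function name `inline_quantities` and the pre-parsed `quantities` mapping are assumptions for this sketch; extracting that mapping from a real page's ingredient list is the hard part and is not shown.

```python
import re


def inline_quantities(directions: str, quantities: dict[str, str]) -> str:
    """Append "(quantity)" after the first whole-word mention of each ingredient."""
    result = directions
    for name, quantity in quantities.items():
        # \b keeps "salt" from matching inside words such as "unsalted".
        pattern = re.compile(rf"\b{re.escape(name)}\b", re.IGNORECASE)
        # A function replacement avoids backslash-escape surprises in the quantity text.
        result = pattern.sub(
            lambda m, q=quantity: f"{m.group(0)} ({q})", result, count=1
        )
    return result


directions = (
    "Melt butter in a saucepan. "
    "Remove from heat and immediately stir in sugar, salt."
)
quantities = {"butter": "3 oz", "sugar": "3/4 oz", "salt": "1 tsp"}
print(inline_quantities(directions, quantities))
```

This naive version would also annotate "butter" inside an unrelated phrase, so it is only a starting point.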

The unit system (metric/imperial) of the output should be configurable, and recipe multiplication (doubling, etc.) should be supported.
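
A minimal sketch of what configurable units and recipe multiplication could look like, assuming quantities have already been parsed into ounces by weight. The name `format_quantity` and its parameters are hypothetical, and only the ounce-to-gram conversion is covered here; volume units such as teaspoons would need their own handling.

```python
from fractions import Fraction

GRAMS_PER_OUNCE = 28.3495  # avoirdupois ounce, rounded


def format_quantity(ounces: Fraction, system: str = "imperial", factor: int = 1) -> str:
    """Scale a weight given in ounces and render it in the requested unit system."""
    scaled = ounces * factor
    if system == "metric":
        return f"{round(float(scaled) * GRAMS_PER_OUNCE)} g"
    # Fractions keep amounts like 3/4 exact instead of printing 0.75.
    return f"{scaled} oz"


butter = Fraction(3)
sugar = Fraction("3/4")
print(format_quantity(butter, "metric"))   # metric output
print(format_quantity(sugar, factor=2))    # doubled recipe, imperial output
```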

Relevant Technology

Ideally this would be achieved using a pre-trained machine learning system so that it could work on various sites and site layouts, but a simpler heuristic system might be sufficient.

Complexity and required time

Complexity

Required time (ETA)

Categories

FredrikAugust commented 4 years ago

I think this would work well for simple recipes, but how would you handle e.g.

To make a butter dough, melt butter, then glaze the butter dough with butter.

Of course, this is slightly exaggerated, but situations like these could occur.

ItsNickBarry commented 4 years ago

I think for simple repetitions of the ingredient, it would be fine to repeat the quantity. Maybe the first insertion into the directions would be bold 100% black, while subsequent insertions would be bold 80% black.
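
One way this could look, as a sketch only: annotate every mention and vary the emphasis by occurrence. The function name `annotate` is hypothetical, "80% black" is interpreted here as the gray `#333333`, and the directions text is assumed to be plain text that is already safe to place in HTML.

```python
import re
from html import escape


def annotate(directions: str, name: str, quantity: str) -> str:
    """Insert a bold quantity after every mention; later mentions are lighter."""
    seen = 0

    def replacement(match: re.Match) -> str:
        nonlocal seen
        seen += 1
        # First insertion is full black; subsequent ones use the assumed 80% black.
        color = "#000000" if seen == 1 else "#333333"
        return f'{match.group(0)} <b style="color: {color}">({escape(quantity)})</b>'

    return re.sub(rf"\b{re.escape(name)}\b", replacement, directions, flags=re.IGNORECASE)


print(annotate("Melt butter, then brush with butter.", "butter", "3 oz"))
```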

The case where the ingredient name is included in an unrelated noun phrase ("butter dough") is more complex, but a neural network might be able to recognize that it's not an ingredient based on preceding verbs (the conjugation of "melt" as opposed to the unconjugated "to make"). I think false positives would be inevitable, however.

FredrikAugust commented 4 years ago

You wouldn’t need a neural network for this; a simple bigram classifier would probably suffice. This is quite easy to implement with nltk.
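
To make the idea concrete, here is a toy sketch that uses only the standard library rather than nltk. It treats the word pairs around a mention (previous word + mention, mention + next word) as features and counts how often each appeared with a real ingredient versus a compound noun. The class name, the three training examples, and the scoring rule are all made up for illustration; a usable classifier would need a labeled dataset.

```python
import re
from collections import Counter


def context_bigrams(tokens: list[str], i: int) -> list[tuple[str, str]]:
    """Features for the mention at index i: the word before it and the word after it."""
    prev_word = tokens[i - 1] if i > 0 else "<s>"
    next_word = tokens[i + 1] if i + 1 < len(tokens) else "</s>"
    return [("prev", prev_word), ("next", next_word)]


class BigramClassifier:
    def __init__(self) -> None:
        self.counts = {True: Counter(), False: Counter()}

    def train(self, tokens: list[str], index: int, is_ingredient: bool) -> None:
        self.counts[is_ingredient].update(context_bigrams(tokens, index))

    def predict(self, tokens: list[str], index: int) -> bool:
        feats = context_bigrams(tokens, index)
        score = sum(self.counts[True][f] - self.counts[False][f] for f in feats)
        return score >= 0  # ties default to "ingredient"


clf = BigramClassifier()
# Hypothetical hand-labeled examples: (tokens, index of "butter", is it an ingredient?)
clf.train(["melt", "butter", "in", "a", "pan"], 1, True)
clf.train(["roll", "the", "butter", "dough", "flat"], 2, False)
clf.train(["brush", "with", "butter"], 2, True)

text = "To make a butter dough, melt butter, then glaze the butter dough with butter."
tokens = re.findall(r"[a-z]+", text.lower())
mentions = [i for i, t in enumerate(tokens) if t == "butter"]
print([(i, clf.predict(tokens, i)) for i in mentions])
```

With this toy data, the two "butter dough" mentions are rejected and the two plain "butter" mentions are accepted, but unseen contexts fall back to the default, so false positives remain likely.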

raff7 commented 4 years ago

You wouldn’t need a neural network for this; a simple bigram classifier would probably suffice. This is quite easy to implement with nltk.

A simple bigram classifier might work fine, but a fine-tuned BERT model would probably offer better flexibility for weird cases like the ones mentioned above. The only problem would be coming up with a good dataset.

FredrikAugust commented 4 years ago

Absolutely. Good point @raff7.