Online recipes considered harmful

ItsNickBarry commented 4 years ago

Project description

Online recipes tend to be badly formatted. An aspiring chef, struggling through a recipe's process, is frequently forced to return to the ingredient list for quantity information, wasting valuable time, burning countless meals, increasing food insecurity by raising staple crop prices, and exacerbating runaway climate change.

Recipe publishers are entrenched in their ways, and must be stopped. The solution is to reformat recipes using a browser extension, such that ingredient quantities are included inline along with the directions.

For example, consider this abomination:

Ingredients:
3 ounces unsalted butter (6 tablespoons; 85g)
3/4 ounce sugar (4 teaspoons; 20g)
1 teaspoon (4g) Diamond Crystal kosher salt (for table salt, use half as much by volume or use the same weight)

Directions: Melt butter in a 3-quart stainless steel saucier or saucepan, stirring and scraping with a heat-resistant spatula as it bubbles, and cook until golden brown. Remove from heat and immediately stir in sugar, salt.

Reformatted, the directions section would look like this:

Directions: Melt butter (3 oz) in a 3-quart stainless steel saucier or saucepan, stirring and scraping with a heat-resistant spatula as it bubbles, and cook until golden brown. Remove from heat and immediately stir in sugar (3/4 oz), salt (1 tsp).

The unit system (metric/imperial) of the output should be configurable, and recipe multiplication (doubling, etc.) should be supported.
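A minimal sketch of what multiplication and unit conversion could look like, assuming quantities have already been parsed out of the ingredient list. The function names and the `GRAMS_PER_UNIT` table are hypothetical, and only weight units are handled, because converting between volume and weight depends on the ingredient (as the salt note in the example recipe shows).

```python
from fractions import Fraction

# Grams per unit for the weight units this sketch knows about (hypothetical table).
GRAMS_PER_UNIT = {"oz": 28.349523125, "g": 1.0}


def parse_quantity(text):
    """Parse '3', '3/4', or '1 1/2' into an exact Fraction."""
    return sum(Fraction(part) for part in text.split())


def scale(quantity, factor):
    """Multiply a quantity by a recipe factor (2 for doubling, and so on)."""
    return quantity * Fraction(factor)


def convert_weight(quantity, from_unit, to_unit):
    """Convert between weight units by going through grams."""
    grams = float(quantity) * GRAMS_PER_UNIT[from_unit]
    return grams / GRAMS_PER_UNIT[to_unit]


def format_quantity(quantity):
    """Render a Fraction the way recipes usually print it, e.g. '1 1/2'."""
    whole, rest = divmod(quantity, 1)
    if rest == 0:
        return str(whole)
    if whole == 0:
        return str(rest)
    return f"{whole} {rest}"


doubled_sugar = scale(parse_quantity("3/4"), 2)
print(format_quantity(doubled_sugar), "oz")                   # 1 1/2 oz
print(round(convert_weight(parse_quantity("3"), "oz", "g")))  # 85
```

Keeping quantities as `Fraction` values until display avoids turning 3/4 into 0.75 and back, which matters once a recipe is scaled by a factor such as 1/3.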

Relevant Technology

Ideally this would be achieved using a pre-trained machine learning system so that it could work on various sites and site layouts, but a simpler heuristic system might be sufficient.
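As a rough illustration of the simpler heuristic route, the sketch below matches ingredient names as whole words in the directions and appends the quantity. It assumes the ingredient list has already been parsed into a name-to-quantity mapping (the `INGREDIENTS` dict and the `inline_quantities` function are hypothetical), and a real browser extension would have to do the same thing in JavaScript against the page's DOM.

```python
import re

# Hypothetical result of parsing the ingredient list: lowercase name -> quantity.
INGREDIENTS = {"butter": "3 oz", "sugar": "3/4 oz", "salt": "1 tsp"}


def inline_quantities(directions, ingredients):
    """Append '(quantity)' after every whole-word mention of an ingredient."""
    if not ingredients:
        return directions
    # Try longer names first so a multi-word name is preferred over a shorter one.
    names = sorted(ingredients, key=len, reverse=True)
    pattern = re.compile(
        r"\b(" + "|".join(re.escape(name) for name in names) + r")\b",
        re.IGNORECASE,
    )

    def annotate(match):
        name = match.group(1)
        return f"{name} ({ingredients[name.lower()]})"

    return pattern.sub(annotate, directions)


print(inline_quantities("Remove from heat and stir in sugar, salt.", INGREDIENTS))
# Remove from heat and stir in sugar (3/4 oz), salt (1 tsp).
```

Plain word matching has no notion of context, so a phrase such as "butter dough" would also be annotated; that is the kind of case a learned model would be needed for.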

Complexity and required time

Complexity

[x] Beginner - This project requires no or little prior knowledge of the technolog(y|ies) specified to contribute to the project
[ ] Intermediate - The user should have some prior knowledge of the technolog(y|ies) to the point where they know how to use it, but not necessarily all the nooks and crannies of the technology
[ ] Advanced - The project requires the user to have a good understanding of all components of the project to contribute

Required time (ETA)

[ ] Little work - A couple of days
[x] Medium work - A week or two
[ ] Much work - The project will take more than a couple of weeks and serious planning is required

FredrikAugust commented 4 years ago

I think this would work well for simple recipes, but how would you handle e.g.

To make a butter dough, melt butter, then glaze the butter dough with butter.

Of course, this is slightly exaggerated, but situations like these could occur.

ItsNickBarry commented 4 years ago

I think for simple repetitions of the ingredient, it would be fine to repeat the quantity. Maybe the first insertion into the directions would be bold 100% black, while subsequent insertions would be bold 80% black.
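A small sketch of that styling idea, assuming the output is HTML and that "80% black" is rendered as the grey `#333333` (that hex value, the style constants, and the `annotate_html` function are all assumptions made for illustration):

```python
import re
from html import escape

FIRST_STYLE = "font-weight:bold;color:#000000"   # 100% black for the first mention
REPEAT_STYLE = "font-weight:bold;color:#333333"  # assumed rendering of 80% black


def annotate_html(directions, ingredients):
    """Insert quantities as styled spans, dimming repeats of the same ingredient."""
    seen = set()
    names = sorted(ingredients, key=len, reverse=True)
    pattern = re.compile(
        r"\b(" + "|".join(re.escape(name) for name in names) + r")\b",
        re.IGNORECASE,
    )

    def annotate(match):
        name = match.group(1)
        key = name.lower()
        style = REPEAT_STYLE if key in seen else FIRST_STYLE
        seen.add(key)
        quantity = escape(ingredients[key])
        return f'{name} <span style="{style}">({quantity})</span>'

    # Escape the plain-text directions first so the inserted spans are the only markup.
    return pattern.sub(annotate, escape(directions))


print(annotate_html("Melt butter, then add more butter.", {"butter": "3 oz"}))
```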

The case where the ingredient name is included in an unrelated noun phrase ("butter dough") is more complex, but a neural network might be able to recognize that it's not an ingredient based on the preceding verbs (the imperative "melt" as opposed to the infinitive "to make"). I think false positives would be inevitable, however.

FredrikAugust commented 4 years ago

You wouldn’t need a neural network for this; a simple bigram classifier would probably suffice. This is quite easy to implement with NLTK.
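One way to read "bigram classifier" here is a naive Bayes model over the two bigrams an ingredient mention forms with its neighbouring words. The sketch below uses only the standard library to stay self-contained (NLTK's `NaiveBayesClassifier` could play the same role); the `TRAINING` examples are invented for illustration and far too few for real use.

```python
import math
import re
from collections import Counter

# Invented training data: (previous word, next word) around a mention,
# labelled True when the mention is a real use of the ingredient.
TRAINING = [
    (("melt", "in"), True),
    (("the", "and"), True),
    (("in", "<end>"), True),
    (("add", "to"), True),
    (("a", "dough"), False),
    (("the", "dough"), False),
    (("the", "knife"), False),
    (("a", "sauce"), False),
]


class BigramClassifier:
    """Naive Bayes over (previous word, mention) and (mention, next word)."""

    def __init__(self, examples):
        self.label_counts = Counter()
        self.feature_counts = {True: Counter(), False: Counter()}
        self.vocab = set()
        for (prev, nxt), label in examples:
            self.label_counts[label] += 1
            for feature in (("prev", prev), ("next", nxt)):
                self.feature_counts[label][feature] += 1
                self.vocab.add(feature)

    def score(self, prev, nxt, label):
        """Log-probability of the label given the context, with add-one smoothing."""
        total = sum(self.label_counts.values())
        logp = math.log(self.label_counts[label] / total)
        denom = sum(self.feature_counts[label].values()) + len(self.vocab)
        for feature in (("prev", prev), ("next", nxt)):
            logp += math.log((self.feature_counts[label][feature] + 1) / denom)
        return logp

    def is_ingredient(self, prev, nxt):
        return self.score(prev, nxt, True) >= self.score(prev, nxt, False)


def contexts(text, ingredient):
    """Return the (previous, next) word pair for each mention of the ingredient."""
    tokens = ["<start>"] + re.findall(r"[a-z]+", text.lower()) + ["<end>"]
    return [
        (tokens[i - 1], tokens[i + 1])
        for i in range(1, len(tokens) - 1)
        if tokens[i] == ingredient
    ]


classifier = BigramClassifier(TRAINING)
sentence = "To make a butter dough, melt butter, then glaze the butter dough with butter."
print([classifier.is_ingredient(p, n) for p, n in contexts(sentence, "butter")])
# [False, True, False, True]
```

With this toy data the two "butter dough" mentions are rejected and the other two are kept, but the result depends entirely on the training examples chosen.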

raff7 commented 4 years ago

> You wouldn’t need a neural network for this, a simple bigram classifier would probably suffice. This is quite easy to implement with nltk.

A simple bigram classifier might work fine, but a fine-tuned BERT model would probably handle odd cases like the ones mentioned above more flexibly. The only problem would be coming up with a good dataset.

FredrikAugust commented 4 years ago

Absolutely. Good point @raff7.
