Open ItsNickBarry opened 4 years ago
I think this would work well for simple recipes, but how would you handle a direction like this one?
"To make a butter dough, melt butter, then glaze the butter dough with butter."
Of course, this is slightly exaggerated, but situations like these could occur.
I think for simple repetitions of the ingredient, it would be fine to repeat the quantity. Maybe the first insertion into the directions would be bold 100% black, while subsequent insertions would be bold 80% black.
The case where the ingredient name is included in an unrelated noun phrase ("butter dough") is more complex, but a neural network might be able to recognize that it's not an ingredient based on preceding verbs (the conjugation of "melt" as opposed to the unconjugated "to make"). I think false positives would be inevitable, however.
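A minimal sketch of the repeat-the-quantity idea, in Python for illustration only. The function name `annotate`, the inline CSS, and the reading of "80% black" as black at 0.8 opacity are all assumptions, not part of the proposal. The ingredient names passed in are expected to be lowercase. This naive matcher makes no attempt to skip unrelated noun phrases such as "butter dough".

```python
import re
from html import escape

# Hypothetical styling: "80% black" is interpreted here as black at 0.8 opacity.
FIRST_STYLE = "font-weight:bold;color:rgba(0,0,0,1.0)"
REPEAT_STYLE = "font-weight:bold;color:rgba(0,0,0,0.8)"


def annotate(directions, quantities):
    """Insert each ingredient's quantity before every mention of that ingredient.

    `quantities` maps lowercase ingredient names to display strings, e.g. {"butter": "50 g"}.
    """
    seen = set()
    # Try longer names first so that "brown sugar" is preferred over "sugar".
    names = sorted(quantities, key=len, reverse=True)
    pattern = re.compile(
        r"\b(" + "|".join(re.escape(n) for n in names) + r")\b", re.IGNORECASE
    )

    def repl(match):
        name = match.group(1)
        key = name.lower()
        # The first mention gets the full-strength style, later mentions the lighter one.
        style = REPEAT_STYLE if key in seen else FIRST_STYLE
        seen.add(key)
        return f'<span style="{style}">{escape(quantities[key])}</span> {name}'

    return pattern.sub(repl, escape(directions))


print(annotate("Melt the butter. Brush with butter.", {"butter": "50 g"}))
```

Both mentions of "butter" receive the quantity; only the style differs between the first and the later ones.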
You wouldn’t need a neural network for this; a simple bigram classifier would probably suffice. This is quite easy to implement with nltk.
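A rough sketch of what such a bigram classifier could look like. It uses only the Python standard library so that it runs without dependencies; nltk's `NaiveBayesClassifier` could presumably take the place of the hand-rolled counting. The class name, the feature scheme (the word before and the word after the ingredient), and the six training examples are invented for illustration and are far too few for real use.

```python
import math
from collections import Counter, defaultdict


def context_bigrams(tokens, i):
    """Return the two bigram features that involve the token at position i."""
    prev_tok = tokens[i - 1] if i > 0 else "<s>"
    next_tok = tokens[i + 1] if i + 1 < len(tokens) else "</s>"
    return [("prev", prev_tok), ("next", next_tok)]


class BigramClassifier:
    """Naive-Bayes-style scoring over neighbouring-word features (toy version)."""

    def __init__(self):
        self.label_counts = Counter()
        self.feature_counts = defaultdict(Counter)
        self.vocab = set()

    def train(self, examples):
        # Each example is (tokens, index of the ingredient word, is_ingredient).
        for tokens, i, label in examples:
            self.label_counts[label] += 1
            for feat in context_bigrams(tokens, i):
                self.feature_counts[label][feat] += 1
                self.vocab.add(feat)

    def classify(self, tokens, i):
        total = sum(self.label_counts.values())
        best_label, best_score = None, -math.inf
        for label, count in self.label_counts.items():
            score = math.log(count / total)
            # Add-one smoothing so unseen features do not zero out the score.
            denom = sum(self.feature_counts[label].values()) + len(self.vocab)
            for feat in context_bigrams(tokens, i):
                score += math.log((self.feature_counts[label][feat] + 1) / denom)
            if score > best_score:
                best_label, best_score = label, score
        return best_label


# Made-up training data: True means the word is used as an ingredient.
TRAIN = [
    ("melt butter then stir".split(), 1, True),
    ("add butter and sugar".split(), 1, True),
    ("glaze with butter".split(), 2, True),
    ("make a butter dough".split(), 2, False),
    ("roll the butter dough".split(), 2, False),
    ("spread butter cream on top".split(), 1, False),
]

clf = BigramClassifier()
clf.train(TRAIN)

sentence = "to make a butter dough melt butter then glaze the butter dough with butter".split()
labels = [clf.classify(sentence, i) for i, tok in enumerate(sentence) if tok == "butter"]
print(labels)  # [False, True, False, True]
```

On the exaggerated sentence from earlier in the thread, the two "butter dough" mentions are rejected and the two plain "butter" mentions are accepted, but only because the toy training data happens to cover those contexts.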
A simple bigram classifier might work fine, but a fine-tuned BERT model would probably handle weird cases like the ones mentioned above more flexibly. The only problem would be coming up with a good dataset.
Absolutely. Good point @raff7.
Project description
Online recipes tend to be badly formatted. An aspiring chef, struggling through a recipe's process, is frequently forced to return to the ingredient list for quantity information, wasting valuable time, burning countless meals, increasing food insecurity by raising staple crop prices, and exacerbating runaway climate change.
Recipe publishers are entrenched in their ways, and must be stopped. The solution is to reformat recipes using a browser extension, such that ingredient quantities are included inline along with the directions.
For example, consider this abomination:
Reformatted, the direction section would look like this:
The unit system (metric/imperial) of the output should be configurable, and recipe multiplication (doubling, etc.) should be supported.
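As a sketch of how unit configuration and recipe multiplication might combine, assuming a hypothetical `convert` helper and a deliberately tiny unit table (grams/ounces and millilitres/US cups only):

```python
# Conversion factors: 1 oz = 28.349523125 g, 1 US cup = 236.5882365 ml.
TO_METRIC = {"oz": ("g", 28.349523125), "cup": ("ml", 236.5882365)}
TO_IMPERIAL = {"g": ("oz", 1 / 28.349523125), "ml": ("cup", 1 / 236.5882365)}


def convert(amount, unit, system="metric", multiplier=1.0):
    """Scale an amount, then convert it to the requested unit system if needed."""
    amount = amount * multiplier
    table = TO_METRIC if system == "metric" else TO_IMPERIAL
    # Units already in the target system (or unknown units) pass through unchanged.
    if unit in table:
        unit, factor = table[unit]
        amount = amount * factor
    return round(amount, 2), unit


print(convert(100, "g", system="metric", multiplier=2))  # doubled, unit unchanged
print(convert(8, "oz", system="metric"))                 # ounces to grams
```

Scaling before converting keeps the multiplier independent of the unit system, so the two settings can be changed separately.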
Relevant Technology
Ideally this would be achieved using a pre-trained machine learning system so that it could work on various sites and site layouts, but a simpler heuristic system might be sufficient.
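One possible starting point for the simpler heuristic route is a regular expression over ingredient-list lines. The pattern, the unit list, and the `parse_ingredient` name below are assumptions for illustration; real recipe sites vary far more than this handles.

```python
import re
from fractions import Fraction

# A small, hypothetical set of unit spellings.
UNITS = r"g|kg|ml|l|oz|lb|cups?|tbsp|tsp"

LINE_RE = re.compile(
    rf"^\s*(?P<qty>\d+/\d+|\d+(?:\.\d+)?)\s*(?P<unit>{UNITS})?\b\s*(?:of\s+)?(?P<name>.+?)\s*$",
    re.IGNORECASE,
)


def parse_ingredient(line):
    """Split a line such as '100 g butter' into quantity, unit, and name.

    Returns None when the line does not start with a number.
    """
    m = LINE_RE.match(line)
    if m is None:
        return None
    unit = m.group("unit")
    return {
        # Fraction accepts "100", "1.5", and "1/2" alike.
        "quantity": float(Fraction(m.group("qty"))),
        "unit": unit.lower() if unit else None,
        "name": m.group("name").lower(),
    }


for line in ["100 g butter", "1/2 cup sugar", "2 large eggs", "Salt to taste"]:
    print(line, "->", parse_ingredient(line))
```

The word boundary after the optional unit keeps "2 large eggs" from being read as two litres of "arge eggs"; lines without a leading number fall through as `None` and would need separate handling.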
Complexity and required time
Complexity
Required time (ETA)
Categories