Closed jayhale closed 1 year ago
I would argue that this is important information to the recipe. It relates the ingredient list to the instructions.
@ggilley agreed. Perhaps this would be better represented as groups of ingredients, since that is the semantic intent. However, I expect the method .ingredients() to return only ingredients (e.g., can be readily used for questions like "How many ingredients does this recipe have?").
Agreed that if the site has formatting information that separates the informational lines, it should be captured here. However, in general, deciding whether a given line is an ingredient or informational is a hard problem, and I don't think it belongs in the scraper.
A simple way of dealing with them could be to mark each informational line with a prefix like '# '. That way the ordering is preserved and you have an indicator of the special nature of the line.
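A rough sketch of how a caller might consume lines marked that way. The '# ' marker comes from the suggestion above; the split_marked_lines helper and the sample lines are made up for illustration and are not part of recipe-scrapers:

```python
# Hypothetical marker for informational lines, as suggested above.
INFO_PREFIX = "# "


def split_marked_lines(lines):
    """Separate real ingredients from informational lines marked with INFO_PREFIX.

    Returns (ingredients, notes); the marker is stripped from each note.
    """
    ingredients = [line for line in lines if not line.startswith(INFO_PREFIX)]
    notes = [line[len(INFO_PREFIX):] for line in lines if line.startswith(INFO_PREFIX)]
    return ingredients, notes


# Made-up sample data standing in for what a scraper might return.
sample_lines = [
    "# For the Dough",
    "2 cups flour",
    "1 tsp salt",
    "# For the Sauce",
    "1 can tomatoes",
]

ingredients, notes = split_marked_lines(sample_lines)
ingredient_count = len(ingredients)  # counts only the unmarked lines
```

The original ordering stays available in the unsplit list, so a caller that wants the grouping can still recover it from the marked lines.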
This sounds similar to #301 - the indicated ingredient lines are groupings.
The NIH scraper includes experimental support for an IngredientGroup dataclass - could that be relevant here too?
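To make the grouping idea concrete, here is a minimal sketch of what a grouped representation could look like. This is not the NIH scraper's actual code: the field names (title, ingredients), the group_lines helper, and the is_heading callback are all assumptions for illustration. Deciding which lines are headings is left to the caller, since that is the hard, site-specific part.

```python
from dataclasses import dataclass, field
from typing import Callable, Iterable, List, Optional


@dataclass
class IngredientGroup:
    """Hypothetical container: an optional group title plus its ingredient lines."""
    title: Optional[str] = None
    ingredients: List[str] = field(default_factory=list)


def group_lines(lines: Iterable[str], is_heading: Callable[[str], bool]) -> List[IngredientGroup]:
    """Fold a flat list of lines into groups, starting a new group at each heading."""
    groups = [IngredientGroup()]  # untitled default group for lines before any heading
    for line in lines:
        if is_heading(line):
            groups.append(IngredientGroup(title=line))
        else:
            groups[-1].ingredients.append(line)
    # Drop the default group if nothing landed in it.
    return [g for g in groups if g.title is not None or g.ingredients]


# Made-up sample data; the heading rule here is deliberately naive.
sample_lines = ["For the Dough", "2 cups flour", "1 tsp salt", "For the Sauce", "1 can tomatoes"]
groups = group_lines(sample_lines, is_heading=lambda line: line.startswith("For the "))

# A flat list of ingredients only can still be derived from the groups.
flat = [item for g in groups for item in g.ingredients]
```

With no headings at all, the same helper returns a single untitled group, so a flat ingredient list remains easy to recover.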
"However, in general, deciding whether a given line is an ingredient or informational is a hard problem, and I don't think it belongs in the scraper." 💯
In this version of the package, as well as in the future, .ingredients() will behave as it does now.
On a side note, a pip install recipe-scrapers[extras] version of the package is being evaluated. It would incorporate more serious tools that should fit your needs better.
Closing, as this won't be addressed in the next 3-6 months. Apologies if this is a much-wanted feature.
Consolidating feedback regarding informational lines here, and closing per-scraper feedback (#712).
Issue
Currently scraper.ingredients() includes informational lines that do not represent ingredients. See below for examples. These lines aren't ingredients, but carry information about preparation, most often by grouping ingredients.
Possible resolutions
Group ingredients: Expose a new scraper.grouped_ingredients() that retains the grouping information available from some pages, or defaults to a single group. Each group could include information such as a title (e.g., For the Dough).
Ignore informational lines: Spec that scrapers implement scraper.ingredients() in a manner that avoids informational lines if at all possible.
Impacted scrapers
Examples have been identified for these scrapers:
Examples
For https://www.seriouseats.com/new-england-greek-style-pizza:
For https://www.finedininglovers.com/recipes/bbq-watermelon-sashimi-secrets-fine-dining: