Closed jayaddison closed 2 months ago
Most of this functionality can be achieved simply by moving ingredient_groups
from the MANDATORY_TESTS
list into the OPTIONAL_TESTS
list, and I think that's the most effective way to resolve this bug.
However, it has made me wonder about something during implementation/investigation: perhaps we should require fields -- even in the OPTIONAL_TESTS
list -- to have a matching test data entry when they override the default implementation.
Detecting whether a method is overridden can be somewhat complicated due to the presence of plugin-related decorators (ExampleScraper.nutrients == AbstractScraper.nutrients
may return False
even when there is no override for nutrients on ExampleScraper
).
A short one-liner script to more-or-less determine scrapers that have not overridden ingredient_groups
:
print('\n'.join(sorted({scraper.host() for scraper in recipe_scrapers.SCRAPERS.values() if hasattr(scraper, 'ingredient_groups') and scraper.ingredient_groups == recipe_scrapers.AbstractScraper.ingredient_groups})))
Edit: de-duplicate, and print results per-line.
...plus a little bash scripting open the relevant filenames for editing...
vim $(python -c "import recipe_scrapers; print('\n'.join(sorted({scraper.host() for scraper in recipe_scrapers.SCRAPERS.values() if hasattr(scraper, 'ingredient_groups') and scraper.ingredient_groups == recipe_scrapers.AbstractScraper.ingredient_groups})))" | while read line; do grep -l ingredient_groups tests/test_data/$line/*.json; done )
...find-and-remove entries with some vim
pattern-matching:
/"ingredient_groups" # find ingredient groups in the current file
d% # delete to the end of the current bracketed-group (matches the `[` on each line and deletes up to the corresponding `]`)
dd # cleanup the remaining doublequotes and trailing comma that remain
(repeated a lot -- I ended up skipping the dd
and doing those in batch later via :bufdo!%s/^ ",\n//g|wn
)
The fact that the data-driven JSON tests have an expectation on the
ingredient_groups
field makes it a bit more challenging to write new scrapers, I think. My sense is that it's not the most initially-intuitive of fields for beginners -- partly because it duplicates another field, and partly because of the way it inter-relates with that other field (ingredients must match, and yet must be grouped -- and although the utility functions make the code for that relatively concise, the details can be a bit complex to understand).