hhursev / recipe-scrapers

Python package for scraping recipes data
MIT License
1.6k stars 505 forks

Idea/suggestion: refactor: describe unavailable/static fields using exceptions. #1132

Closed (jayaddison closed this 4 days ago)

jayaddison commented 1 month ago

This would allow us to describe two cases in code: fields that a website does not provide (unavailable), and fields whose value is static.

In both cases, a placeholder return_value can be specified -- and by default that's what the calling code will receive.

If a developer/user wants to adjust that -- maybe they don't want any static/unknown values to appear in their application -- they can disable the StaticValueExceptionHandlingPlugin.
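As a rough illustration of the mechanism described above, here is a minimal, self-contained sketch. It does not import recipe_scrapers and is not the package's actual implementation: StaticValueException and the return_value argument come from this discussion, while FieldNotProvidedByWebsiteException, StaticValueWarning, ExampleScraper and the get_field helper (a stand-in for the handling plugin) are names assumed here for illustration only.

```python
import warnings


class StaticValueException(Exception):
    """Raised by a scraper method whose value is hard-coded, not scraped."""

    def __init__(self, return_value):
        super().__init__(f"static value: {return_value!r}")
        self.return_value = return_value  # placeholder handed to the caller


class FieldNotProvidedByWebsiteException(StaticValueException):
    """Assumed name: raised when a site is not known to provide a field."""


class StaticValueWarning(UserWarning):
    """Assumed name for the warning emitted for static values."""


class FieldNotProvidedByWebsiteWarning(StaticValueWarning):
    """Local stand-in for the warning emitted for unavailable fields."""


class ExampleScraper:
    """Hypothetical scraper; not one of the package's real scrapers."""

    def host(self):
        return "example.com"

    def author(self):
        raise StaticValueException(return_value="Example Author")

    def total_time(self):
        raise FieldNotProvidedByWebsiteException(return_value=None)


def get_field(scraper, name, handle_static_values=True):
    """Stand-in for the plugin: swap the exception for its placeholder."""
    method = getattr(scraper, name)
    if not handle_static_values:
        # Handling disabled: the exception reaches the calling code.
        return method()
    try:
        return method()
    except FieldNotProvidedByWebsiteException as e:
        warnings.warn(
            f"{scraper.host()} doesn't seem to support the {name} field.",
            FieldNotProvidedByWebsiteWarning,
        )
        return e.return_value
    except StaticValueException as e:
        warnings.warn(
            f"{scraper.host()} returns a static value for the {name} field.",
            StaticValueWarning,
        )
        return e.return_value


scraper = ExampleScraper()
print(get_field(scraper, "author"))      # placeholder value, plus a warning
print(get_field(scraper, "total_time"))  # placeholder value, plus a warning
```

Calling get_field(scraper, "author", handle_static_values=False) in this sketch lets the StaticValueException propagate, which corresponds to an application that has disabled the handling and wants to reject static/unknown values itself.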

Credits

This suggestion includes ideas from discussion with @mlduff and @rmdluo, particularly from pull requests #1067 and #1098.

Notes

Currently this also emits warnings in our unittest output. See #1112 for an idea about how that could be adjusted in future.

$ python -m unittest -k davidlebovitz
/home/jka/Documents/reciperadar/recipe-scrapers/recipe_scrapers/plugins/static_values.py:55: FieldNotProvidedByWebsiteWarning: davidlebovitz.com doesn't seem to support the total_time field. If you know this to be untrue for some recipe, please submit a bug report at https://github.com/hhursev/recipe-scrapers/issues
  warnings.warn(
.
----------------------------------------------------------------------
Ran 1 test in 0.142s

OK
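
Independently of whatever #1112 proposes (not reproduced here), the standard library's warnings module can already silence a single warning category around a call. The sketch below assumes a local stand-in for FieldNotProvidedByWebsiteWarning and a hypothetical total_time function; it is an illustration, not the project's test setup.

```python
import warnings


class FieldNotProvidedByWebsiteWarning(UserWarning):
    """Local stand-in for the warning category shown in the output above."""


def total_time():
    """Hypothetical field accessor that warns and returns a placeholder."""
    warnings.warn(
        "example.com doesn't seem to support the total_time field.",
        FieldNotProvidedByWebsiteWarning,
    )
    return None


def call_quietly(func):
    """Call func with only this one warning category silenced."""
    with warnings.catch_warnings():
        warnings.simplefilter("ignore", category=FieldNotProvidedByWebsiteWarning)
        return func()


value = call_quietly(total_time)  # no warning is shown for this call
```

The filter is restored when the with block exits, so other warnings, and later calls made outside call_quietly, are reported as usual.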

mlduff commented 1 month ago

This looks really good - would we want to also use it for the author attribute on sites where it has just been hard-coded? Or since this is more inferred from the site itself (especially on sites where it is one person's blog), there is no need?

jayaddison commented 1 month ago

Yep, that's a good use-case for this too. It seems we have twenty or so author fields that return static string values. I'll add some changes to this branch soon to migrate them to use StaticValueException too.

jayaddison commented 1 month ago

Perhaps refactoring the exceptions into decorators would provide an easier way to indicate affected methods? That way, we could add/remove the decorators without changes to the code inside the original method.
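
One possible shape for that decorator idea, as a sketch only: the decorator name static_value and the ExampleScraper class are hypothetical, and StaticValueException is redefined locally so the example runs on its own.

```python
import functools


class StaticValueException(Exception):
    """Local stand-in: carries the placeholder value for the caller."""

    def __init__(self, return_value):
        super().__init__(f"static value: {return_value!r}")
        self.return_value = return_value


def static_value(method):
    """Hypothetical decorator: report the method's result as a static value."""

    @functools.wraps(method)
    def wrapper(*args, **kwargs):
        # The original method body is left untouched; its result becomes
        # the placeholder attached to the exception.
        raise StaticValueException(return_value=method(*args, **kwargs))

    return wrapper


class ExampleScraper:
    """Hypothetical scraper with a hard-coded author."""

    @static_value
    def author(self):
        return "Example Author"


try:
    ExampleScraper().author()
except StaticValueException as e:
    print(e.return_value)
```

Removing the @static_value line in this sketch turns author back into an ordinary method that returns the string directly, with no change to the method body.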