Closed jayaddison closed 6 days ago
This returns a lengthy output during a run of python -m unittest
, is this the expected behavior?
(.venv) % python -m unittest
....................................../recipe_scrapers/bestrecipes.py:19: UserWarning: bestrecipes.com.au doesn't seem to provide language metadata in their HTML. Please let us know if it becomes available in a standard location, and then we can try to retrieve it dynamically.
warnings.warn(msg)
................................................................................................................................................................................................................................./recipe_scrapers/potatorolls.py:52: UserWarning: potatorolls.com appears to write language metadata into HTML DOCTYPE declaration instead of the top-level 'html' element or elsewhere. Please let us know if it becomes available in a standard location, and then we can try to retrieve it dynamically.
warnings.warn(msg)
........................................................................................................................................................................................................................................................................................................
----------------------------------------------------------------------
Ran 559 tests in 12.732s
OK
It's kinda-expected that it causes noise in the unittest
output, yep. I'd wondered about that in another pull request, #1067, when discussing options for logging warnings for statically-defined fields (fields that only ever return a single value).
I had an idea and have pushed a commit (ff411e14e067fed809b18a2fa9be20622c3e174f) to implement it: we could ignore warnings from the scraper fields by default, but allow unit tests to be run with warning filters enabled, and add failures in the latter case when warnings occur.
Two drawbacks to that, though: it only works for Python3.11+, and also it means that the warnings are entirely hidden (no hints/reminders about them at all) when no Python warning options are configured.
$ python -m unittest -k potato
..
----------------------------------------------------------------------
Ran 2 tests in 0.172s
OK
$ python -Walways -m unittest -k potato
FF
======================================================================
FAIL: tests/test_data/potatorolls.com/potatorolls_1.json (tests.RecipeTestCase.tests/test_data/potatorolls.com/potatorolls_1.json) [language]
----------------------------------------------------------------------
Traceback (most recent call last):
File "/home/jka/Documents/reciperadar/recipe-scrapers/tests/__init__.py", line 110, in test_func
self.fail(msg=msg)
AssertionError: .language() emitted warnings: UserWarning
======================================================================
FAIL: tests/test_data/potatorolls.com/potatorolls_2.json (tests.RecipeTestCase.tests/test_data/potatorolls.com/potatorolls_2.json) [language]
----------------------------------------------------------------------
Traceback (most recent call last):
File "/home/jka/Documents/reciperadar/recipe-scrapers/tests/__init__.py", line 110, in test_func
self.fail(msg=msg)
AssertionError: .language() emitted warnings: UserWarning
----------------------------------------------------------------------
Ran 2 tests in 0.178s
FAILED (failures=2)
Although the warnings-related discussion is valuable, I think it's better to narrow this changeset down to only the one change that it claims to make, and so I've reverted my modifications to add warning-filtering/handling in the unit tests.
Instead of attempting to add clever self-adaptive warning filtering logic within the codebase itself, I think that following common usage patterns for Python warning filtering would make more sense.
Typically the -W
argument to the Python interpreter, and/or the PYTHONWARNINGS
environment variable are provided for that purpose.
In particular, I think that something along the lines of PYTHONWARNINGS='ignore::Warning:recipe_scrapers.plugins.static_values'
environment variable would be an example of using standard Python functionality to filter out only the static-value-related warnings when our unit tests run.
...and raise an
ElementNotFoundInHtml
exception to try to help us resolve cases where it isn't possible.cc @rmdluo @mlduff because we've been discussing how to handle and emit warning messages.