buriy / python-readability

fast python port of arc90's readability tool, updated to match latest readability.js!
https://github.com/buriy/python-readability
Apache License 2.0

`test_many_repeated_spaces` fails on darwin python 3.8, 3.9 & 3.10 #177

Closed MaxDaten closed 1 year ago

MaxDaten commented 1 year ago

Steps to reproduce:

    python -m venv venv
    source venv/bin/activate
    pip install tox
    tox

Error Log

```
py310: commands[1]> py.test
============================= test session starts ==============================
platform darwin -- Python 3.10.10, pytest-7.3.1, pluggy-1.0.0
cachedir: .tox/py310/.pytest_cache
rootdir: /Users/jloos/Developer/python-readability
collected 11 items

tests/test_article_only.py ....F......                                   [100%]

=================================== FAILURES ===================================
__________________ TestArticleOnly.test_many_repeated_spaces ___________________
.tox/py310/lib/python3.10/site-packages/timeout_decorator/timeout_decorator.py:92: in new_function
    return timeout_wrapper(*args, **kwargs)
.tox/py310/lib/python3.10/site-packages/timeout_decorator/timeout_decorator.py:147: in __call__
    self.__process.start()
/nix/store/7rjqb838snvvxcmpvck1smfxhkwzqal5-python3-3.10.10/lib/python3.10/multiprocessing/process.py:121: in start
    self._popen = self._Popen(self)
/nix/store/7rjqb838snvvxcmpvck1smfxhkwzqal5-python3-3.10.10/lib/python3.10/multiprocessing/context.py:224: in _Popen
    return _default_context.get_context().Process._Popen(process_obj)
/nix/store/7rjqb838snvvxcmpvck1smfxhkwzqal5-python3-3.10.10/lib/python3.10/multiprocessing/context.py:288: in _Popen
    return Popen(process_obj)
/nix/store/7rjqb838snvvxcmpvck1smfxhkwzqal5-python3-3.10.10/lib/python3.10/multiprocessing/popen_spawn_posix.py:32: in __init__
    super().__init__(process_obj)
/nix/store/7rjqb838snvvxcmpvck1smfxhkwzqal5-python3-3.10.10/lib/python3.10/multiprocessing/popen_fork.py:19: in __init__
    self._launch(process_obj)
/nix/store/7rjqb838snvvxcmpvck1smfxhkwzqal5-python3-3.10.10/lib/python3.10/multiprocessing/popen_spawn_posix.py:47: in _launch
    reduction.dump(process_obj, fp)
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _

obj = , file = <_io.BytesIO object at 0x107bfbf60>, protocol = None

    def dump(obj, file, protocol=None):
        '''Replacement for pickle.dump() using ForkingPickler.'''
>       ForkingPickler(file, protocol).dump(obj)
E       _pickle.PicklingError: Can't pickle : it's not the same object as tests.test_article_only.TestArticleOnly.test_many_repeated_spaces

/nix/store/7rjqb838snvvxcmpvck1smfxhkwzqal5-python3-3.10.10/lib/python3.10/multiprocessing/reduction.py:60: PicklingError
=========================== short test summary info ============================
FAILED tests/test_article_only.py::TestArticleOnly::test_many_repeated_spaces - _pickle.PicklingError: Can't pickle : it's not the same object as tests.test_article_only.TestArticleOnly.test_many_repeated_spaces
========================= 1 failed, 10 passed in 0.14s =========================
py310: exit 1 (0.28 seconds) /Users/jloos/Developer/python-readability> py.test pid=31539
py310: FAIL ✖ in 1.43 seconds
pypy: skipped because could not find python interpreter with spec(s): pypy
pypy: SKIP ⚠ in 0 seconds
pypy3: skipped because could not find python interpreter with spec(s): pypy3
pypy3: SKIP ⚠ in 0 seconds
doc: install_package> python -I -m pip install --force-reinstall --no-deps /Users/jloos/Developer/python-readability/.tox/.tmp/package/43/readability-lxml-0.8.2.tar.gz
doc: commands[0]> python setup.py build_sphinx
running build_sphinx
/Users/jloos/Developer/python-readability/.tox/doc/lib/python3.8/site-packages/setuptools/_distutils/dist.py:988: RemovedInSphinx70Warning: setup.py build_sphinx is deprecated.
  cmd_obj.run()
Running Sphinx v6.1.3
WARNING: Invalid configuration value found: 'language = None'. Update your configuration to a valid language code. Falling back to 'en' (English).
loading pickled environment... done
building [mo]: targets for 0 po files that are out of date
writing output...
building [html]: targets for 0 source files that are out of date
updating environment: 0 added, 0 changed, 0 removed
reading sources...
looking for now-outdated files... none found
no targets are out of date.
build succeeded, 1 warning.

The HTML pages are in build/sphinx/html.
.pkg: _exit> python /Users/jloos/Developer/python-readability/venv/lib/python3.8/site-packages/pyproject_api/_backend.py True setuptools.build_meta __legacy__
  py27: SKIP (0.01 seconds)
  py35: SKIP (0.01 seconds)
  py36: SKIP (0.00 seconds)
  py37: SKIP (0.00 seconds)
  py38: FAIL code 1 (2.07=setup[0.81]+cmd[0.58,0.68] seconds)
  py39: FAIL code 1 (2.15=setup[1.24]+cmd[0.60,0.31] seconds)
  py310: FAIL code 1 (1.43=setup[0.60]+cmd[0.55,0.28] seconds)
  pypy: SKIP (0.00 seconds)
  pypy3: SKIP (0.00 seconds)
  doc: OK (1.04=setup[0.60]+cmd[0.44] seconds)
  evaluation failed :( (6.77 seconds)
```
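My reading of the traceback, not verified against the `timeout_decorator` source: the frames go through `popen_spawn_posix.py`, so the child process is started with the `spawn` method (the default on macOS since Python 3.8), which has to pickle the target function. Functions are pickled by qualified name, and the decorator has rebound that name to its wrapper, so the lookup returns a different object. A minimal stdlib-only sketch of that effect, where `replace_with_wrapper`, `slow_test` and `try_pickle_original` are hypothetical names made up for illustration:

```python
import pickle


def replace_with_wrapper(func):
    """Hypothetical decorator: rebinds the name to a wrapper object."""

    class Wrapper:
        def __init__(self, wrapped):
            # Keep a reference to the original, undecorated function.
            self.wrapped = wrapped

        def __call__(self, *args, **kwargs):
            return self.wrapped(*args, **kwargs)

    return Wrapper(func)


@replace_with_wrapper
def slow_test():
    return "done"


def try_pickle_original():
    """Try to pickle the undecorated function, as a spawned child would need."""
    try:
        # pickle stores functions by module and qualified name; that name
        # no longer resolves to the original function object.
        pickle.dumps(slow_test.wrapped)
    except pickle.PicklingError as exc:
        return str(exc)
    return None


if __name__ == "__main__":
    print(try_pickle_original())
```

Under the `fork` start method nothing is pickled, which would explain why the same test only fails on darwin.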

Changing

https://github.com/buriy/python-readability/blob/master/tests/test_article_only.py#L104:L104

to

    @timeout_decorator.timeout(seconds=3, use_signals=True)

makes the test pass, though using signals might bring caveats of its own.
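To make those caveats concrete, here is a rough stdlib sketch of how a signal-based timeout generally works. It is not the `timeout_decorator` implementation, and `signal_timeout`, `too_slow` and `fast_enough` are hypothetical names. It relies on `SIGALRM`, so it only works on Unix and only when called from the main thread:

```python
import functools
import signal
import time


def signal_timeout(seconds):
    """Hypothetical signal-based timeout decorator (Unix, main thread only)."""

    def decorator(func):
        @functools.wraps(func)
        def wrapper(*args, **kwargs):
            def on_alarm(signum, frame):
                raise TimeoutError(f"{func.__name__} exceeded {seconds}s")

            # Install the handler and arm a one-shot real-time timer.
            previous = signal.signal(signal.SIGALRM, on_alarm)
            signal.setitimer(signal.ITIMER_REAL, seconds)
            try:
                return func(*args, **kwargs)
            finally:
                # Always disarm the timer and restore the old handler.
                signal.setitimer(signal.ITIMER_REAL, 0)
                signal.signal(signal.SIGALRM, previous)

        return wrapper

    return decorator


@signal_timeout(0.2)
def too_slow():
    time.sleep(2)


@signal_timeout(1)
def fast_enough():
    return "ok"
```

Because everything runs in the calling process, no pickling is involved, but the approach would not work on Windows or in tests executed from worker threads.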

Alternatively, the timeout library could be changed to https://github.com/bitranox/wrapt_timeout_decorator

See patch: wrapt_timeout_decorator.patch