apache / beam

Apache Beam is a unified programming model for Batch and Streaming data processing.
https://beam.apache.org/
Apache License 2.0

[Failing Test]: beam_PreCommit_Python_Coverage suite fails with ModuleNotFoundError: No module named 'pydantic._hypothesis_plugin' and `pip check` failures #30852

Closed tvalentyn closed 7 months ago

tvalentyn commented 7 months ago

What happened?

The 'coverage' suite runs some Beam unit tests in environments with different versions of a particular dependency; for example, we test several versions of pyarrow or tft. The py38-tft-113 suite currently fails, likely due to incompatible dependencies in the tox environment:

============================= test session starts ==============================
Plugin: terminalreporter, Hook: pytest_sessionfinish
ModuleNotFoundError: No module named 'pydantic._hypothesis_plugin'
For more information see https://pluggy.readthedocs.io/en/stable/api_reference.html#pluggy.PluggyTeardownRaisedWarning
  config.hook.pytest_sessionfinish(
Traceback (most recent call last):
  File "/runner/_work/beam/beam/sdks/python/test-suites/tox/py38/build/srcs/sdks/python/target/.tox-py38-tft-113/py38-tft-113/bin/pytest", line 10, in <module>
    sys.exit(console_main())
  File "/runner/_work/beam/beam/sdks/python/test-suites/tox/py38/build/srcs/sdks/python/target/.tox-py38-tft-113/py38-tft-113/lib/python3.8/site-packages/_pytest/config/__init__.py", line 192, in console_main
    code = main()
  File "/runner/_work/beam/beam/sdks/python/test-suites/tox/py38/build/srcs/sdks/python/target/.tox-py38-tft-113/py38-tft-113/lib/python3.8/site-packages/_pytest/config/__init__.py", line 169, in main
    ret: Union[ExitCode, int] = config.hook.pytest_cmdline_main(
  File "/runner/_work/beam/beam/sdks/python/test-suites/tox/py38/build/srcs/sdks/python/target/.tox-py38-tft-113/py38-tft-113/lib/python3.8/site-packages/pluggy/_hooks.py", line 501, in __call__
    return self._hookexec(self.name, self._hookimpls.copy(), kwargs, firstresult)
  File "/runner/_work/beam/beam/sdks/python/test-suites/tox/py38/build/srcs/sdks/python/target/.tox-py38-tft-113/py38-tft-113/lib/python3.8/site-packages/pluggy/_manager.py", line 119, in _hookexec
    return self._inner_hookexec(hook_name, methods, kwargs, firstresult)
  File "/runner/_work/beam/beam/sdks/python/test-suites/tox/py38/build/srcs/sdks/python/target/.tox-py38-tft-113/py38-tft-113/lib/python3.8/site-packages/pluggy/_callers.py", line 138, in _multicall
    raise exception.with_traceback(exception.__traceback__)
  File "/runner/_work/beam/beam/sdks/python/test-suites/tox/py38/build/srcs/sdks/python/target/.tox-py38-tft-113/py38-tft-113/lib/python3.8/site-packages/pluggy/_callers.py", line 102, in _multicall
    res = hook_impl.function(*args)
  File "/runner/_work/beam/beam/sdks/python/test-suites/tox/py38/build/srcs/sdks/python/target/.tox-py38-tft-113/py38-tft-113/lib/python3.8/site-packages/_pytest/main.py", line 318, in pytest_cmdline_main
    return wrap_session(config, _main)
  File "/runner/_work/beam/beam/sdks/python/test-suites/tox/py38/build/srcs/sdks/python/target/.tox-py38-tft-113/py38-tft-113/lib/python3.8/site-packages/_pytest/main.py", line 306, in wrap_session
    config.hook.pytest_sessionfinish(
  File "/runner/_work/beam/beam/sdks/python/test-suites/tox/py38/build/srcs/sdks/python/target/.tox-py38-tft-113/py38-tft-113/lib/python3.8/site-packages/pluggy/_hooks.py", line 501, in __call__
    return self._hookexec(self.name, self._hookimpls.copy(), kwargs, firstresult)
  File "/runner/_work/beam/beam/sdks/python/test-suites/tox/py38/build/srcs/sdks/python/target/.tox-py38-tft-113/py38-tft-113/lib/python3.8/site-packages/pluggy/_manager.py", line 119, in _hookexec
    return self._inner_hookexec(hook_name, methods, kwargs, firstresult)
  File "/runner/_work/beam/beam/sdks/python/test-suites/tox/py38/build/srcs/sdks/python/target/.tox-py38-tft-113/py38-tft-113/lib/python3.8/site-packages/pluggy/_callers.py", line 155, in _multicall
    teardown[0].send(outcome)
  File "/runner/_work/beam/beam/sdks/python/test-suites/tox/py38/build/srcs/sdks/python/target/.tox-py38-tft-113/py38-tft-113/lib/python3.8/site-packages/_pytest/terminal.py", line 867, in pytest_sessionfinish
    self.config.hook.pytest_terminal_summary(
  File "/runner/_work/beam/beam/sdks/python/test-suites/tox/py38/build/srcs/sdks/python/target/.tox-py38-tft-113/py38-tft-113/lib/python3.8/site-packages/pluggy/_hooks.py", line 501, in __call__
    return self._hookexec(self.name, self._hookimpls.copy(), kwargs, firstresult)
  File "/runner/_work/beam/beam/sdks/python/test-suites/tox/py38/build/srcs/sdks/python/target/.tox-py38-tft-113/py38-tft-113/lib/python3.8/site-packages/pluggy/_manager.py", line 119, in _hookexec
    return self._inner_hookexec(hook_name, methods, kwargs, firstresult)
  File "/runner/_work/beam/beam/sdks/python/test-suites/tox/py38/build/srcs/sdks/python/target/.tox-py38-tft-113/py38-tft-113/lib/python3.8/site-packages/pluggy/_callers.py", line 181, in _multicall
    return outcome.get_result()
  File "/runner/_work/beam/beam/sdks/python/test-suites/tox/py38/build/srcs/sdks/python/target/.tox-py38-tft-113/py38-tft-113/lib/python3.8/site-packages/pluggy/_result.py", line 99, in get_result
    raise exc.with_traceback(exc.__traceback__)
  File "/runner/_work/beam/beam/sdks/python/test-suites/tox/py38/build/srcs/sdks/python/target/.tox-py38-tft-113/py38-tft-113/lib/python3.8/site-packages/pluggy/_callers.py", line 102, in _multicall
    res = hook_impl.function(*args)
  File "/runner/_work/beam/beam/sdks/python/test-suites/tox/py38/build/srcs/sdks/python/target/.tox-py38-tft-113/py38-tft-113/lib/python3.8/site-packages/_hypothesis_pytestplugin.py", line 391, in pytest_terminal_summary
    from hypothesis.internal.observability import _WROTE_TO
  File "/runner/_work/beam/beam/sdks/python/test-suites/tox/py38/build/srcs/sdks/python/target/.tox-py38-tft-113/py38-tft-113/lib/python3.8/site-packages/_pytest/assertion/rewrite.py", line 186, in exec_module
    exec(co, module.__dict__)
  File "/runner/_work/beam/beam/sdks/python/test-suites/tox/py38/build/srcs/sdks/python/target/.tox-py38-tft-113/py38-tft-113/lib/python3.8/site-packages/hypothesis/__init__.py", line 57, in <module>
    run()
  File "/runner/_work/beam/beam/sdks/python/test-suites/tox/py38/build/srcs/sdks/python/target/.tox-py38-tft-113/py38-tft-113/lib/python3.8/site-packages/hypothesis/entry_points.py", line 35, in run
    hook = entry.load()
  File "/runner/_work/beam/beam/sdks/python/test-suites/tox/py38/build/srcs/sdks/python/target/.tox-py38-tft-113/py38-tft-113/lib/python3.8/site-packages/setuptools/_vendor/importlib_metadata/__init__.py", line 208, in load
    module = import_module(match.group('module'))
  File "/opt/hostedtoolcache/Python/3.8.18/x64/lib/python3.8/importlib/__init__.py", line 127, in import_module
    return _bootstrap._gcd_import(name[level:], package, level)
ModuleNotFoundError: No module named 'pydantic._hypothesis_plugin'
platform linux -- Python 3.8.18, pytest-7.4.4, pluggy-1.4.0
cachedir: target/.tox-py38-tft-113/py38-tft-113/.pytest_cache
rootdir: /runner/_work/beam/beam/sdks/python/test-suites/tox/py38/build/srcs/sdks/python
configfile: pytest.ini
plugins: xdist-3.5.0, timeout-2.3.1, requests-mock-1.12.1, hypothesis-6.100.0
timeout: 600.0s
timeout method: signal
timeout func_only: False
collected 129 items / 129 deselected / 0 selected

- generated xml file: /runner/_work/beam/beam/sdks/python/test-suites/tox/py38/build/srcs/sdks/python/pytest_py38-tft-113_no_xdist.xml -
py38-tft-113: exit 1 (86.04 seconds) /runner/_work/beam/beam/sdks/python/test-suites/tox/py38/build/srcs/sdks/python> bash /runner/_work/beam/beam/sdks/python/test-suites/tox/py38/build/srcs/sdks/python/scripts/run_pytest.sh py38-tft-113 'apache_beam/ml/transforms apache_beam/examples/snippets/transforms/elementwise/mltransform_test.py' pid=63936
py38-tft-113: commands_post[0]> bash /runner/_work/beam/beam/sdks/python/test-suites/tox/py38/build/srcs/sdks/python/scripts/run_tox_cleanup.sh
  py38-tft-113: FAIL code 1 (420.51=setup[332.64]+cmd[0.01,0.33,1.44,0.03,86.04,0.03] seconds)
  evaluation failed :( (420.70 seconds)

Clues are in dependencies that were installed:

py38-tft-113: install_package_deps> target/.tox-py38-tft-113/py38-tft-113/bin/python /runner/_work/beam/beam/sdks/python/test-suites/tox/py38/build/srcs/sdks/python/target/.tox-py38-tft-113/py38-tft-113/bin/pip install --retries 10 --pre 'cloudpickle~=2.2.1' 'crcmod<2.0,>=1.7' 'cryptography>=41.0.2' 'dill<0.3.2,>=0.3.1.1' 'docstring-parser<1.0,>=0.15' 'fastavro<2,>=0.23.6' 'fasteners<1.0,>=0.3' 'freezegun>=0.3.12' 'grpcio!=1.48.0,<2,>=1.33.1' 'hdfs<3.0.0,>=2.1.0' 'httplib2<0.23.0,>=0.8' 'hypothesis<=7.0.0,>5.0.0' 'joblib>=1.0.1' 'js2py<1,>=0.74; python_version < "3.12"' 'jsonpickle<4.0.0,>=3.0.0' 'jsonschema<5.0.0,>=4.0.0' 'mock<6.0.0,>=1.0.1' 'numpy<1.27.0,>=1.14.3' 'objsize<0.8.0,>=0.6.1' 'orjson<4,>=3.9.7' 'packaging>=22.0' 'pandas!=1.5.0,!=1.5.1,<2.1,>=1.4.3; python_version >= "3.8"' 'pandas<2.1.0' 'parameterized<0.10.0,>=0.7.1' 'proto-plus<2,>=1.7.1' 'protobuf!=4.0.*,!=4.21.*,!=4.22.0,!=4.23.*,!=4.24.*,<4.26.0,>=3.20.3' 'psycopg2-binary<3.0.0,>=2.8.5' 'pyarrow-hotfix<1' 'pyarrow<15.0.0,>=3.0.0' 'pydot<2,>=1.2.0' 'pyhamcrest!=1.10.0,<3.0.0,>=1.9' 'pymongo<5.0.0,>=3.8.0' 'pytest-timeout<3,>=2.1.0' 'pytest-xdist<4,>=2.5.0' 'pytest<8.0,>=7.1.2' 'python-dateutil<3,>=2.8.0' 'pytz>=2018.3' 'pyyaml<7.0.0,>=3.12' 'redis<6,>=5.0.0' 'regex>=2020.6.8' 'requests-mock<2.0,>=1.7' 'requests<3.0.0,>=2.24.0' 'scikit-learn>=0.20.0' 'sqlalchemy<2.0,>=1.3' 'tenacity<9,>=8.0.0' 'testcontainers[mysql]<4.0.0,>=3.0.3' 'typing-extensions>=3.7.0' 'zstandard<1,>=0.18.0'
py38-tft-113: install_package> target/.tox-py38-tft-113/py38-tft-113/bin/python /runner/_work/beam/beam/sdks/python/test-suites/tox/py38/build/srcs/sdks/python/target/.tox-py38-tft-113/py38-tft-113/bin/pip install --retries 10 --pre --force-reinstall --no-deps /runner/_work/beam/beam/sdks/python/test-suites/tox/py38/build/srcs/sdks/python/target/.tox-py38-tft-113/.tmp/package/1/apache-beam-2.56.0.dev0.tar.gz
py38-tft-113: freeze> target/.tox-py38-tft-113/py38-tft-113/bin/python /runner/_work/beam/beam/sdks/python/test-suites/tox/py38/build/srcs/sdks/python/target/.tox-py38-tft-113/py38-tft-113/bin/pip freeze
py38-tft-113: absl-py==1.4.0,annotated-types==0.6.0,apache-beam @ file:///runner/_work/beam/beam/sdks/python/test-suites/tox/py38/build/srcs/sdks/python/target/.tox-py38-tft-113/.tmp/package/1/apache-beam-2.56.0.dev0.tar.gz#sha256=af9d8c12f56cce760e8ef3014f6edc25ec24d2a84ffa708ff45ca19e2fe866bd,astunparse==1.6.3,async-timeout==4.0.3,attrs==23.2.0,backports.zoneinfo==0.2.1,cachetools==5.3.3,certifi==2024.2.2,cffi==1.16.0,charset-normalizer==3.3.2,cloudpickle==2.2.1,crcmod==1.7,cryptography==42.0.5,Deprecated==1.2.14,deprecation==2.1.0,dill==0.3.1.1,dnspython==2.6.1,docker==7.0.0,docopt==0.6.2,docstring_parser==0.16,exceptiongroup==1.2.0,execnet==2.0.2,fastavro==1.9.4,fasteners==0.19,flatbuffers==24.3.25,freezegun==1.4.0,gast==0.4.0,google-api-core==2.18.0,google-api-python-client==1.12.11,google-apitools==0.5.31,google-auth==2.29.0,google-auth-httplib2==0.1.1,google-auth-oauthlib==1.0.0,google-cloud-aiplatform==1.46.0,google-cloud-bigquery==3.20.1,google-cloud-bigquery-storage==2.24.0,google-cloud-bigtable==2.23.0,google-cloud-core==2.4.1,google-cloud-datastore==2.19.0,google-cloud-dlp==3.16.0,google-cloud-language==2.13.3,google-cloud-pubsub==2.21.0,google-cloud-pubsublite==1.9.0,google-cloud-recommendations-ai==0.10.10,google-cloud-resource-manager==1.12.3,google-cloud-spanner==3.44.0,google-cloud-storage==2.16.0,google-cloud-videointelligence==2.13.3,google-cloud-vision==3.7.2,google-crc32c==1.5.0,google-pasta==0.2.0,google-resumable-media==2.7.0,googleapis-common-protos==1.63.0,greenlet==3.0.3,grpc-google-iam-v1==0.13.0,grpc-interceptor==0.15.4,grpcio==1.62.1,grpcio-status==1.62.1,h5py==3.10.0,hdfs==2.7.3,httplib2==0.22.0,hypothesis==6.100.0,idna==3.6,importlib_metadata==7.1.0,importlib_resources==6.4.0,iniconfig==2.0.0,joblib==1.3.2,Js2Py==0.74,jsonpickle==3.0.3,jsonschema==4.21.1,jsonschema-specifications==2023.12.1,keras==2.13.1,libclang==18.1.1,Markdown==3.6,MarkupSafe==2.1.5,mock==5.1.0,numpy==1.22.4,oauth2client==4.1.3,oauthlib==3.2.2,objsize==0.7.0,opt-einsum==3.3.0,orjson==3.10.0,overrides==7.7.0,packaging==24.0,pandas==1.5.3,parameterized==0.9.0,pkgutil_resolve_name==1.3.10,pluggy==1.4.0,proto-plus==1.24.0.dev0,protobuf==4.25.3,psycopg2-binary==2.9.9,pyarrow==6.0.1,pyarrow-hotfix==0.6,pyasn1==0.6.0,pyasn1_modules==0.4.0,pycparser==2.22,pydantic==2.0a4,pydantic_core==0.30.0,pydot==1.4.2,PyHamcrest==2.1.0,pyjsparser==2.7.1,pymongo==4.6.3,PyMySQL==1.1.0,pyparsing==3.1.2,pytest==7.4.4,pytest-timeout==2.3.1,pytest-xdist==3.5.0,python-dateutil==2.9.0.post0,pytz==2024.1,PyYAML==6.0.1,redis==5.1.0b4,referencing==0.34.0,regex==2023.12.25,requests==2.31.0,requests-mock==1.12.1,requests-oauthlib==2.0.0,rpds-py==0.18.0,rsa==4.9,scikit-learn==1.3.2,scipy==1.10.1,shapely==2.0.3,six==1.16.0,sortedcontainers==2.4.0,SQLAlchemy==1.4.52,sqlparse==0.4.4,tenacity==8.2.3,tensorboard==2.13.0,tensorboard-data-server==0.7.2,tensorflow==2.13.1,tensorflow-estimator==2.13.0,tensorflow-io-gcs-filesystem==0.34.0,tensorflow-metadata==1.13.1,tensorflow-serving-api==2.13.1,tensorflow-transform==1.13.0,termcolor==2.4.0,testcontainers==3.7.1,tfx-bsl==1.13.0,threadpoolctl==3.4.0,tomli==2.0.1,typing_extensions==4.5.0,tzlocal==5.2,uritemplate==3.0.1,urllib3==2.2.1,Werkzeug==3.0.2,wrapt==1.16.0,zipp==3.18.1,zstandard==0.22.0

Issue Failure

Failure: Test is continually failing

Issue Priority

Priority: 1 (unhealthy code / failing or flaky postcommit so we cannot be sure the product is healthy)

Issue Components

tvalentyn commented 7 months ago

Sample error: https://github.com/apache/beam/actions/runs/8548288264/job/23421775819?pr=30843

tvalentyn commented 7 months ago

Not sure yet how pydantic comes into the picture here, but for 'pydantic._hypothesis_plugin' to be available, it seems that we need 1.0.0<pydantic<2.0.0.

tvalentyn commented 7 months ago

Beam test infra installs pre-release dependencies to detect possible issues ahead of releases. The command:

 pip install --pre "tensorflow_transform>=1.13.0,<1.14.0" apache-beam[gcp,test]

installs pydantic==2.0a4

The command

 pip install  "tensorflow_transform>=1.13.0,<1.14.0" apache-beam[gcp,test]

installs pydantic==1.10.15

The tft requirement comes from: https://github.com/apache/beam/blob/21129a41e031c150c3f610639d71a95a3a941243/sdks/python/tox.ini#L316

tvalentyn commented 7 months ago

from pipdeptree:

* google-cloud-aiplatform==1.46.0
 - proto-plus [required: >=1.22.0,<2.0.0dev, installed: 1.24.0.dev0]
 - pydantic [required: <3, installed: 2.0a4]

This seems to be incorrect installer behavior: 2.0a4 shouldn't be installed under these constraints.

not sure how that happens.

What likely triggered the error for us was a recent google-cloud-aiplatform release (https://pypi.org/project/google-cloud-aiplatform/#history), which added the pydantic dependency.

tvalentyn commented 7 months ago

pydantic [required: <3, installed: 2.0a4] this seems to be an incorrect installer behavior. 2.0a4 shouldn't be installed under these constraints.

Actually, I misread this: 2.0a4 still fits the range, but the chosen version is strange; there might be more constraints.
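The effect of --pre on the ranges quoted above can be checked with a small sketch. It assumes the third-party `packaging` library is installed; the helper name `accepted` is made up for illustration, and the candidate list contains only the two pydantic versions mentioned in this thread, not everything pip considers.

```python
from packaging.specifiers import SpecifierSet

# The two pydantic versions reported in the comments above.
CANDIDATES = ["1.10.15", "2.0a4"]


def accepted(specifier, allow_prereleases):
    """Return the candidate versions that satisfy `specifier`.

    `allow_prereleases=True` roughly corresponds to passing --pre to pip.
    """
    spec = SpecifierSet(specifier)
    return [
        version
        for version in CANDIDATES
        if spec.contains(version, prereleases=allow_prereleases)
    ]


if __name__ == "__main__":
    # Range declared by google-cloud-aiplatform (from the pipdeptree output).
    print(accepted("<3", allow_prereleases=False))
    print(accepted("<3", allow_prereleases=True))
    # Range that appears to be needed for pydantic._hypothesis_plugin.
    print(accepted(">1.0.0,<2.0.0", allow_prereleases=True))
```

With pre-releases allowed, 2.0a4 satisfies "<3", while the 1.x-only range rejects it. This only checks whether a version fits a range; it does not explain why pip settles on 2.0a4 in particular.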

tvalentyn commented 7 months ago

Even setting aside that pip selects a bizarre version of pydantic, the pydantic hypothesis plugin seems broken in pydantic 2.

The import 'from pydantic import _hypothesis_plugin' fails; the more correct import, 'from pydantic.v1 import _hypothesis_plugin', also fails with recent versions of hypothesis.

tvalentyn commented 7 months ago

https://stackoverflow.com/questions/71394400/how-to-block-the-hypothesis-pytest-plugin has some discussion of how to disable it.

tvalentyn commented 7 months ago

The hypothesis plugin might be provided by hypothesis itself, and we might need it for tests that use hypothesis. But it looks like Pydantic 2 doesn't work with hypothesis: https://github.com/pydantic/pydantic/discussions/5979 , and somehow pydantic (which I believe we didn't have in our dependency chain before) now interferes with pytest/hypothesis. The fact that an old version of pydantic gets installed due to the --pre flag may or may not be a factor.

tvalentyn commented 7 months ago

The fact that an old version of pydantic gets installed due to the --pre flag may or may not be a factor.

That is the factor. The pydantic-2.0a4 distribution has the following:

(py38b) :py38b$ cat lib/python3.8/site-packages/pydantic-2.0a4.dist-info/entry_points.txt
[hypothesis]
_ = pydantic._hypothesis_plugin
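Per the traceback, the error is raised while hypothesis loads the entry points registered in the 'hypothesis' group. A minimal sketch, using only the standard library, for spotting such dangling entry points in an environment; the helper names are made up for illustration, and this is not how hypothesis itself performs the loading.

```python
import importlib.util
from importlib import metadata


def entry_points_in_group(group):
    """Return the installed entry points registered under `group`."""
    eps = metadata.entry_points()
    if hasattr(eps, "select"):
        # Python 3.10+ exposes a selectable interface.
        return list(eps.select(group=group))
    # Python 3.8/3.9 return a dict keyed by group name.
    return list(eps.get(group, []))


def module_is_importable(module_name):
    """Check whether `module_name` can be located.

    Note: for a dotted name, find_spec imports the parent package.
    """
    try:
        return importlib.util.find_spec(module_name) is not None
    except (ImportError, ValueError):
        return False


def find_broken_entry_points(group="hypothesis"):
    """List (name, value) pairs whose target module cannot be found."""
    broken = []
    for ep in entry_points_in_group(group):
        # An entry point value looks like "package.module" or "package.module:attr".
        module_name = ep.value.split(":", 1)[0].strip()
        if not module_is_importable(module_name):
            broken.append((ep.name, ep.value))
    return broken


if __name__ == "__main__":
    for name, value in find_broken_entry_points():
        print("broken entry point: {} = {}".format(name, value))
```

In an environment like the failing one, this would be expected to report the pydantic._hypothesis_plugin entry, since the distribution registers it but the module is missing.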

tvalentyn commented 7 months ago

One other problem that causes additional failures in this suite is that in the test environment we first install the test environment dependencies (e.g., tensorflow) and then install the Beam package. This has implications for dependency resolution, and pip fails to resolve the conflicts.

We might be able to prevent that if we install both deps in the same command; I asked on https://github.com/tox-dev/tox/issues/2386#issuecomment-2040435212 whether that is possible.
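To see which conflicts remain in an environment after such a two-step install, `pip check` can be run against the same interpreter. A small sketch; the wrapper function `run_pip_check` is made up for illustration and assumes pip is available as a module in that environment.

```python
import subprocess
import sys


def run_pip_check():
    """Run `pip check` for the current interpreter.

    Returns (exit code, combined output). A non-zero exit code means pip
    found missing or incompatible requirements, or pip could not be run.
    """
    proc = subprocess.run(
        [sys.executable, "-m", "pip", "check"],
        stdout=subprocess.PIPE,
        stderr=subprocess.STDOUT,
        universal_newlines=True,
    )
    return proc.returncode, proc.stdout.strip()


if __name__ == "__main__":
    code, report = run_pip_check()
    print(report)
    print("pip check exit code:", code)
```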