apache / beam

Apache Beam is a unified programming model for Batch and Streaming data processing.
https://beam.apache.org/
Apache License 2.0

[Bug]: Python BQ directrunner postcommits flake with KeyError: '__pyx_vtable__' #29617

Open tvalentyn opened 11 months ago

tvalentyn commented 11 months ago

What happened?

Sample error:

apache_beam/io/gcp/bigquery_read_it_test.py:451: 
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ 
apache_beam/pipeline.py:612: in __exit__
    self.result = self.run()
apache_beam/pipeline.py:559: in run
    return Pipeline.from_runner_api(
apache_beam/pipeline.py:586: in run
    return self.runner.run_pipeline(self, self._options)
apache_beam/runners/direct/test_direct_runner.py:42: in run_pipeline
    self.result = super().run_pipeline(pipeline, options)
apache_beam/runners/direct/direct_runner.py:117: in run_pipeline
    from apache_beam.runners.portability.fn_api_runner import fn_runner
apache_beam/runners/portability/fn_api_runner/__init__.py:18: in <module>
    from apache_beam.runners.portability.fn_api_runner.fn_runner import FnApiRunner
apache_beam/runners/portability/fn_api_runner/fn_runner.py:66: in <module>
    from apache_beam.runners.portability.fn_api_runner import execution
apache_beam/runners/portability/fn_api_runner/execution.py:61: in <module>
    from apache_beam.runners.portability.fn_api_runner import translations
apache_beam/runners/portability/fn_api_runner/translations.py:55: in <module>
    from apache_beam.runners.worker import bundle_processor
apache_beam/runners/worker/bundle_processor.py:69: in <module>
    from apache_beam.runners.worker import operations
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ 

>   ???
E   KeyError: '__pyx_vtable__'

apache_beam/runners/worker/operations.py:1: KeyError

Sample error: https://ci-beam.apache.org/job/beam_PostCommit_Python39/2615/testReport/junit/apache_beam.io.gcp.bigquery_read_it_test/ReadUsingStorageApiTests/test_iobase_source_2/

Error seems to be fairly frequent.

Next steps: try to repro locally and understand what started to trigger it.
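
For a local repro, one cheap first check is whether the failing import flakes on its own, outside the test suite. Below is a minimal sketch. It assumes, without having verified it, that the error can be triggered just by importing `apache_beam.runners.worker.operations` in a fresh interpreter. `count_import_failures` is a hypothetical helper, not part of Beam.

```python
import subprocess
import sys


def count_import_failures(module, attempts=20, python=sys.executable):
    """Import `module` in `attempts` fresh interpreters and collect failures.

    Returns a list of (attempt_index, last_stderr_line) for each failed import.
    """
    failures = []
    for attempt in range(attempts):
        proc = subprocess.run(
            [python, "-c", f"import {module}"],
            capture_output=True,
            text=True,
        )
        if proc.returncode != 0:
            stderr_lines = proc.stderr.strip().splitlines()
            failures.append((attempt, stderr_lines[-1] if stderr_lines else ""))
    return failures
```

Calling `count_import_failures("apache_beam.runners.worker.operations")` in the same virtualenv the postcommit uses would show whether the import alone is flaky. An empty list would suggest the trigger lies elsewhere, for example in what the test process has already imported.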

Issue Priority

Priority: 2 (default / most bugs should be filed as P2)

Issue Components

riteshghorse commented 11 months ago

Failing run - https://ci-beam.apache.org/job/beam_PostCommit_Python39/2615/consoleText

riteshghorse commented 11 months ago

The KeyError surely comes from the generated Cython code, but the test seems flaky.

It failed in #2607 but passed in #2608
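
To confirm that a given environment really loads the compiled module rather than the `.py` source, something like the following could be run in the test virtualenv. This is a sketch using only the standard library; `module_kind` is a made-up name for illustration.

```python
import importlib.machinery
import importlib.util


def module_kind(name):
    """Classify how `name` would be loaded: 'extension', 'source', 'other' or 'missing'."""
    try:
        # For dotted names, find_spec imports the parent packages first.
        spec = importlib.util.find_spec(name)
    except ModuleNotFoundError:
        return "missing"
    if spec is None:
        return "missing"
    origin = spec.origin or ""
    if origin.endswith(tuple(importlib.machinery.EXTENSION_SUFFIXES)):
        return "extension"  # compiled, e.g. a Cython-built .so
    if origin.endswith(".py"):
        return "source"
    return "other"  # built-in, frozen, namespace package, ...
```

For example, `module_kind("apache_beam.runners.worker.operations")` reporting `'extension'` in failing runs and `'source'` in passing ones (or the other way around) would be a useful signal.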

riteshghorse commented 11 months ago

Observing the same flakiness on Python 3.8 - https://ci-beam.apache.org/job/beam_PostCommit_Python38/4897/consoleText

damccorm commented 11 months ago

This seems related to https://issues.apache.org/jira/browse/FLINK-28786?page=com.atlassian.jira.plugin.system.issuetabpanels%3Aall-tabpanel - maybe it is a consequence of the underlying environment? I haven't seen this fail in GHA yet - https://github.com/apache/beam/actions/workflows/beam_PostCommit_Python.yml - so I may defer this until https://github.com/apache/beam/issues/29684 is fixed, to get a better signal.

tvalentyn commented 5 months ago

> understand what started to trigger it.

One possibility is that we started to use wheels in our integration tests, and perhaps the built wheel is not compatible with the GHA platform, is not built reliably, or is somehow broken.

To investigate further we could:

1) Create a PR that limits the postcommit suite to only direct runner tests. This should be a modification to the Gradle files in https://github.com/apache/beam/blob/e764fc9c17dbc575d4a16b5e6454f4d38ef752be/build.gradle.kts#L506 , and the modification will be picked up by the GHA runner when a non-committer modifies the trigger file: https://github.com/apache/beam/blob/master/.github/trigger_files/beam_PostCommit_Python.json

Then, try to repro the error with a shorter feedback loop on GHA or maybe even repro locally.

2) It would help to print or inspect the SHA of the created wheel, and see whether it stays consistent between passing and failing runs of this test when there are no other changes to the code.

3) We could see if the error disappears if we use sources instead of wheels. For example, we can avoid this branch that builds the wheel: https://github.com/apache/beam/blob/e764fc9c17dbc575d4a16b5e6454f4d38ef752be/buildSrc/src/main/groovy/org/apache/beam/gradle/BeamModulePlugin.groovy#L3066
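
For step 2, a small script along these lines could print the digests. It is a sketch using only the standard library, and the function names are hypothetical. The hash of the whole wheel may differ between builds for benign reasons (for example timestamps stored in the zip archive), so the sketch also hashes each compiled extension inside the wheel separately.

```python
import hashlib
import zipfile


def file_sha256(path, chunk_size=1 << 20):
    """Return the hex SHA-256 digest of the file at `path`."""
    digest = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def extension_sha256s(wheel_path, suffixes=(".so", ".pyd")):
    """Map each compiled extension inside the wheel to its SHA-256 digest."""
    result = {}
    with zipfile.ZipFile(wheel_path) as wheel:
        for name in sorted(wheel.namelist()):
            if name.endswith(suffixes):
                result[name] = hashlib.sha256(wheel.read(name)).hexdigest()
    return result
```

Logging `file_sha256(path)` and `extension_sha256s(path)` for the wheel in both a passing and a failing run would show whether the build output differs, and in which modules.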

Note that the logs below also say something about "repairing" the wheel, although it might be a red herring.

INFO:auditwheel.main_repair:Repairing apache_beam-2.57.0.dev0-cp39-cp39-linux_x86_64.whl
#9 156.8   /tmp/pip-build-env-x6bammmu/overlay/lib/python3.9/site-packages/Cython/Compiler/Main.py:369: FutureWarning: Cython directive 'language_level' not set, using 2 for now (Py2). This will change in a later release! File: /tmp/pip-req-build-i5vxycdj/apache_beam/runners/worker/operations.pxd
#9 156.8     tree = Parsing.p_module(s, pxd, full_module_name)
INFO:auditwheel.wheeltools:Previous filename tags: linux_x86_64
INFO:auditwheel.wheeltools:New filename tags: manylinux_2_17_x86_64, manylinux2014_x86_64
INFO:auditwheel.wheeltools:Previous WHEEL info tags: cp39-cp39-linux_x86_64
INFO:auditwheel.wheeltools:New WHEEL info tags: cp39-cp39-manylinux_2_17_x86_64, cp39-cp39-manylinux2014_x86_64
#9 158.4   /tmp/pip-build-env-x6bammmu/overlay/lib/python3.9/site-packages/Cython/Compiler/Main.py:369: FutureWarning: Cython directive 'language_level' not set, using 2 for now (Py2). This will change in a later release! File: /tmp/pip-req-build-i5vxycdj/apache_beam/runners/worker/statesampler_fast.pxd
#9 158.4     tree = Parsing.p_module(s, pxd, full_module_name)
#9 159.0   /tmp/pip-build-env-x6bammmu/overlay/lib/python3.9/site-packages/Cython/Compiler/Main.py:369: FutureWarning: Cython directive 'language_level' not set, using 2 for now (Py2). This will change in a later release! File: /tmp/pip-req-build-i5vxycdj/apache_beam/testing/fast_test_utils.pxd
#9 159.0     tree = Parsing.p_module(s, pxd, full_module_name)
#9 159.1   /tmp/pip-build-env-x6bammmu/overlay/lib/python3.9/site-packages/Cython/Compiler/Main.py:369: FutureWarning: Cython directive 'language_level' not set, using 2 for now (Py2). This will change in a later release! File: /tmp/pip-req-build-i5vxycdj/apache_beam/transforms/cy_combiners.pxd
#9 159.1     tree = Parsing.p_module(s, pxd, full_module_name)
#9 159.7   /tmp/pip-build-env-x6bammmu/overlay/lib/python3.9/site-packages/Cython/Compiler/Main.py:369: FutureWarning: Cython directive 'language_level' not set, using 2 for now (Py2). This will change in a later release! File: /tmp/pip-req-build-i5vxycdj/apache_beam/transforms/cy_dataflow_distribution_counter.pxd
#9 159.7     tree = Parsing.p_module(s, pxd, full_module_name)
#9 159.8   /tmp/pip-build-env-x6bammmu/overlay/lib/python3.9/site-packages/Cython/Compiler/Main.py:369: FutureWarning: Cython directive 'language_level' not set, using 2 for now (Py2). This will change in a later release! File: /tmp/pip-req-build-i5vxycdj/apache_beam/transforms/stats.pxd
#9 159.8     tree = Parsing.p_module(s, pxd, full_module_name)
#9 160.1   /tmp/pip-build-env-x6bammmu/overlay/lib/python3.9/site-packages/Cython/Compiler/Main.py:369: FutureWarning: Cython directive 'language_level' not set, using 2 for now (Py2). This will change in a later release! File: /tmp/pip-req-build-i5vxycdj/apache_beam/utils/counters.pxd
#9 160.1     tree = Parsing.p_module(s, pxd, full_module_name)
INFO:auditwheel.main_repair:
Fixed-up wheel written to /tmp/cibuildwheel/repaired_wheel/apache_beam-2.57.0.dev0-cp39-cp39-manylinux_2_17_x86_64.manylinux2014_x86_64.whl
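
To check the platform-compatibility theory, the tags recorded inside the repaired wheel can be compared with what the logs above report. A minimal sketch, assuming the wheel follows the standard layout with a single `*.dist-info/WHEEL` metadata file containing `Tag:` lines; `wheel_tags` is a hypothetical helper.

```python
import zipfile


def wheel_tags(wheel_path):
    """Return the compatibility tags listed in the wheel's dist-info/WHEEL file."""
    with zipfile.ZipFile(wheel_path) as wheel:
        meta = [n for n in wheel.namelist() if n.endswith(".dist-info/WHEEL")]
        if len(meta) != 1:
            raise ValueError(f"expected one WHEEL metadata file, found {len(meta)}")
        text = wheel.read(meta[0]).decode("utf-8")
    # Each tag is recorded on its own "Tag: <python>-<abi>-<platform>" line.
    return [
        line.split(":", 1)[1].strip()
        for line in text.splitlines()
        if line.startswith("Tag:")
    ]
```

If the repair step worked as logged, the result for the wheel above should list the `manylinux` tags rather than `linux_x86_64`.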