When adding any protobuf library as a dependency to compile_modules_lib, we get a segmentation fault (at least in compile_modules_lib_test) rather than the expected behavior:
Fatal Python error: Segmentation fault
Thread 0x00007f3a20e00640 (most recent call first):
File "/usr/lib/python3.10/threading.py", line 324 in wait
File "/usr/lib/python3.10/threading.py", line 607 in wait
File "/usr/local/lib/python3.10/dist-packages/apache_beam/runners/worker/data_plane.py", line 255 in run
File "/usr/lib/python3.10/threading.py", line 1016 in _bootstrap_inner
File "/usr/lib/python3.10/threading.py", line 973 in _bootstrap
Current thread 0x00007f3ad2b9d1c0 (most recent call first):
File "/usr/local/lib/python3.10/dist-packages/apache_beam/runners/portability/fn_api_runner/fn_runner.py", line 1048 in _run_bundle
File "/usr/local/lib/python3.10/dist-packages/apache_beam/runners/portability/fn_api_runner/fn_runner.py", line 811 in _execute_bundle
File "/usr/local/lib/python3.10/dist-packages/apache_beam/runners/portability/fn_api_runner/fn_runner.py", line 483 in run_stages
File "/usr/local/lib/python3.10/dist-packages/apache_beam/runners/portability/fn_api_runner/fn_runner.py", line 228 in run_via_runner_api
File "/usr/local/lib/python3.10/dist-packages/apache_beam/runners/portability/fn_api_runner/fn_runner.py", line 204 in run_pipeline
File "/usr/local/lib/python3.10/dist-packages/apache_beam/runners/direct/direct_runner.py", line 128 in run_pipeline
File "/usr/local/lib/python3.10/dist-packages/apache_beam/pipeline.py", line 587 in run
File "/usr/local/lib/python3.10/dist-packages/apache_beam/pipeline.py", line 563 in run
File "/usr/local/lib/python3.10/dist-packages/apache_beam/testing/test_pipeline.py", line 115 in run
File "/usr/local/lib/python3.10/dist-packages/apache_beam/pipeline.py", line 613 in __exit__
File "/root/.cache/bazel/_bazel_root/5b63f27bc35a3d0572c069ebf1768159/sandbox/linux-sandbox/8748/execroot/com_google_gematria/bazel-out/k8-fastbuild/bin/gematria/datasets/pipelines/compile_modules_lib_test.runfiles/com_google_gematria/gematria/datasets/pipelines/compile_modules_lib_test.py", line 129 in test_get_bbs
File "/usr/lib/python3.10/unittest/case.py", line 549 in _callTestMethod
File "/usr/lib/python3.10/unittest/case.py", line 591 in run
File "/usr/lib/python3.10/unittest/case.py", line 650 in __call__
File "/usr/lib/python3.10/unittest/suite.py", line 122 in run
File "/usr/lib/python3.10/unittest/suite.py", line 84 in __call__
File "/usr/lib/python3.10/unittest/suite.py", line 122 in run
File "/usr/lib/python3.10/unittest/suite.py", line 84 in __call__
File "/usr/lib/python3.10/unittest/runner.py", line 184 in run
File "/usr/lib/python3.10/unittest/main.py", line 271 in runTests
File "/usr/lib/python3.10/unittest/main.py", line 101 in __init__
File "/usr/local/lib/python3.10/dist-packages/absl/testing/absltest.py", line 2653 in _run_and_get_tests_result
File "/usr/local/lib/python3.10/dist-packages/absl/testing/absltest.py", line 2689 in run_tests
File "/usr/local/lib/python3.10/dist-packages/absl/testing/absltest.py", line 2234 in main_function
File "/usr/local/lib/python3.10/dist-packages/absl/app.py", line 254 in _run_main
File "/usr/local/lib/python3.10/dist-packages/absl/app.py", line 308 in run
File "/usr/local/lib/python3.10/dist-packages/absl/testing/absltest.py", line 2236 in _run_in_app
File "/usr/local/lib/python3.10/dist-packages/absl/testing/absltest.py", line 2131 in main
File "/root/.cache/bazel/_bazel_root/5b63f27bc35a3d0572c069ebf1768159/sandbox/linux-sandbox/8748/execroot/com_google_gematria/bazel-out/k8-fastbuild/bin/gematria/datasets/pipelines/compile_modules_lib_test.runfiles/com_google_gematria/gematria/datasets/pipelines/compile_modules_lib_test.py", line 142 in <module>
Extension modules: google.protobuf.pyext._message, google3.net.proto2.python.internal.cpp._message, apache_beam.coders.stream, grpc._cython.cygrpc, apache_beam.utils.windowed_value, numpy.core._multiarray_umath, numpy.core._multiarray_tests, numpy.linalg._umath_linalg, numpy.fft._pocketfft_internal, numpy.random._common, numpy.random.bit_generator, numpy.random._bounded_integers, numpy.random._mt19937, numpy.random.mtrand, numpy.random._philox, numpy.random._pcg64, numpy.random._sfc64, numpy.random._generator, fastavro._logical_readers, fastavro._schema, zstandard.backend_c, fastavro._read, fastavro._logical_writers, fastavro._validation, fastavro._write, pyarrow.lib, pandas._libs.tslibs.ccalendar, pandas._libs.tslibs.np_datetime, pandas._libs.tslibs.dtypes, pandas._libs.tslibs.base, pandas._libs.tslibs.nattype, pandas._libs.tslibs.timezones, pandas._libs.tslibs.fields, pandas._libs.tslibs.timedeltas, pandas._libs.tslibs.tzconversion, pandas._libs.tslibs.timestamps, pandas._libs.properties, pandas._libs.tslibs.offsets, pandas._libs.tslibs.strptime, pandas._libs.tslibs.parsing, pandas._libs.tslibs.conversion, pandas._libs.tslibs.period, pandas._libs.tslibs.vectorized, pandas._libs.ops_dispatch, pandas._libs.missing, pandas._libs.hashtable, pandas._libs.algos, pandas._libs.interval, pandas._libs.lib, pyarrow._compute, pandas._libs.ops, pandas._libs.hashing, pandas._libs.arrays, pandas._libs.tslib, pandas._libs.sparse, pandas._libs.internals, pandas._libs.indexing, pandas._libs.index, pandas._libs.writers, pandas._libs.join, pandas._libs.window.aggregations, pandas._libs.window.indexers, pandas._libs.reshape, pandas._libs.groupby, pandas._libs.json, pandas._libs.parsers, pandas._libs.testing, apache_beam.coders.coder_impl, apache_beam.transforms.cy_dataflow_distribution_counter, apache_beam.transforms.cy_combiners, charset_normalizer.md, apache_beam.utils.counters, apache_beam.runners.common, apache_beam.transforms.stats, apache_beam.metrics.cells, 
apache_beam.runners.worker.statesampler_fast, apache_beam.metrics.execution, bson._cbson, pymongo._cmessage, pyarrow._parquet, pyarrow._fs, pyarrow._azurefs, pyarrow._hdfs, pyarrow._gcsfs, pyarrow._s3fs, crcmod._crcfunext, regex._regex, apache_beam.runners.worker.opcounters, apache_beam.runners.worker.operations (total: 89)
The main difference between the two setups seems to be which protobuf gets loaded. Without any proto dependency defined in Bazel, we use the system-installed protobuf version. With a proto dependency specified, we use the Bazel-managed protobuf version, which automatically selects the cpp backend. I have not been able to check whether the issue reproduces with the system-installed protobuf and the cpp backend, as that backend is not installed by default.
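To confirm which backend each setup actually loads, something like the following could be run at the top of compile_modules_lib_test. This is only a sketch: the helper name `protobuf_backend` is made up for illustration, and it relies on the `api_implementation` module from protobuf's internal package, which is not a stable public API.

```python
import os


def protobuf_backend():
    """Return the active protobuf backend name, or None if protobuf is not importable.

    Hypothetical helper for debugging; not part of the project.
    """
    try:
        # Internal protobuf module; it reports which implementation was selected.
        from google.protobuf.internal import api_implementation
    except ImportError:
        return None
    return api_implementation.Type()


# The environment variable, if set, requests a specific backend at import time.
print("PROTOCOL_BUFFERS_PYTHON_IMPLEMENTATION =",
      os.environ.get("PROTOCOL_BUFFERS_PYTHON_IMPLEMENTATION"))
print("active protobuf backend:", protobuf_backend())
```

If the Bazel-managed setup reports `cpp`, rerunning the test with `PROTOCOL_BUFFERS_PYTHON_IMPLEMENTATION=python` set in the environment would request the pure-Python backend and could help isolate whether the cpp backend is involved. I have not verified that this avoids the crash.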