google / gematria

Machine learning for machine code.

Segmentation fault with cpp protos in compile_modules_lib #201

Open boomanaiden154 opened 2 months ago

boomanaiden154 commented 2 months ago

Adding any protobuf library as a dependency of compile_modules_lib causes a segmentation fault (at least in compile_modules_lib_test) instead of the expected behavior:

Fatal Python error: Segmentation fault

Thread 0x00007f3a20e00640 (most recent call first):
  File "/usr/lib/python3.10/threading.py", line 324 in wait
  File "/usr/lib/python3.10/threading.py", line 607 in wait
  File "/usr/local/lib/python3.10/dist-packages/apache_beam/runners/worker/data_plane.py", line 255 in run
  File "/usr/lib/python3.10/threading.py", line 1016 in _bootstrap_inner
  File "/usr/lib/python3.10/threading.py", line 973 in _bootstrap

Current thread 0x00007f3ad2b9d1c0 (most recent call first):
  File "/usr/local/lib/python3.10/dist-packages/apache_beam/runners/portability/fn_api_runner/fn_runner.py", line 1048 in _run_bundle
  File "/usr/local/lib/python3.10/dist-packages/apache_beam/runners/portability/fn_api_runner/fn_runner.py", line 811 in _execute_bundle
  File "/usr/local/lib/python3.10/dist-packages/apache_beam/runners/portability/fn_api_runner/fn_runner.py", line 483 in run_stages
  File "/usr/local/lib/python3.10/dist-packages/apache_beam/runners/portability/fn_api_runner/fn_runner.py", line 228 in run_via_runner_api
  File "/usr/local/lib/python3.10/dist-packages/apache_beam/runners/portability/fn_api_runner/fn_runner.py", line 204 in run_pipeline
  File "/usr/local/lib/python3.10/dist-packages/apache_beam/runners/direct/direct_runner.py", line 128 in run_pipeline
  File "/usr/local/lib/python3.10/dist-packages/apache_beam/pipeline.py", line 587 in run
  File "/usr/local/lib/python3.10/dist-packages/apache_beam/pipeline.py", line 563 in run
  File "/usr/local/lib/python3.10/dist-packages/apache_beam/testing/test_pipeline.py", line 115 in run
  File "/usr/local/lib/python3.10/dist-packages/apache_beam/pipeline.py", line 613 in __exit__
  File "/root/.cache/bazel/_bazel_root/5b63f27bc35a3d0572c069ebf1768159/sandbox/linux-sandbox/8748/execroot/com_google_gematria/bazel-out/k8-fastbuild/bin/gematria/datasets/pipelines/compile_modules_lib_test.runfiles/com_google_gematria/gematria/datasets/pipelines/compile_modules_lib_test.py", line 129 in test_get_bbs
  File "/usr/lib/python3.10/unittest/case.py", line 549 in _callTestMethod
  File "/usr/lib/python3.10/unittest/case.py", line 591 in run
  File "/usr/lib/python3.10/unittest/case.py", line 650 in __call__
  File "/usr/lib/python3.10/unittest/suite.py", line 122 in run
  File "/usr/lib/python3.10/unittest/suite.py", line 84 in __call__
  File "/usr/lib/python3.10/unittest/suite.py", line 122 in run
  File "/usr/lib/python3.10/unittest/suite.py", line 84 in __call__
  File "/usr/lib/python3.10/unittest/runner.py", line 184 in run
  File "/usr/lib/python3.10/unittest/main.py", line 271 in runTests
  File "/usr/lib/python3.10/unittest/main.py", line 101 in __init__
  File "/usr/local/lib/python3.10/dist-packages/absl/testing/absltest.py", line 2653 in _run_and_get_tests_result
  File "/usr/local/lib/python3.10/dist-packages/absl/testing/absltest.py", line 2689 in run_tests
  File "/usr/local/lib/python3.10/dist-packages/absl/testing/absltest.py", line 2234 in main_function
  File "/usr/local/lib/python3.10/dist-packages/absl/app.py", line 254 in _run_main
  File "/usr/local/lib/python3.10/dist-packages/absl/app.py", line 308 in run
  File "/usr/local/lib/python3.10/dist-packages/absl/testing/absltest.py", line 2236 in _run_in_app
  File "/usr/local/lib/python3.10/dist-packages/absl/testing/absltest.py", line 2131 in main
  File "/root/.cache/bazel/_bazel_root/5b63f27bc35a3d0572c069ebf1768159/sandbox/linux-sandbox/8748/execroot/com_google_gematria/bazel-out/k8-fastbuild/bin/gematria/datasets/pipelines/compile_modules_lib_test.runfiles/com_google_gematria/gematria/datasets/pipelines/compile_modules_lib_test.py", line 142 in <module>

Extension modules: google.protobuf.pyext._message, google3.net.proto2.python.internal.cpp._message, apache_beam.coders.stream, grpc._cython.cygrpc, apache_beam.utils.windowed_value, numpy.core._multiarray_umath, numpy.core._multiarray_tests, numpy.linalg._umath_linalg, numpy.fft._pocketfft_internal, numpy.random._common, numpy.random.bit_generator, numpy.random._bounded_integers, numpy.random._mt19937, numpy.random.mtrand, numpy.random._philox, numpy.random._pcg64, numpy.random._sfc64, numpy.random._generator, fastavro._logical_readers, fastavro._schema, zstandard.backend_c, fastavro._read, fastavro._logical_writers, fastavro._validation, fastavro._write, pyarrow.lib, pandas._libs.tslibs.ccalendar, pandas._libs.tslibs.np_datetime, pandas._libs.tslibs.dtypes, pandas._libs.tslibs.base, pandas._libs.tslibs.nattype, pandas._libs.tslibs.timezones, pandas._libs.tslibs.fields, pandas._libs.tslibs.timedeltas, pandas._libs.tslibs.tzconversion, pandas._libs.tslibs.timestamps, pandas._libs.properties, pandas._libs.tslibs.offsets, pandas._libs.tslibs.strptime, pandas._libs.tslibs.parsing, pandas._libs.tslibs.conversion, pandas._libs.tslibs.period, pandas._libs.tslibs.vectorized, pandas._libs.ops_dispatch, pandas._libs.missing, pandas._libs.hashtable, pandas._libs.algos, pandas._libs.interval, pandas._libs.lib, pyarrow._compute, pandas._libs.ops, pandas._libs.hashing, pandas._libs.arrays, pandas._libs.tslib, pandas._libs.sparse, pandas._libs.internals, pandas._libs.indexing, pandas._libs.index, pandas._libs.writers, pandas._libs.join, pandas._libs.window.aggregations, pandas._libs.window.indexers, pandas._libs.reshape, pandas._libs.groupby, pandas._libs.json, pandas._libs.parsers, pandas._libs.testing, apache_beam.coders.coder_impl, apache_beam.transforms.cy_dataflow_distribution_counter, apache_beam.transforms.cy_combiners, charset_normalizer.md, apache_beam.utils.counters, apache_beam.runners.common, apache_beam.transforms.stats, apache_beam.metrics.cells, 
apache_beam.runners.worker.statesampler_fast, apache_beam.metrics.execution, bson._cbson, pymongo._cmessage, pyarrow._parquet, pyarrow._fs, pyarrow._azurefs, pyarrow._hdfs, pyarrow._gcsfs, pyarrow._s3fs, crcmod._crcfunext, regex._regex, apache_beam.runners.worker.opcounters, apache_beam.runners.worker.operations (total: 89)

The main difference between the two setups seems to be which protobuf gets picked up. Without any proto dependency defined in Bazel, we use the system-installed protobuf version; with a proto dependency specified, we use the Bazel-managed protobuf version and automatically use the cpp backend. I have not been able to check whether the issue reproduces with the system-installed protobuf and the cpp backend, as that backend is not installed by default.
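To confirm which protobuf copy and backend a given test actually loads, something like the following sketch could be run from inside the test binary. The helper name `protobuf_backend_info` is hypothetical and not part of gematria; it relies only on `google.protobuf.internal.api_implementation.Type()` and the `PROTOCOL_BUFFERS_PYTHON_IMPLEMENTATION` environment variable, and it assumes protobuf may or may not be importable in the environment where it runs.

```python
import importlib
import os


def protobuf_backend_info():
    """Report which protobuf package and backend the interpreter picks up.

    Hypothetical debugging helper: returns a dict with None values for the
    protobuf-specific fields if protobuf cannot be imported.
    """
    info = {
        # Backend requested through the environment, if any. It only takes
        # effect when set before protobuf is first imported.
        "requested": os.environ.get("PROTOCOL_BUFFERS_PYTHON_IMPLEMENTATION"),
        "version": None,
        "backend": None,
        "location": None,
    }
    try:
        protobuf = importlib.import_module("google.protobuf")
        api_implementation = importlib.import_module(
            "google.protobuf.internal.api_implementation"
        )
    except ImportError:
        return info

    info["version"] = getattr(protobuf, "__version__", None)
    # The backend that was actually selected, e.g. "python", "cpp" or "upb".
    info["backend"] = api_implementation.Type()
    # The path helps tell a system install apart from a copy under runfiles.
    info["location"] = getattr(protobuf, "__file__", None)
    return info


if __name__ == "__main__":
    for key, value in protobuf_backend_info().items():
        print(f"{key}: {value}")
```

If the reported backend is `cpp` only when the proto dependency is present, setting `PROTOCOL_BUFFERS_PYTHON_IMPLEMENTATION=python` for the test would be one way to check whether the crash follows the backend rather than the protobuf version. That is a diagnostic step, not a confirmed fix.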