google-deepmind / launchpad

Apache License 2.0

Issue with building LaunchPad with TensorFlow 2.14 #44

Open ethanluoyc opened 11 months ago

ethanluoyc commented 11 months ago

Hi,

When I build Launchpad from source against TensorFlow 2.14, importing it fails with a new error. This was not a problem when building against TensorFlow 2.13.

Traceback (most recent call last):
  File "", line 1, in <module>
  File "/home/yicheng/projects/launchpad/.venv/lib/python3.9/site-packages/tensorflow/__init__.py", line 38, in <module>
    from tensorflow.python.tools import module_util as _module_util
  File "/home/yicheng/projects/launchpad/.venv/lib/python3.9/site-packages/tensorflow/python/__init__.py", line 42, in <module>
    from tensorflow.python.saved_model import saved_model
  File "/home/yicheng/projects/launchpad/.venv/lib/python3.9/site-packages/tensorflow/python/saved_model/saved_model.py", line 20, in <module>
    from tensorflow.python.saved_model import builder
  File "/home/yicheng/projects/launchpad/.venv/lib/python3.9/site-packages/tensorflow/python/saved_model/builder.py", line 23, in <module>
    from tensorflow.python.saved_model.builder_impl import _SavedModelBuilder
  File "/home/yicheng/projects/launchpad/.venv/lib/python3.9/site-packages/tensorflow/python/saved_model/builder_impl.py", line 27, in <module>
    from tensorflow.python.framework import ops
  File "/home/yicheng/projects/launchpad/.venv/lib/python3.9/site-packages/tensorflow/python/framework/ops.py", line 43, in <module>
    from tensorflow.python.client import pywrap_tf_session
  File "/home/yicheng/projects/launchpad/.venv/lib/python3.9/site-packages/tensorflow/python/client/pywrap_tf_session.py", line 25, in <module>
    from tensorflow.python.util import tf_stack
  File "/home/yicheng/projects/launchpad/.venv/lib/python3.9/site-packages/tensorflow/python/util/tf_stack.py", line 22, in <module>
    from tensorflow.python.util import _tf_stack
ImportError: generic_type: cannot initialize type "StatusCode": an object with that name is already defined

It looks like, with TensorFlow 2.14, importing both TensorFlow and Launchpad in the same process (in either order) causes the above error.
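For anyone who wants to check this on their own build, here is a minimal sketch of a reproduction script. It assumes the two packages are importable as `tensorflow` and `launchpad`; the helper name `try_import_order` is made up for illustration. Each order runs in a fresh interpreter so that one attempt cannot affect the other.

```python
import subprocess
import sys


def try_import_order(modules):
    """Import `modules` in order in a fresh interpreter.

    Returns (ok, last_stderr_line). This is a hypothetical helper written
    for this reproduction; it is not part of Launchpad or TensorFlow.
    """
    code = "; ".join(f"import {name}" for name in modules)
    # Use a new process per attempt so that extension modules loaded by an
    # earlier attempt cannot influence the result of a later one.
    proc = subprocess.run(
        [sys.executable, "-c", code],
        capture_output=True,
        text=True,
    )
    stderr_lines = proc.stderr.strip().splitlines()
    last_line = stderr_lines[-1] if stderr_lines else ""
    return proc.returncode == 0, last_line


if __name__ == "__main__":
    # Assumed import names: `tensorflow` and `launchpad`.
    for order in (["tensorflow", "launchpad"], ["launchpad", "tensorflow"]):
        ok, message = try_import_order(order)
        label = " then ".join(order)
        if ok:
            print(f"{label}: ok")
        else:
            # On failure, the last stderr line is normally the exception.
            print(f"{label}: FAILED ({message})")
```

If the build is affected, I would expect both orders to report the `ImportError` shown above; if only one of the two packages is installed, the script reports a `ModuleNotFoundError` instead.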

ethanluoyc commented 10 months ago

We have identified a potential cause of the problem with TF 2.14.0. See the detailed discussion at https://github.com/pybind/pybind11_abseil/issues/12.

In conclusion, there are some ABI incompatibilities that may be interfering with some behavior in pybind11_abseil, because Launchpad is currently built with the gcc toolchain in //third_party/toolchains while tensorflow>=2.13.0 is now built with clang. This was not an issue with TensorFlow 2.13.*, even though it is already built with clang, because the incompatible part was only added to the TensorFlow Python package in 2.14.0.

The solution may be to upgrade the manylinux2014 config to use the clang in the tensorflow/build Docker image. I am not very familiar with Bazel, and after some experimentation I found configuring the toolchain quite difficult. Hopefully, someone from the Launchpad team can come in and fix this :)