tensorflow / models

Models and examples built with TensorFlow

Error running tensorflow from docker container #10461

Open mgmike opened 2 years ago

mgmike commented 2 years ago

I am following the directions of the tf2.md doc. I built the docker image and ran the container successfully.

When running the 'test the integration' line,

python object_detection/builders/model_builder_tf2_test.py

I get the following error message:

Traceback (most recent call last):
  File "object_detection/builders/model_builder_tf2_test.py", line 22, in <module>
    import tensorflow.compat.v1 as tf
  File "/home/tensorflow/.local/lib/python3.6/site-packages/tensorflow/__init__.py", line 438, in <module>
    _ll.load_library(_main_dir)
  File "/home/tensorflow/.local/lib/python3.6/site-packages/tensorflow/python/framework/load_library.py", line 154, in load_library
    py_tf.TF_LoadLibrary(lib)
tensorflow.python.framework.errors_impl.NotFoundError: /usr/local/lib/python3.6/dist-packages/tensorflow/core/kernels/libtfkernel_sobol_op.so: undefined symbol: _ZN10tensorflow8OpKernel11TraceStringEPNS_15OpKernelContextEb
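
The traceback above references two install locations: the Python files load from /home/tensorflow/.local/lib/python3.6/site-packages, while the failing kernel library sits under /usr/local/lib/python3.6/dist-packages. That may mean two TensorFlow copies are visible to the interpreter, though nothing in this thread confirms it. A minimal sketch for listing every tensorflow directory on sys.path without importing the package (the helper name find_package_dirs is hypothetical, not part of TensorFlow):

```python
import os
import sys


def find_package_dirs(package_name, search_paths=None):
    """Return each directory on the search path that contains `package_name`.

    Hypothetical helper for illustration. It only inspects the filesystem,
    so it still works when `import tensorflow` itself fails.
    """
    if search_paths is None:
        search_paths = sys.path
    found = []
    for base in search_paths:
        # An empty sys.path entry means the current working directory.
        if not base:
            base = os.getcwd()
        candidate = os.path.join(base, package_name)
        if os.path.isdir(candidate):
            real = os.path.realpath(candidate)
            if real not in found:  # skip duplicates reached via symlinks
                found.append(real)
    return found


if __name__ == "__main__":
    for path in find_package_dirs("tensorflow"):
        print(path)
```

If this prints more than one path inside the container, the two copies may have been built from different TensorFlow versions, which would be one possible explanation for an undefined-symbol error.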

I am running Ubuntu 18.04 with CUDA 11.2, cuDNN 8.1.1, and driver 460.106.00 for a GTX 1080 Ti. I have not changed anything in the Docker container, so the TensorFlow version is 2.6.2 and the Python version is 3.6.9. I checked the TensorFlow compatibility matrix and everything seems fine, except for my host's gcc version, which is 7.5.0. Could this be the issue? Either way, both of the included TensorFlow version checks fail with the same error.

python -c "import tensorflow as tf; print(tf.GIT_VERSION, tf.VERSION)"
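
Because the check above has to import tensorflow, it fails with the same error. A rough sketch of reading the installed version from package metadata instead, without importing TensorFlow (installed_version is a hypothetical helper name; the pkg_resources fallback assumes setuptools is present in the container's Python 3.6):

```python
try:
    from importlib import metadata as importlib_metadata  # Python 3.8+
except ImportError:
    importlib_metadata = None


def installed_version(dist_name):
    """Return the installed version string for `dist_name`, or None if absent."""
    if importlib_metadata is not None:
        try:
            return importlib_metadata.version(dist_name)
        except importlib_metadata.PackageNotFoundError:
            return None
    # Fallback for older interpreters such as Python 3.6 (needs setuptools).
    import pkg_resources
    try:
        return pkg_resources.get_distribution(dist_name).version
    except pkg_resources.DistributionNotFound:
        return None


print(installed_version("tensorflow"))
```

This reports only the first matching distribution found on the search path, so it will not reveal a second copy installed elsewhere.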

I have seen this issue a few times on Stack Overflow and in a TensorFlow GitHub issue. I tried removing the file as suggested in the Stack Overflow answer, but that did nothing. The solution in the GitHub issue was to upgrade, but the Docker container already uses the highest TensorFlow version available for Python 3.6.

sulc commented 2 years ago

Looks like (unresolved) issue #10439