ROCm / tensorflow-upstream

TensorFlow ROCm port
https://tensorflow.org
Apache License 2.0

Compiling from source leads to ImportError with undefined symbol #216

Closed rfgil closed 5 years ago

rfgil commented 5 years ago

System information

I think the issue I'm about to describe qualifies as a bug. I tried StackOverflow first, but got no response there.

I am unable to build tensorflow-rocm from sources in the system described above. I've tried different branches (r1.8-rocm, r1.8-rocm-centos, r1.11-rocm, ...) and all fail with the same error:

ImportError: /homelocal/rfgillocal/.cache/bazel/_bazel_rfgillocal/ea4f3f9ea1629f214818602553d90693/execroot/org_tensorflow/bazel-out/host/bin/tensorflow/tools/api/generator/create_python_api.runfiles/org_tensorflow/tensorflow/python/_pywrap_tensorflow_internal.so: undefined symbol: _ZN10tensorflow7functor13BinaryFunctorIN5Eigen9GpuDeviceENS0_13greater_equalIhEELi1ELb0EE5RightERKS3_NS2_9TensorMapINS2_6TensorIbLi1ELi1ElEELi16ENS2_11MakePointerEEENS9_INSA_IKhLi1ELi1ElEELi16ESC_EENS9_INS2_15TensorFixedSizeISE_NS2_5SizesIIEEELi1ElEELi16ESC_EEPb

I tried to make sense of that undefined symbol string, and I think it is related to Eigen and the patches ROCm applies to it. However, I'm unable to find the real problem or make significant progress towards a solution.
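One way to read the mangled name is to run it through a demangler. The following is a minimal sketch, assuming GNU binutils' `c++filt` is installed; the `demangle` helper name is made up for illustration, and the symbol is copied verbatim from the error above.

```shell
#!/bin/sh
# Hypothetical helper: demangle a C++ symbol with c++filt when it is
# available, otherwise print the symbol back unchanged.
demangle() {
  if command -v c++filt >/dev/null 2>&1; then
    printf '%s\n' "$1" | c++filt
  else
    printf '%s\n' "$1"
  fi
}

# Symbol copied from the ImportError message.
sym='_ZN10tensorflow7functor13BinaryFunctorIN5Eigen9GpuDeviceENS0_13greater_equalIhEELi1ELb0EE5RightERKS3_NS2_9TensorMapINS2_6TensorIbLi1ELi1ElEELi16ENS2_11MakePointerEEENS9_INSA_IKhLi1ELi1ElEELi16ESC_EENS9_INS2_15TensorFixedSizeISE_NS2_5SizesIIEEELi1ElEELi16ESC_EEPb'

demangle "$sym"
```

Read by hand, the leading components of the name decode to `tensorflow::functor::BinaryFunctor<Eigen::GpuDevice, tensorflow::functor::greater_equal<unsigned char>, 1, false>::Right`, which is a GPU-device instantiation of a TensorFlow functor rather than a symbol defined inside Eigen itself.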

I've also tried replacing the eigen version with a newer one, which also led to the same problem.

Here is the full stack trace I'm getting:

ERROR: /homelocal/rfgillocal/src/tensorflow-upstream/tensorflow/tools/api/generator/BUILD:27:1: Executing genrule //tensorflow/tools/api/generator:python_api_gen failed (Exit 1): bash failed: error executing command 
  (cd /homelocal/rfgillocal/.cache/bazel/_bazel_rfgillocal/ea4f3f9ea1629f214818602553d90693/execroot/org_tensorflow && \
  exec env - \
    LD_LIBRARY_PATH=/homelocal/rfgillocal/rocm/lib:/homelocal/rfgillocal/rocm/miopen/lib:/homelocal/rfgillocal/rocm/rocblas/lib:/homelocal/rfgillocal/rocm/rocfft/lib:/homelocal/rfgillocal/rocm/hiprand/lib:/homelocal/rfgillocal/local/lib:/usr/lib64:/usr/lib \
    PATH=/homelocal/rfgillocal/local/bin:/homelocal/rfgillocal/python_env/bin:/opt/rocm/bin:/usr/lib64/qt-3.3/bin:/usr/local/bin:/usr/bin:/usr/local/sbin:/usr/sbin \
    PYTHON_BIN_PATH=/homelocal/rfgillocal/python_env/bin/python \
    PYTHON_LIB_PATH=/homelocal/rfgillocal/python_env/lib/python2.7/site-packages \
    TF_DOWNLOAD_CLANG=0 \
    TF_NEED_CUDA=0 \
    TF_NEED_OPENCL_SYCL=0 \
    TF_NEED_ROCM=1 \
  /bin/bash -c 'source external/bazel_tools/tools/genrule/genrule-setup.sh; bazel-out/host/bin/tensorflow/tools/api/generator/create_python_api bazel-out/k8-opt/genfiles/tensorflow/tools/api/generator/api/__init__.py bazel-out/k8-opt/genfiles/tensorflow/tools/api/generator/api/app/__init__.py bazel-out/k8-opt/genfiles/tensorflow/tools/api/generator/api/bitwise/__init__.py bazel-out/k8-opt/genfiles/tensorflow/tools/api/generator/api/compat/__init__.py bazel-out/k8-opt/genfiles/tensorflow/tools/api/generator/api/contrib/__init__.py bazel-out/k8-opt/genfiles/tensorflow/tools/api/generator/api/contrib/stat_summarizer/__init__.py bazel-out/k8-opt/genfiles/tensorflow/tools/api/generator/api/data/__init__.py bazel-out/k8-opt/genfiles/tensorflow/tools/api/generator/api/distributions/__init__.py bazel-out/k8-opt/genfiles/tensorflow/tools/api/generator/api/distributions/bijectors/__init__.py bazel-out/k8-opt/genfiles/tensorflow/tools/api/generator/api/errors/__init__.py bazel-out/k8-opt/genfiles/tensorflow/tools/api/generator/api/estimator/__init__.py bazel-out/k8-opt/genfiles/tensorflow/tools/api/generator/api/estimator/export/__init__.py bazel-out/k8-opt/genfiles/tensorflow/tools/api/generator/api/estimator/inputs/__init__.py bazel-out/k8-opt/genfiles/tensorflow/tools/api/generator/api/feature_column/__init__.py bazel-out/k8-opt/genfiles/tensorflow/tools/api/generator/api/gfile/__init__.py bazel-out/k8-opt/genfiles/tensorflow/tools/api/generator/api/graph_util/__init__.py bazel-out/k8-opt/genfiles/tensorflow/tools/api/generator/api/image/__init__.py bazel-out/k8-opt/genfiles/tensorflow/tools/api/generator/api/initializers/__init__.py bazel-out/k8-opt/genfiles/tensorflow/tools/api/generator/api/keras/__init__.py bazel-out/k8-opt/genfiles/tensorflow/tools/api/generator/api/keras/activations/__init__.py bazel-out/k8-opt/genfiles/tensorflow/tools/api/generator/api/keras/applications/__init__.py 
bazel-out/k8-opt/genfiles/tensorflow/tools/api/generator/api/keras/applications/densenet/__init__.py bazel-out/k8-opt/genfiles/tensorflow/tools/api/generator/api/keras/applications/inception_resnet_v2/__init__.py bazel-out/k8-opt/genfiles/tensorflow/tools/api/generator/api/keras/applications/inception_v3/__init__.py bazel-out/k8-opt/genfiles/tensorflow/tools/api/generator/api/keras/applications/mobilenet/__init__.py bazel-out/k8-opt/genfiles/tensorflow/tools/api/generator/api/keras/applications/nasnet/__init__.py bazel-out/k8-opt/genfiles/tensorflow/tools/api/generator/api/keras/applications/resnet50/__init__.py bazel-out/k8-opt/genfiles/tensorflow/tools/api/generator/api/keras/applications/vgg16/__init__.py bazel-out/k8-opt/genfiles/tensorflow/tools/api/generator/api/keras/applications/vgg19/__init__.py bazel-out/k8-opt/genfiles/tensorflow/tools/api/generator/api/keras/applications/xception/__init__.py bazel-out/k8-opt/genfiles/tensorflow/tools/api/generator/api/keras/backend/__init__.py bazel-out/k8-opt/genfiles/tensorflow/tools/api/generator/api/keras/callbacks/__init__.py bazel-out/k8-opt/genfiles/tensorflow/tools/api/generator/api/keras/constraints/__init__.py bazel-out/k8-opt/genfiles/tensorflow/tools/api/generator/api/keras/datasets/__init__.py bazel-out/k8-opt/genfiles/tensorflow/tools/api/generator/api/keras/datasets/boston_housing/__init__.py bazel-out/k8-opt/genfiles/tensorflow/tools/api/generator/api/keras/datasets/cifar10/__init__.py bazel-out/k8-opt/genfiles/tensorflow/tools/api/generator/api/keras/datasets/cifar100/__init__.py bazel-out/k8-opt/genfiles/tensorflow/tools/api/generator/api/keras/datasets/fashion_mnist/__init__.py bazel-out/k8-opt/genfiles/tensorflow/tools/api/generator/api/keras/datasets/imdb/__init__.py bazel-out/k8-opt/genfiles/tensorflow/tools/api/generator/api/keras/datasets/mnist/__init__.py bazel-out/k8-opt/genfiles/tensorflow/tools/api/generator/api/keras/datasets/reuters/__init__.py 
bazel-out/k8-opt/genfiles/tensorflow/tools/api/generator/api/keras/estimator/__init__.py bazel-out/k8-opt/genfiles/tensorflow/tools/api/generator/api/keras/initializers/__init__.py bazel-out/k8-opt/genfiles/tensorflow/tools/api/generator/api/keras/layers/__init__.py bazel-out/k8-opt/genfiles/tensorflow/tools/api/generator/api/keras/losses/__init__.py bazel-out/k8-opt/genfiles/tensorflow/tools/api/generator/api/keras/metrics/__init__.py bazel-out/k8-opt/genfiles/tensorflow/tools/api/generator/api/keras/models/__init__.py bazel-out/k8-opt/genfiles/tensorflow/tools/api/generator/api/keras/optimizers/__init__.py bazel-out/k8-opt/genfiles/tensorflow/tools/api/generator/api/keras/preprocessing/__init__.py bazel-out/k8-opt/genfiles/tensorflow/tools/api/generator/api/keras/preprocessing/image/__init__.py bazel-out/k8-opt/genfiles/tensorflow/tools/api/generator/api/keras/preprocessing/sequence/__init__.py bazel-out/k8-opt/genfiles/tensorflow/tools/api/generator/api/keras/preprocessing/text/__init__.py bazel-out/k8-opt/genfiles/tensorflow/tools/api/generator/api/keras/regularizers/__init__.py bazel-out/k8-opt/genfiles/tensorflow/tools/api/generator/api/keras/utils/__init__.py bazel-out/k8-opt/genfiles/tensorflow/tools/api/generator/api/keras/wrappers/__init__.py bazel-out/k8-opt/genfiles/tensorflow/tools/api/generator/api/keras/wrappers/scikit_learn/__init__.py bazel-out/k8-opt/genfiles/tensorflow/tools/api/generator/api/layers/__init__.py bazel-out/k8-opt/genfiles/tensorflow/tools/api/generator/api/linalg/__init__.py bazel-out/k8-opt/genfiles/tensorflow/tools/api/generator/api/logging/__init__.py bazel-out/k8-opt/genfiles/tensorflow/tools/api/generator/api/losses/__init__.py bazel-out/k8-opt/genfiles/tensorflow/tools/api/generator/api/manip/__init__.py bazel-out/k8-opt/genfiles/tensorflow/tools/api/generator/api/math/__init__.py bazel-out/k8-opt/genfiles/tensorflow/tools/api/generator/api/metrics/__init__.py 
bazel-out/k8-opt/genfiles/tensorflow/tools/api/generator/api/nn/__init__.py bazel-out/k8-opt/genfiles/tensorflow/tools/api/generator/api/nn/rnn_cell/__init__.py bazel-out/k8-opt/genfiles/tensorflow/tools/api/generator/api/profiler/__init__.py bazel-out/k8-opt/genfiles/tensorflow/tools/api/generator/api/python_io/__init__.py bazel-out/k8-opt/genfiles/tensorflow/tools/api/generator/api/resource_loader/__init__.py bazel-out/k8-opt/genfiles/tensorflow/tools/api/generator/api/saved_model/__init__.py bazel-out/k8-opt/genfiles/tensorflow/tools/api/generator/api/saved_model/builder/__init__.py bazel-out/k8-opt/genfiles/tensorflow/tools/api/generator/api/saved_model/constants/__init__.py bazel-out/k8-opt/genfiles/tensorflow/tools/api/generator/api/saved_model/loader/__init__.py bazel-out/k8-opt/genfiles/tensorflow/tools/api/generator/api/saved_model/main_op/__init__.py bazel-out/k8-opt/genfiles/tensorflow/tools/api/generator/api/saved_model/signature_constants/__init__.py bazel-out/k8-opt/genfiles/tensorflow/tools/api/generator/api/saved_model/signature_def_utils/__init__.py bazel-out/k8-opt/genfiles/tensorflow/tools/api/generator/api/saved_model/tag_constants/__init__.py bazel-out/k8-opt/genfiles/tensorflow/tools/api/generator/api/saved_model/utils/__init__.py bazel-out/k8-opt/genfiles/tensorflow/tools/api/generator/api/sets/__init__.py bazel-out/k8-opt/genfiles/tensorflow/tools/api/generator/api/spectral/__init__.py bazel-out/k8-opt/genfiles/tensorflow/tools/api/generator/api/summary/__init__.py bazel-out/k8-opt/genfiles/tensorflow/tools/api/generator/api/sysconfig/__init__.py bazel-out/k8-opt/genfiles/tensorflow/tools/api/generator/api/test/__init__.py bazel-out/k8-opt/genfiles/tensorflow/tools/api/generator/api/train/__init__.py bazel-out/k8-opt/genfiles/tensorflow/tools/api/generator/api/train/queue_runner/__init__.py bazel-out/k8-opt/genfiles/tensorflow/tools/api/generator/api/user_ops/__init__.py')
Traceback (most recent call last):
  File "/homelocal/rfgillocal/.cache/bazel/_bazel_rfgillocal/ea4f3f9ea1629f214818602553d90693/execroot/org_tensorflow/bazel-out/host/bin/tensorflow/tools/api/generator/create_python_api.runfiles/org_tensorflow/tensorflow/tools/api/generator/create_python_api.py", line 26, in <module>
    from tensorflow.python.util import tf_decorator
  File "/homelocal/rfgillocal/.cache/bazel/_bazel_rfgillocal/ea4f3f9ea1629f214818602553d90693/execroot/org_tensorflow/bazel-out/host/bin/tensorflow/tools/api/generator/create_python_api.runfiles/org_tensorflow/tensorflow/python/__init__.py", line 49, in <module>
    from tensorflow.python import pywrap_tensorflow
  File "/homelocal/rfgillocal/.cache/bazel/_bazel_rfgillocal/ea4f3f9ea1629f214818602553d90693/execroot/org_tensorflow/bazel-out/host/bin/tensorflow/tools/api/generator/create_python_api.runfiles/org_tensorflow/tensorflow/python/pywrap_tensorflow.py", line 74, in <module>
    raise ImportError(msg)
ImportError: Traceback (most recent call last):    
  File "/homelocal/rfgillocal/.cache/bazel/_bazel_rfgillocal/ea4f3f9ea1629f214818602553d90693/execroot/org_tensorflow/bazel-out/host/bin/tensorflow/tools/api/generator/create_python_api.runfiles/org_tensorflow/tensorflow/python/pywrap_tensorflow.py", line 58, in <module>
    from tensorflow.python.pywrap_tensorflow_internal import *
  File "/homelocal/rfgillocal/.cache/bazel/_bazel_rfgillocal/ea4f3f9ea1629f214818602553d90693/execroot/org_tensorflow/bazel-out/host/bin/tensorflow/tools/api/generator/create_python_api.runfiles/org_tensorflow/tensorflow/python/pywrap_tensorflow_internal.py", line 28, in <module>
    _pywrap_tensorflow_internal = swig_import_helper()
  File "/homelocal/rfgillocal/.cache/bazel/_bazel_rfgillocal/ea4f3f9ea1629f214818602553d90693/execroot/org_tensorflow/bazel-out/host/bin/tensorflow/tools/api/generator/create_python_api.runfiles/org_tensorflow/tensorflow/python/pywrap_tensorflow_internal.py", line 24, in swig_import_helper
    _mod = imp.load_module('_pywrap_tensorflow_internal', fp, pathname, description)
ImportError: /homelocal/rfgillocal/.cache/bazel/_bazel_rfgillocal/ea4f3f9ea1629f214818602553d90693/execroot/org_tensorflow/bazel-out/host/bin/tensorflow/tools/api/generator/create_python_api.runfiles/org_tensorflow/tensorflow/python/_pywrap_tensorflow_internal.so: undefined symbol: _ZN10tensorflow7functor13BinaryFunctorIN5Eigen9GpuDeviceENS0_13greater_equalIhEELi1ELb0EE5RightERKS3_NS2_9TensorMapINS2_6TensorIbLi1ELi1ElEELi16ENS2_11MakePointerEEENS9_INSA_IKhLi1ELi1ElEELi16ESC_EENS9_INS2_15TensorFixedSizeISE_NS2_5SizesIIEEELi1ElEELi16ESC_EEPb

Failed to load the native TensorFlow runtime.

See https://www.tensorflow.org/install/install_sources#common_installation_problems

for some common reasons and solutions.  Include the entire stack trace
above this error message when asking for help.
Target //tensorflow/tools/pip_package:build_pip_package failed to build

Any help would be highly appreciated.

Thanks in advance!

whchung commented 5 years ago

@rfgil it's unclear to me why this Eigen symbol is missing from your build. Could you try building with the Dockerfile at:

https://github.com/ROCmSoftwarePlatform/tensorflow-upstream/blob/develop-upstream/tensorflow/tools/ci_build/Dockerfile.rocm

whchung commented 5 years ago

@sunway513 I noticed in this ticket that @rfgil is using CentOS instead of Ubuntu. Could you share some insights on building on CentOS?

sunway513 commented 5 years ago

@rfgil did you configure the devtoolset-7 dependency on your system? Please read through the ROCm doc on how to configure the CentOS environment: https://github.com/RadeonOpenCompute/ROCm#centosrhel-7-both-74-and-75-support

rfgil commented 5 years ago

> @rfgil it's unclear to me why this Eigen symbol is missing from your build. Could you try building with the Dockerfile at:
>
> https://github.com/ROCmSoftwarePlatform/tensorflow-upstream/blob/develop-upstream/tensorflow/tools/ci_build/Dockerfile.rocm

The Dockerfile won't allow me to change the source files, which is important to me.

> @rfgil did you configure the devtoolset-7 dependency on your system? Please read through the ROCm doc on how to configure the CentOS environment: https://github.com/RadeonOpenCompute/ROCm#centosrhel-7-both-74-and-75-support

Everything was installed correctly. However, I wasn't enabling the devtoolset before compiling TensorFlow, which was causing the problem. I was able to compile successfully after running:

scl enable devtoolset-7 bash
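A quick sanity check before starting the build can catch this situation. The sketch below relies on devtoolset-7 shipping GCC 7.x, so a `gcc` major version below 7 suggests the collection is not enabled in the current shell. The helper names `major_of` and `check_toolchain` are hypothetical and do not come from the ROCm or TensorFlow tooling.

```shell
#!/bin/sh
# Extract the major version from a dotted version string, e.g. 7.3.1 -> 7.
major_of() {
  printf '%s\n' "${1%%.*}"
}

# Report whether the gcc found on PATH looks like the devtoolset-7 compiler.
check_toolchain() {
  if ! command -v gcc >/dev/null 2>&1; then
    echo "gcc not found on PATH"
    return 1
  fi
  major="$(major_of "$(gcc -dumpversion)")"
  case "$major" in
    ''|*[!0-9]*)
      echo "could not parse gcc version"
      return 1
      ;;
  esac
  if [ "$major" -ge 7 ]; then
    echo "gcc major version $major: looks fine"
  else
    echo "gcc major version $major: run 'scl enable devtoolset-7 bash' first"
    return 1
  fi
}

# Do not abort the calling shell if the check fails.
check_toolchain || true
```

If the check reports an older compiler, start a new shell with the command above and rerun the build from inside it.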

This might be an important step to add to the instructions in the 'TensorFlow ROCm port: Building From Source' README (https://github.com/ROCmSoftwarePlatform/tensorflow-upstream/blob/develop-upstream/rocm_docs/tensorflow-build-from-source.md). I'm sorry for missing it. I had assumed ROCm was correctly installed and ready to use on the shared machine I'm using, which is why I suggest adding that line to the README.

Thank you for your help!