NVIDIA / tensorflow

An Open Source Machine Learning Framework for Everyone
https://developer.nvidia.com/deep-learning-frameworks
Apache License 2.0

r1.15.5+nv21.10 cannot be built with --config=mkl #41

Open ziyuang opened 2 years ago

ziyuang commented 2 years ago

System information

Describe the problem

Error message:

./tensorflow/core/kernels/quantization_utils.h:725:43:   required from here
external/eigen_archive/unsupported/Eigen/CXX11/src/Tensor/TensorExecutor.h:90:54: error: static assertion failed: Default executor instantiated with non-default device. You must #define EIGEN_USE_THREADS, EIGEN_USE_GPU or EIGEN_USE_SYCL before including Eigen headers.
   90 |   static_assert(std::is_same<Device, DefaultDevice>::value,
      |                                                      ^~~~~
Target //tensorflow/tools/pip_package:build_pip_package failed to build

Provide the exact sequence of commands / steps that you executed before running into the problem

Use the Dockerfile below:

FROM nvidia/cuda:11.4.2-cudnn8-devel-ubuntu20.04
ENV TZ=Europe/London
RUN ln -snf /usr/share/zoneinfo/$TZ /etc/localtime && echo $TZ >/etc/timezone && apt-get update && apt-get -y upgrade && apt-get install -y build-essential git git-lfs wget vim software-properties-common unzip python3-pip && update-alternatives --install /usr/bin/python python $(which python3) 10
RUN pip install --upgrade numpy astor
WORKDIR /workdir
RUN BAZEL=bazel-0.26.1-installer-linux-x86_64.sh && wget https://github.com/bazelbuild/bazel/releases/download/0.26.1/${BAZEL} && chmod +x ${BAZEL} && ./${BAZEL} && git clone https://github.com/NVIDIA/cudnn-frontend.git && git clone --branch r1.15.5+nv21.10 --single-branch https://github.com/NVIDIA/tensorflow.git
WORKDIR /workdir/tensorflow
ENV TF_ENABLE_XLA=1 \
    TF_NEED_OPENCL_SYCL=0 \
    TF_NEED_ROCM=0 \
    TF_NEED_CUDA=1 \
    TF_NEED_TENSORRT=0 \
    TF_CUDA_VERSION=11 \
    TF_CUBLAS_VERSION=11 \
    TF_NCCL_VERSION=2 \
    TF_CUDNN_VERSION=8 \
    TF_CUDA_PATHS="/usr/include,/usr/lib/x86_64-linux-gnu,/usr/local/cuda/include,/usr/local/cuda/lib64,/usr/local/cuda/bin,/usr/local/cuda" \
    TF_CUDA_COMPUTE_CAPABILITIES=3.5,5.0,5.2,6.1,7.0,7.5,8.6 \
    CC_OPT_FLAGS="-march=sandybridge -mfma -mfpmath=both"
RUN PYTHON_BIN_PATH=$(which python) ./configure
RUN bazel build --config=opt --config=noaws --config=mkl --config=nogcp --config=nohdfs --config=noignite --config=nokafka //tensorflow/tools/pip_package:build_pip_package

Any other info / logs

Full log of the last RUN: https://1drv.ms/t/s!Ao-GP3hGG9a9gvF--rQYH7LUlYnh-w?e=rZfQ3h

ziyuang commented 2 years ago

Adding -DEIGEN_USE_THREADS to CC_OPT_FLAGS does not make the error go away.
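
One way to narrow this down is to check whether the define was actually recorded by ./configure. The sketch below assumes that configure writes each entry of CC_OPT_FLAGS into .tf_configure.bazelrc as a `--copt=` line; that file name and line format are assumptions about this branch and have not been verified here. `has_copt` is a hypothetical helper name, not part of TensorFlow or Bazel.

```shell
# Hypothetical helper: succeed if FILE contains a --copt= entry for FLAG.
# Assumes ./configure records CC_OPT_FLAGS as "--copt=<flag>" lines in
# .tf_configure.bazelrc (an assumption, not confirmed for this branch).
# Usage: has_copt FLAG [FILE]
has_copt() {
  flag="$1"
  rc="${2:-.tf_configure.bazelrc}"
  # -F matches a fixed string, -e keeps grep from parsing the pattern
  # (which starts with "--") as an option, -q suppresses output.
  grep -qF -e "--copt=${flag}" "$rc"
}

# Example, run from the TensorFlow source root after ./configure:
# has_copt -DEIGEN_USE_THREADS && echo "flag recorded" || echo "flag missing"
```

If the flag is recorded and the static assertion still fires, the define is probably not the whole story for the MKL code path. That is only a guess based on the error text above.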