NVIDIA / FasterTransformer

Transformer related optimization, including BERT, GPT
Apache License 2.0

build failed with tf-op #701

Open jackzhou121 opened 1 year ago

jackzhou121 commented 1 year ago

Branch/Tag/Commit

v5.3

Docker Image Version

nvcr.io/nvidia/pytorch:22.12-py3

GPU name

T4

CUDA Driver

NVIDIA-SMI 470.57.02 Driver Version: 470.57.02 CUDA Version: 11.8

Reproduced Steps

build with "cmake -DCMAKE_BUILD_TYPE=Release -DBUILD_MULTI_GPU=ON -DBUILD_TF2=ON -DTF_PATH=/usr/local/lib/python3.8/dist-packages/tensorflow/ .." and make
error message:
/workspace/FasterTransformer/src/fastertransformer/kernels/disentangled_attention_kernels.cu(382): error: more than one operator "=" matches these operands:
            function "__nv_bfloat16::operator=(float)"
/usr/local/cuda/include/cuda_bf16.hpp(178): here
            function "__nv_bfloat16::operator=(double)"
/usr/local/cuda/include/cuda_bf16.hpp(181): here
            operand types are: __nv_bfloat16 = int
          detected during instantiation of "void fastertransformer::disentangled_attention_kernel<TDataType,tTileSize,tBlockDimY>(TDataType *, TDataType *, const TDataType *, const TDataType *, int32_t, int32_t, int32_t) [with TDataType=__nv_bfloat16, tTileSize=32, tBlockDimY=8]"
(407): here

/workspace/FasterTransformer/src/fastertransformer/kernels/disentangled_attention_kernels.cu(382): error: more than one operator "=" matches these operands:
            function "__nv_bfloat16::operator=(float)"
/usr/local/cuda/include/cuda_bf16.hpp(178): here
            function "__nv_bfloat16::operator=(double)"
/usr/local/cuda/include/cuda_bf16.hpp(181): here
            operand types are: __nv_bfloat16 = int
          detected during instantiation of "void fastertransformer::disentangled_attention_kernel<TDataType,tTileSize,tBlockDimY>(TDataType *, TDataType *, const TDataType *, const TDataType *, int32_t, int32_t, int32_t) [with TDataType=__nv_bfloat16, tTileSize=64, tBlockDimY=4]"
(407): here

2 errors detected in the compilation of "/workspace/FasterTransformer/src/fastertransformer/kernels/disentangled_attention_kernels.cu".
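
For reference, the ambiguity is easy to reproduce outside FasterTransformer. The sketch below is illustrative only (the file and kernel names are invented) and assumes a cuda_bf16.hpp, as in this CUDA 11.8 container, where __nv_bfloat16 defines assignment from float and from double:

```cuda
// repro_bf16_assign.cu -- illustrative only, not part of the repository
#include <cuda_bf16.h>

__global__ void assign_repro_kernel(__nv_bfloat16* out)
{
    int i = 3;
    __nv_bfloat16 v;

    // v = i;   // error: int converts equally well to float and to double, so both
    //          // __nv_bfloat16::operator=(float) and __nv_bfloat16::operator=(double)
    //          // are viable -> "more than one operator '=' matches these operands"

    v = static_cast<double>(i);  // unambiguous: exact match for operator=(double)
    out[0] = v;
}
```

The diagnostic above shows the same situation: on this setup the right-hand side of the assignment at line 382 resolves to int ("operand types are: __nv_bfloat16 = int"), so the compiler faces the same ambiguous choice between the float and double assignment operators.
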
llsj14 commented 7 months ago

@jackzhou121,

I had the same issue and resolved it with the patch below. The cast turns the right-hand side of the assignment into a double, so the compiler has exactly one matching overload, __nv_bfloat16::operator=(double), instead of an ambiguous choice between the float and double overloads.

--- a/src/fastertransformer/kernels/disentangled_attention_kernels.cu
+++ b/src/fastertransformer/kernels/disentangled_attention_kernels.cu
@@ -379,7 +379,7 @@ __global__ void disentangled_attention_kernel(TDataType*       result,
 #ifdef ENABLE_BF16
         else if constexpr (std::is_same<TDataType, __nv_bfloat16>::value) {
             // bf16
-            res = __hadd(res0, __hadd(res1, T[threadIdx.x][ty + threadIdx.y]));
+            res = static_cast<double>(__hadd(res0, __hadd(res1, T[threadIdx.x][ty + threadIdx.y])));
         }
 #endif
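
For completeness: casting the same expression to float (static_cast<float>(...)) should resolve the ambiguity just as well, since __nv_bfloat16::operator=(float) then becomes the single exact match; the patch above simply selects the double overload instead. Either way, the chosen assignment operator converts the value back to bf16 when storing it.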