(cuda12_1) root@I19359398490090128f:/hy-tmp# cd fastllm-master/
(cuda12_1) root@I19359398490090128f:/hy-tmp/fastllm-master# mkdir build
(cuda12_1) root@I19359398490090128f:/hy-tmp/fastllm-master# cd build
(cuda12_1) root@I19359398490090128f:/hy-tmp/fastllm-master/build# cmake .. -DUSE_CUDA=ON
-- The CXX compiler identification is GNU 9.4.0
-- Check for working CXX compiler: /usr/bin/c++
-- Check for working CXX compiler: /usr/bin/c++ -- works
-- Detecting CXX compiler ABI info
-- Detecting CXX compiler ABI info - done
-- Detecting CXX compile features
-- Detecting CXX compile features - done
-- USE_CUDA: ON
-- USE_TFACC: OFF
-- For legacy CUDA GPUs: OFF
-- PYTHON_API: OFF
-- BUILD_CLI: OFF
-- USE_SENTENCEPIECE: OFF
-- USE_IVCOREX: OFF
-- CMAKE_CXX_FLAGS: -pthread --std=c++17 -O2 -march=native
-- The CUDA compiler identification is NVIDIA 12.1.105
-- Check for working CUDA compiler: /usr/local/cuda/bin/nvcc
-- Check for working CUDA compiler: /usr/local/cuda/bin/nvcc -- works
-- Detecting CUDA compiler ABI info
-- Detecting CUDA compiler ABI info - done
-- Configuring done
-- Generating done
-- Build files have been written to: /hy-tmp/fastllm-master/build
(cuda12_1) root@I19359398490090128f:/hy-tmp/fastllm-master/build# make -j
Scanning dependencies of target fastllm
Scanning dependencies of target fastllm_tools
[ 1%] Building CXX object CMakeFiles/fastllm.dir/src/fastllm.cpp.o
[ 3%] Building CXX object CMakeFiles/fastllm.dir/src/models/moss.cpp.o
[ 5%] Building CXX object CMakeFiles/fastllm.dir/src/models/llama.cpp.o
[ 6%] Building CXX object CMakeFiles/fastllm.dir/src/model.cpp.o
[ 10%] Building CXX object CMakeFiles/fastllm_tools.dir/src/executor.cpp.o
[ 10%] Building CXX object CMakeFiles/fastllm_tools.dir/src/models/chatglm.cpp.o
[ 12%] Building CXX object CMakeFiles/fastllm.dir/src/device.cpp.o
[ 13%] Building CXX object CMakeFiles/fastllm_tools.dir/src/fastllm.cpp.o
[ 15%] Building CXX object CMakeFiles/fastllm.dir/src/models/chatglm.cpp.o
[ 18%] Building CXX object CMakeFiles/fastllm.dir/src/devices/cpu/cpudevicebatch.cpp.o
[ 18%] Building CXX object CMakeFiles/fastllm_tools.dir/src/model.cpp.o
[ 20%] Building CXX object CMakeFiles/fastllm.dir/src/executor.cpp.o
[ 22%] Building CXX object CMakeFiles/fastllm.dir/src/models/glm.cpp.o
[ 24%] Building CXX object CMakeFiles/fastllm_tools.dir/src/device.cpp.o
[ 25%] Building CXX object CMakeFiles/fastllm_tools.dir/src/models/llama.cpp.o
[ 27%] Building CXX object CMakeFiles/fastllm.dir/src/devices/cpu/cpudevice.cpp.o
[ 31%] Building CXX object CMakeFiles/fastllm_tools.dir/src/models/moss.cpp.o
[ 31%] Building CXX object CMakeFiles/fastllm_tools.dir/src/devices/cpu/cpudevicebatch.cpp.o
[ 32%] Building CXX object CMakeFiles/fastllm.dir/src/models/deepseekv2.cpp.o
[ 34%] Building CXX object CMakeFiles/fastllm.dir/src/models/basellm.cpp.o
[ 36%] Building CXX object CMakeFiles/fastllm.dir/src/template.cpp.o
[ 37%] Building CXX object CMakeFiles/fastllm.dir/src/models/minicpm.cpp.o
[ 39%] Building CXX object CMakeFiles/fastllm_tools.dir/src/template.cpp.o
[ 43%] Building CXX object CMakeFiles/fastllm.dir/src/models/bert.cpp.o
[ 43%] Building CXX object CMakeFiles/fastllm_tools.dir/src/devices/cpu/cpudevice.cpp.o
[ 46%] Building CXX object CMakeFiles/fastllm_tools.dir/src/models/basellm.cpp.o
[ 46%] Building CXX object CMakeFiles/fastllm_tools.dir/src/models/glm.cpp.o
[ 48%] Building CXX object CMakeFiles/fastllm.dir/src/models/qwen.cpp.o
[ 50%] Building CXX object CMakeFiles/fastllm_tools.dir/src/models/qwen.cpp.o
[ 51%] Building CXX object CMakeFiles/fastllm_tools.dir/src/devices/cuda/cudadevicebatch.cpp.o
[ 55%] Building CXX object CMakeFiles/fastllm.dir/src/models/internlm2.cpp.o
[ 55%] Building CXX object CMakeFiles/fastllm_tools.dir/src/models/minicpm.cpp.o
[ 60%] Building CXX object CMakeFiles/fastllm_tools.dir/src/models/internlm2.cpp.o
[ 60%] Building CXX object CMakeFiles/fastllm_tools.dir/src/models/bert.cpp.o
[ 60%] Building CXX object CMakeFiles/fastllm.dir/src/models/moe.cpp.o
[ 62%] Building CXX object CMakeFiles/fastllm_tools.dir/third_party/json11/json11.cpp.o
[ 63%] Building CUDA object CMakeFiles/fastllm.dir/src/devices/cuda/fastllm-cuda.cu.o
[ 68%] Building CXX object CMakeFiles/fastllm_tools.dir/tools/src/pytools.cpp.o
[ 68%] Building CXX object CMakeFiles/fastllm_tools.dir/src/models/moe.cpp.o
[ 68%] Building CUDA object CMakeFiles/fastllm_tools.dir/src/devices/cuda/fastllm-cuda.cu.o
[ 74%] Building CXX object CMakeFiles/fastllm.dir/third_party/json11/json11.cpp.o
[ 74%] Building CXX object CMakeFiles/fastllm_tools.dir/src/models/deepseekv2.cpp.o
[ 74%] Building CXX object CMakeFiles/fastllm.dir/src/devices/cuda/cudadevicebatch.cpp.o
[ 77%] Building CXX object CMakeFiles/fastllm.dir/src/devices/cuda/cudadevice.cpp.o
[ 77%] Building CXX object CMakeFiles/fastllm_tools.dir/src/devices/cuda/cudadevice.cpp.o
/hy-tmp/fastllm-master/src/devices/cuda/fastllm-cuda.cu(1032): warning #177-D: variable "baseB" was declared but never referenced
const uint8_t baseB = B + p m;
^
Remark: The warnings can be suppressed with "-diag-suppress "
/hy-tmp/fastllm-master/src/devices/cuda/fastllm-cuda.cu(1023): warning #177-D: variable "regA" was declared but never referenced
float4 regA;
^
/hy-tmp/fastllm-master/src/devices/cuda/fastllm-cuda.cu(1024): warning #177-D: variable "regB" was declared but never referenced
union_char4 regB;
^
/hy-tmp/fastllm-master/src/devices/cuda/fastllm-cuda.cu(603): error: identifier "__hsub" is undefined
output[i] = hexp(__hsub(input[i], maxV));
^
detected during instantiation of "void FastllmSoftmaxKernelInner1(half *, half *, int, int) [with THREAD_PER_BLOCK=1]" at line 2937
/hy-tmp/fastllm-master/src/devices/cuda/fastllm-cuda.cu(603): error: identifier "hexp" is undefined
output[i] = hexp(__hsub(input[i], maxV));
^
detected during instantiation of "void FastllmSoftmaxKernelInner1(half *, half *, int, int) [with THREAD_PER_BLOCK=1]" at line 2937
/hy-tmp/fastllm-master/src/devices/cuda/fastllm-cuda.cu(623): error: identifier "__hdiv" is undefined
output[i] = __hdiv(output[i], sdata[0]);
^
detected during:
instantiation of "void FastllmSoftmaxKernelInner1Func(half *, half *, int) [with THREAD_PER_BLOCK=1]" at line 637
instantiation of "void FastllmSoftmaxKernelInner1(half *, half *, int, int) [with THREAD_PER_BLOCK=1]" at line 2937
/hy-tmp/fastllm-master/src/devices/cuda/fastllm-cuda.cu(623): error: identifier "__hdiv" is undefined
output[i] = __hdiv(output[i], sdata[0]);
^
detected during:
instantiation of "void FastllmSoftmaxKernelInner1Func(half *, half *, int) [with THREAD_PER_BLOCK=8]" at line 637
instantiation of "void FastllmSoftmaxKernelInner1(half *, half *, int, int) [with THREAD_PER_BLOCK=8]" at line 2939
/hy-tmp/fastllm-master/src/devices/cuda/fastllm-cuda.cu(623): error: identifier "__hdiv" is undefined
output[i] = __hdiv(output[i], sdata[0]);
^
detected during:
instantiation of "void FastllmSoftmaxKernelInner1Func(half *, half *, int) [with THREAD_PER_BLOCK=64]" at line 637
instantiation of "void FastllmSoftmaxKernelInner1(half *, half *, int, int) [with THREAD_PER_BLOCK=64]" at line 2941
/hy-tmp/fastllm-master/src/devices/cuda/fastllm-cuda.cu(623): error: identifier "__hdiv" is undefined
output[i] = __hdiv(output[i], sdata[0]);
^
detected during:
instantiation of "void FastllmSoftmaxKernelInner1Func(half *, half *, int) [with THREAD_PER_BLOCK=256]" at line 637
instantiation of "void FastllmSoftmaxKernelInner1(half *, half *, int, int) [with THREAD_PER_BLOCK=256]" at line 2943
/hy-tmp/fastllm-master/src/devices/cuda/fastllm-cuda.cu(55): warning #177-D: variable "ST128_FP16_COUNT" was declared but never referenced
const size_t ST128_FP16_COUNT = 8;
^
12 errors detected in the compilation of "/hy-tmp/fastllm-master/src/devices/cuda/fastllm-cuda.cu".
make[2]: *** [CMakeFiles/fastllm_tools.dir/build.make:336: CMakeFiles/fastllm_tools.dir/src/devices/cuda/fastllm-cuda.cu.o] Error 1
make[2]: Waiting for unfinished jobs....
function "__half::operator long long() const" (declared at line 247 of /usr/local/cuda/bin/../targets/x86_64-linux/include/cuda_fp16.hpp)
function "__half::operator unsigned long long() const" (declared at line 250 of /usr/local/cuda/bin/../targets/x86_64-linux/include/cuda_fp16.hpp)
function "__half::operator __nv_bool() const" (declared at line 254 of /usr/local/cuda/bin/../targets/x86_64-linux/include/cuda_fp16.hpp)
b[idx] = __hmul(__hdiv(x, __hadd(__float2half(1.0), hexp(-x))), y);
^
/hy-tmp/fastllm-master/src/devices/cuda/fastllm-cuda.cu(1032): warning #177-D: variable "baseB" was declared but never referenced
const uint8_t baseB = B + p m;
^
Remark: The warnings can be suppressed with "-diag-suppress "
/hy-tmp/fastllm-master/src/devices/cuda/fastllm-cuda.cu(1023): warning #177-D: variable "regA" was declared but never referenced
float4 regA;
^
/hy-tmp/fastllm-master/src/devices/cuda/fastllm-cuda.cu(1024): warning #177-D: variable "regB" was declared but never referenced
union_char4 regB;
^
/hy-tmp/fastllm-master/src/devices/cuda/fastllm-cuda.cu(603): error: identifier "__hsub" is undefined
output[i] = hexp(__hsub(input[i], maxV));
^
detected during instantiation of "void FastllmSoftmaxKernelInner1(half *, half *, int, int) [with THREAD_PER_BLOCK=1]" at line 2937
/hy-tmp/fastllm-master/src/devices/cuda/fastllm-cuda.cu(603): error: identifier "hexp" is undefined
output[i] = hexp(__hsub(input[i], maxV));
^
detected during instantiation of "void FastllmSoftmaxKernelInner1(half *, half *, int, int) [with THREAD_PER_BLOCK=1]" at line 2937
/hy-tmp/fastllm-master/src/devices/cuda/fastllm-cuda.cu(623): error: identifier "__hdiv" is undefined
output[i] = __hdiv(output[i], sdata[0]);
^
detected during:
instantiation of "void FastllmSoftmaxKernelInner1Func(half *, half *, int) [with THREAD_PER_BLOCK=1]" at line 637
instantiation of "void FastllmSoftmaxKernelInner1(half *, half *, int, int) [with THREAD_PER_BLOCK=1]" at line 2937
/hy-tmp/fastllm-master/src/devices/cuda/fastllm-cuda.cu(623): error: identifier "__hdiv" is undefined
output[i] = __hdiv(output[i], sdata[0]);
^
detected during:
instantiation of "void FastllmSoftmaxKernelInner1Func(half *, half *, int) [with THREAD_PER_BLOCK=8]" at line 637
instantiation of "void FastllmSoftmaxKernelInner1(half *, half *, int, int) [with THREAD_PER_BLOCK=8]" at line 2939
instantiation of "void FastllmSoftmaxKernelInner1Func(half *, half *, int) [with THREAD_PER_BLOCK=1]" at line 637
instantiation of "void FastllmSoftmaxKernelInner1(half *, half *, int, int) [with THREAD_PER_BLOCK=1]" at line 2937
/hy-tmp/fastllm-master/src/devices/cuda/fastllm-cuda.cu(623): error: identifier "__hdiv" is undefined
/hy-tmp/fastllm-master/src/devices/cuda/fastllm-cuda.cu(623): error: identifier "__hdiv" is undefined
output[i] = __hdiv(output[i], sdata[0]);
^
detected during:
instantiation of "void FastllmSoftmaxKernelInner1Func(half *, half *, int) [with THREAD_PER_BLOCK=8]" at line 637
instantiation of "void FastllmSoftmaxKernelInner1(half *, half *, int, int) [with THREAD_PER_BLOCK=8]" at line 2939
/hy-tmp/fastllm-master/src/devices/cuda/fastllm-cuda.cu(623): error: identifier "__hdiv" is undefined
output[i] = __hdiv(output[i], sdata[0]);
^
detected during:
instantiation of "void FastllmSoftmaxKernelInner1Func(half *, half *, int) [with THREAD_PER_BLOCK=64]" at line 637
instantiation of "void FastllmSoftmaxKernelInner1(half *, half *, int, int) [with THREAD_PER_BLOCK=64]" at line 2941
/hy-tmp/fastllm-master/src/devices/cuda/fastllm-cuda.cu(623): error: identifier "__hdiv" is undefined
output[i] = __hdiv(output[i], sdata[0]);
^
detected during:
instantiation of "void FastllmSoftmaxKernelInner1Func(half *, half *, int) [with THREAD_PER_BLOCK=256]" at line 637
instantiation of "void FastllmSoftmaxKernelInner1(half *, half *, int, int) [with THREAD_PER_BLOCK=256]" at line 2943
/hy-tmp/fastllm-master/src/devices/cuda/fastllm-cuda.cu(55): warning #177-D: variable "ST128_FP16_COUNT" was declared but never referenced
const size_t ST128_FP16_COUNT = 8;
^
12 errors detected in the compilation of "/hy-tmp/fastllm-master/src/devices/cuda/fastllm-cuda.cu".
make[2]: *** [CMakeFiles/fastllm.dir/build.make:336: CMakeFiles/fastllm.dir/src/devices/cuda/fastllm-cuda.cu.o] Error 1
make[2]: *** Waiting for unfinished jobs....
make[1]: *** [CMakeFiles/Makefile2:279: CMakeFiles/fastllm.dir/all] Error 2
make[1]: *** Waiting for unfinished jobs....
make[1]: *** [CMakeFiles/Makefile2:90: CMakeFiles/fastllm_tools.dir/all] Error 2
make: *** [Makefile:84: all] Error 2
I'm using CUDA 12.1, and make -j fails with errors. The full installation process is as follows:
/hy-tmp/fastllm-master/src/devices/cuda/fastllm-cuda.cu(249): error: identifier "__hdiv" is undefined
b[idx] = __hmul(__hdiv(x, __hadd(__float2half(1.0), hexp(-x))), y);
^
/hy-tmp/fastllm-master/src/devices/cuda/fastllm-cuda.cu(249): error: identifier "__hmul" is undefined
b[idx] = __hmul(__hdiv(x, __hadd(__float2half(1.0), hexp(-x))), y);
^
/hy-tmp/fastllm-master/src/devices/cuda/fastllm-cuda.cu(267): error: identifier "__hmul" is undefined
b[idx] = __hmul(a[idx], v);
^
/hy-tmp/fastllm-master/src/devices/cuda/fastllm-cuda.cu(295): error: identifier "__hmul" is undefined
a[idx] = __hadd(a[idx], __hmul(b[idx], alpha));
^
/hy-tmp/fastllm-master/src/devices/cuda/fastllm-cuda.cu(249): error: identifier "hexp" is undefined
b[idx] = __hmul(__hdiv(x, __hadd(__float2half(1.0), hexp(-x))), y);
^
/hy-tmp/fastllm-master/src/devices/cuda/fastllm-cuda.cu(249): error: identifier "__hdiv" is undefined
b[idx] = __hmul(__hdiv(x, __hadd(__float2half(1.0), hexp(-x))), y);
^
/hy-tmp/fastllm-master/src/devices/cuda/fastllm-cuda.cu(249): error: identifier "__hmul" is undefined
b[idx] = __hmul(__hdiv(x, __hadd(__float2half(1.0), hexp(-x))), y);
^
/hy-tmp/fastllm-master/src/devices/cuda/fastllm-cuda.cu(267): error: identifier "__hmul" is undefined
b[idx] = __hmul(a[idx], v);
^
/hy-tmp/fastllm-master/src/devices/cuda/fastllm-cuda.cu(295): error: identifier "__hmul" is undefined
a[idx] = __hadd(a[idx], __hmul(b[idx], alpha));
^