ztxz16 / fastllm

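A pure C++ cross-platform LLM acceleration library with Python bindings. ChatGLM-6B-class models can reach 10000+ tokens/s on a single GPU. Supports GLM, LLaMA, and MOSS base models, and runs smoothly on mobile devices.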
Apache License 2.0
3.2k stars 322 forks

Error during make -j #459

Open AIlaowong opened 1 month ago

AIlaowong commented 1 month ago

I'm using CUDA 12.1 and get errors during make -j. The full build process is as follows:

(cuda12_1) root@I19359398490090128f:/hy-tmp# cd fastllm-master/
(cuda12_1) root@I19359398490090128f:/hy-tmp/fastllm-master# mkdir build
(cuda12_1) root@I19359398490090128f:/hy-tmp/fastllm-master# cd build
(cuda12_1) root@I19359398490090128f:/hy-tmp/fastllm-master/build# cmake .. -DUSE_CUDA=ON
-- The CXX compiler identification is GNU 9.4.0
-- Check for working CXX compiler: /usr/bin/c++
-- Check for working CXX compiler: /usr/bin/c++ -- works
-- Detecting CXX compiler ABI info
-- Detecting CXX compiler ABI info - done
-- Detecting CXX compile features
-- Detecting CXX compile features - done
-- USE_CUDA: ON
-- USE_TFACC: OFF
-- For legacy CUDA GPUs: OFF
-- PYTHON_API: OFF
-- BUILD_CLI: OFF
-- USE_SENTENCEPIECE: OFF
-- USE_IVCOREX: OFF
-- CMAKE_CXX_FLAGS: -pthread --std=c++17 -O2 -march=native
-- The CUDA compiler identification is NVIDIA 12.1.105
-- Check for working CUDA compiler: /usr/local/cuda/bin/nvcc
-- Check for working CUDA compiler: /usr/local/cuda/bin/nvcc -- works
-- Detecting CUDA compiler ABI info
-- Detecting CUDA compiler ABI info - done
-- Configuring done
-- Generating done
-- Build files have been written to: /hy-tmp/fastllm-master/build
(cuda12_1) root@I19359398490090128f:/hy-tmp/fastllm-master/build# make -j
Scanning dependencies of target fastllm
Scanning dependencies of target fastllm_tools
[ 1%] Building CXX object CMakeFiles/fastllm.dir/src/fastllm.cpp.o
[ 3%] Building CXX object CMakeFiles/fastllm.dir/src/models/moss.cpp.o
[ 5%] Building CXX object CMakeFiles/fastllm.dir/src/models/llama.cpp.o
[ 6%] Building CXX object CMakeFiles/fastllm.dir/src/model.cpp.o
[ 10%] Building CXX object CMakeFiles/fastllm_tools.dir/src/executor.cpp.o
[ 10%] Building CXX object CMakeFiles/fastllm_tools.dir/src/models/chatglm.cpp.o
[ 12%] Building CXX object CMakeFiles/fastllm.dir/src/device.cpp.o
[ 13%] Building CXX object CMakeFiles/fastllm_tools.dir/src/fastllm.cpp.o
[ 15%] Building CXX object CMakeFiles/fastllm.dir/src/models/chatglm.cpp.o
[ 18%] Building CXX object CMakeFiles/fastllm.dir/src/devices/cpu/cpudevicebatch.cpp.o
[ 18%] Building CXX object CMakeFiles/fastllm_tools.dir/src/model.cpp.o
[ 20%] Building CXX object CMakeFiles/fastllm.dir/src/executor.cpp.o
[ 22%] Building CXX object CMakeFiles/fastllm.dir/src/models/glm.cpp.o
[ 24%] Building CXX object CMakeFiles/fastllm_tools.dir/src/device.cpp.o
[ 25%] Building CXX object CMakeFiles/fastllm_tools.dir/src/models/llama.cpp.o
[ 27%] Building CXX object CMakeFiles/fastllm.dir/src/devices/cpu/cpudevice.cpp.o
[ 31%] Building CXX object CMakeFiles/fastllm_tools.dir/src/models/moss.cpp.o
[ 31%] Building CXX object CMakeFiles/fastllm_tools.dir/src/devices/cpu/cpudevicebatch.cpp.o
[ 32%] Building CXX object CMakeFiles/fastllm.dir/src/models/deepseekv2.cpp.o
[ 34%] Building CXX object CMakeFiles/fastllm.dir/src/models/basellm.cpp.o
[ 36%] Building CXX object CMakeFiles/fastllm.dir/src/template.cpp.o
[ 37%] Building CXX object CMakeFiles/fastllm.dir/src/models/minicpm.cpp.o
[ 39%] Building CXX object CMakeFiles/fastllm_tools.dir/src/template.cpp.o
[ 43%] Building CXX object CMakeFiles/fastllm.dir/src/models/bert.cpp.o
[ 43%] Building CXX object CMakeFiles/fastllm_tools.dir/src/devices/cpu/cpudevice.cpp.o
[ 46%] Building CXX object CMakeFiles/fastllm_tools.dir/src/models/basellm.cpp.o
[ 46%] Building CXX object CMakeFiles/fastllm_tools.dir/src/models/glm.cpp.o
[ 48%] Building CXX object CMakeFiles/fastllm.dir/src/models/qwen.cpp.o
[ 50%] Building CXX object CMakeFiles/fastllm_tools.dir/src/models/qwen.cpp.o
[ 51%] Building CXX object CMakeFiles/fastllm_tools.dir/src/devices/cuda/cudadevicebatch.cpp.o
[ 55%] Building CXX object CMakeFiles/fastllm.dir/src/models/internlm2.cpp.o
[ 55%] Building CXX object CMakeFiles/fastllm_tools.dir/src/models/minicpm.cpp.o
[ 60%] Building CXX object CMakeFiles/fastllm_tools.dir/src/models/internlm2.cpp.o
[ 60%] Building CXX object CMakeFiles/fastllm_tools.dir/src/models/bert.cpp.o
[ 60%] Building CXX object CMakeFiles/fastllm.dir/src/models/moe.cpp.o
[ 62%] Building CXX object CMakeFiles/fastllm_tools.dir/third_party/json11/json11.cpp.o
[ 63%] Building CUDA object CMakeFiles/fastllm.dir/src/devices/cuda/fastllm-cuda.cu.o
[ 68%] Building CXX object CMakeFiles/fastllm_tools.dir/tools/src/pytools.cpp.o
[ 68%] Building CXX object CMakeFiles/fastllm_tools.dir/src/models/moe.cpp.o
[ 68%] Building CUDA object CMakeFiles/fastllm_tools.dir/src/devices/cuda/fastllm-cuda.cu.o
[ 74%] Building CXX object CMakeFiles/fastllm.dir/third_party/json11/json11.cpp.o
[ 74%] Building CXX object CMakeFiles/fastllm_tools.dir/src/models/deepseekv2.cpp.o
[ 74%] Building CXX object CMakeFiles/fastllm.dir/src/devices/cuda/cudadevicebatch.cpp.o
[ 77%] Building CXX object CMakeFiles/fastllm.dir/src/devices/cuda/cudadevice.cpp.o
[ 77%] Building CXX object CMakeFiles/fastllm_tools.dir/src/devices/cuda/cudadevice.cpp.o

                                                          ^

/hy-tmp/fastllm-master/src/devices/cuda/fastllm-cuda.cu(249): error: identifier "__hdiv" is undefined
      b[idx] = __hmul(__hdiv(x, __hadd(__float2half(1.0), hexp(-x))), y);
                      ^

/hy-tmp/fastllm-master/src/devices/cuda/fastllm-cuda.cu(249): error: identifier "__hmul" is undefined
      b[idx] = __hmul(__hdiv(x, __hadd(__float2half(1.0), hexp(-x))), y);
               ^

/hy-tmp/fastllm-master/src/devices/cuda/fastllm-cuda.cu(267): error: identifier "__hmul" is undefined
      b[idx] = __hmul(a[idx], v);
               ^

/hy-tmp/fastllm-master/src/devices/cuda/fastllm-cuda.cu(295): error: identifier "__hmul" is undefined
      a[idx] = __hadd(a[idx], __hmul(b[idx], alpha));
                              ^

/hy-tmp/fastllm-master/src/devices/cuda/fastllm-cuda.cu(1032): warning #177-D: variable "baseB" was declared but never referenced
      const uint8_t *baseB = B + p * m;
                     ^

Remark: The warnings can be suppressed with "-diag-suppress <warning-number>"

/hy-tmp/fastllm-master/src/devices/cuda/fastllm-cuda.cu(1023): warning #177-D: variable "regA" was declared but never referenced
      float4 regA;
             ^

/hy-tmp/fastllm-master/src/devices/cuda/fastllm-cuda.cu(1024): warning #177-D: variable "regB" was declared but never referenced
      union_char4 regB;
                  ^

/hy-tmp/fastllm-master/src/devices/cuda/fastllm-cuda.cu(603): error: identifier "__hsub" is undefined
      output[i] = hexp(__hsub(input[i], maxV));
                       ^
          detected during instantiation of "void FastllmSoftmaxKernelInner1(half *, half *, int, int) [with THREAD_PER_BLOCK=1]" at line 2937

/hy-tmp/fastllm-master/src/devices/cuda/fastllm-cuda.cu(603): error: identifier "hexp" is undefined
      output[i] = hexp(__hsub(input[i], maxV));
                  ^
          detected during instantiation of "void FastllmSoftmaxKernelInner1(half *, half *, int, int) [with THREAD_PER_BLOCK=1]" at line 2937

/hy-tmp/fastllm-master/src/devices/cuda/fastllm-cuda.cu(623): error: identifier "__hdiv" is undefined
      output[i] = __hdiv(output[i], sdata[0]);
                  ^
          detected during:
            instantiation of "void FastllmSoftmaxKernelInner1Func(half *, half *, int) [with THREAD_PER_BLOCK=1]" at line 637
            instantiation of "void FastllmSoftmaxKernelInner1(half *, half *, int, int) [with THREAD_PER_BLOCK=1]" at line 2937

/hy-tmp/fastllm-master/src/devices/cuda/fastllm-cuda.cu(623): error: identifier "__hdiv" is undefined
      output[i] = __hdiv(output[i], sdata[0]);
                  ^
          detected during:
            instantiation of "void FastllmSoftmaxKernelInner1Func(half *, half *, int) [with THREAD_PER_BLOCK=8]" at line 637
            instantiation of "void FastllmSoftmaxKernelInner1(half *, half *, int, int) [with THREAD_PER_BLOCK=8]" at line 2939

/hy-tmp/fastllm-master/src/devices/cuda/fastllm-cuda.cu(623): error: identifier "__hdiv" is undefined
      output[i] = __hdiv(output[i], sdata[0]);
                  ^
          detected during:
            instantiation of "void FastllmSoftmaxKernelInner1Func(half *, half *, int) [with THREAD_PER_BLOCK=64]" at line 637
            instantiation of "void FastllmSoftmaxKernelInner1(half *, half *, int, int) [with THREAD_PER_BLOCK=64]" at line 2941

/hy-tmp/fastllm-master/src/devices/cuda/fastllm-cuda.cu(623): error: identifier "__hdiv" is undefined
      output[i] = __hdiv(output[i], sdata[0]);
                  ^
          detected during:
            instantiation of "void FastllmSoftmaxKernelInner1Func(half *, half *, int) [with THREAD_PER_BLOCK=256]" at line 637
            instantiation of "void FastllmSoftmaxKernelInner1(half *, half *, int, int) [with THREAD_PER_BLOCK=256]" at line 2943

/hy-tmp/fastllm-master/src/devices/cuda/fastllm-cuda.cu(55): warning #177-D: variable "ST128_FP16_COUNT" was declared but never referenced
      const size_t ST128_FP16_COUNT = 8;
                   ^

12 errors detected in the compilation of "/hy-tmp/fastllm-master/src/devices/cuda/fastllm-cuda.cu".
make[2]: *** [CMakeFiles/fastllm_tools.dir/build.make:336: CMakeFiles/fastllm_tools.dir/src/devices/cuda/fastllm-cuda.cu.o] Error 1
make[2]: *** Waiting for unfinished jobs....

        function "__half::operator long long() const" (declared at line 247 of /usr/local/cuda/bin/../targets/x86_64-linux/include/cuda_fp16.hpp)
        function "__half::operator unsigned long long() const" (declared at line 250 of /usr/local/cuda/bin/../targets/x86_64-linux/include/cuda_fp16.hpp)
        function "__half::operator __nv_bool() const" (declared at line 254 of /usr/local/cuda/bin/../targets/x86_64-linux/include/cuda_fp16.hpp)
      b[idx] = __hmul(__hdiv(x, __hadd(__float2half(1.0), hexp(-x))), y);
                                                                ^

/hy-tmp/fastllm-master/src/devices/cuda/fastllm-cuda.cu(249): error: identifier "hexp" is undefined
      b[idx] = __hmul(__hdiv(x, __hadd(__float2half(1.0), hexp(-x))), y);
/hy-tmp/fastllm-master/src/devices/cuda/fastllm-cuda.cu(249): error: identifier "hexp" is undefined
      b[idx] = __hmul(__hdiv(x, __hadd(__float2half(1.0), hexp(-x))), y);
                                                          ^

/hy-tmp/fastllm-master/src/devices/cuda/fastllm-cuda.cu(249): error: identifier "__hdiv" is undefined
      b[idx] = __hmul(__hdiv(x, __hadd(__float2half(1.0), hexp(-x))), y);
                      ^

/hy-tmp/fastllm-master/src/devices/cuda/fastllm-cuda.cu(249): error: identifier "__hmul" is undefined
      b[idx] = __hmul(__hdiv(x, __hadd(__float2half(1.0), hexp(-x))), y);
               ^

/hy-tmp/fastllm-master/src/devices/cuda/fastllm-cuda.cu(267): error: identifier "__hmul" is undefined
      b[idx] = __hmul(a[idx], v);
      b[idx] = __hmul(a[idx], v);
               ^

/hy-tmp/fastllm-master/src/devices/cuda/fastllm-cuda.cu(295): error: identifier "__hmul" is undefined
      a[idx] = __hadd(a[idx], __hmul(b[idx], alpha));
                              ^

/hy-tmp/fastllm-master/src/devices/cuda/fastllm-cuda.cu(1032): warning #177-D: variable "baseB" was declared but never referenced
      const uint8_t *baseB = B + p * m;
                     ^

Remark: The warnings can be suppressed with "-diag-suppress <warning-number>"

/hy-tmp/fastllm-master/src/devices/cuda/fastllm-cuda.cu(1023): warning #177-D: variable "regA" was declared but never referenced
      float4 regA;
             ^

/hy-tmp/fastllm-master/src/devices/cuda/fastllm-cuda.cu(1024): warning #177-D: variable "regB" was declared but never referenced
      union_char4 regB;
                  ^

/hy-tmp/fastllm-master/src/devices/cuda/fastllm-cuda.cu(603): error: identifier "__hsub" is undefined
      output[i] = hexp(__hsub(input[i], maxV));
                       ^
          detected during instantiation of "void FastllmSoftmaxKernelInner1(half *, half *, int, int) [with THREAD_PER_BLOCK=1]" at line 2937

/hy-tmp/fastllm-master/src/devices/cuda/fastllm-cuda.cu(603): error: identifier "hexp" is undefined
      output[i] = hexp(__hsub(input[i], maxV));
                  ^
          detected during instantiation of "void FastllmSoftmaxKernelInner1(half *, half *, int, int) [with THREAD_PER_BLOCK=1]" at line 2937

/hy-tmp/fastllm-master/src/devices/cuda/fastllm-cuda.cu(623): error: identifier "__hdiv" is undefined
      output[i] = __hdiv(output[i], sdata[0]);
                  ^
          detected during:
            instantiation of "void FastllmSoftmaxKernelInner1Func(half *, half *, int) [with THREAD_PER_BLOCK=1]" at line 637
            instantiation of "void FastllmSoftmaxKernelInner1(half *, half *, int, int) [with THREAD_PER_BLOCK=1]" at line 2937

/hy-tmp/fastllm-master/src/devices/cuda/fastllm-cuda.cu(623): error: identifier "__hdiv" is undefined
      output[i] = __hdiv(output[i], sdata[0]);
                  ^
          detected during:
            instantiation of "void FastllmSoftmaxKernelInner1Func(half *, half *, int) [with THREAD_PER_BLOCK=8]" at line 637
            instantiation of "void FastllmSoftmaxKernelInner1(half *, half *, int, int) [with THREAD_PER_BLOCK=8]" at line 2939
            instantiation of "void FastllmSoftmaxKernelInner1Func(half *, half *, int) [with THREAD_PER_BLOCK=1]" at line 637
            instantiation of "void FastllmSoftmaxKernelInner1(half *, half *, int, int) [with THREAD_PER_BLOCK=1]" at line 2937

/hy-tmp/fastllm-master/src/devices/cuda/fastllm-cuda.cu(623): error: identifier "__hdiv" is undefined

/hy-tmp/fastllm-master/src/devices/cuda/fastllm-cuda.cu(623): error: identifier "__hdiv" is undefined
      output[i] = __hdiv(output[i], sdata[0]);
                  ^
          detected during:
            instantiation of "void FastllmSoftmaxKernelInner1Func(half *, half *, int) [with THREAD_PER_BLOCK=8]" at line 637
            instantiation of "void FastllmSoftmaxKernelInner1(half *, half *, int, int) [with THREAD_PER_BLOCK=8]" at line 2939

/hy-tmp/fastllm-master/src/devices/cuda/fastllm-cuda.cu(623): error: identifier "__hdiv" is undefined
      output[i] = __hdiv(output[i], sdata[0]);
                  ^
          detected during:
            instantiation of "void FastllmSoftmaxKernelInner1Func(half *, half *, int) [with THREAD_PER_BLOCK=64]" at line 637
            instantiation of "void FastllmSoftmaxKernelInner1(half *, half *, int, int) [with THREAD_PER_BLOCK=64]" at line 2941

/hy-tmp/fastllm-master/src/devices/cuda/fastllm-cuda.cu(623): error: identifier "__hdiv" is undefined
      output[i] = __hdiv(output[i], sdata[0]);
                  ^
          detected during:
            instantiation of "void FastllmSoftmaxKernelInner1Func(half *, half *, int) [with THREAD_PER_BLOCK=256]" at line 637
            instantiation of "void FastllmSoftmaxKernelInner1(half *, half *, int, int) [with THREAD_PER_BLOCK=256]" at line 2943

/hy-tmp/fastllm-master/src/devices/cuda/fastllm-cuda.cu(55): warning #177-D: variable "ST128_FP16_COUNT" was declared but never referenced
      const size_t ST128_FP16_COUNT = 8;
                   ^

12 errors detected in the compilation of "/hy-tmp/fastllm-master/src/devices/cuda/fastllm-cuda.cu".
make[2]: *** [CMakeFiles/fastllm.dir/build.make:336: CMakeFiles/fastllm.dir/src/devices/cuda/fastllm-cuda.cu.o] Error 1
make[2]: *** Waiting for unfinished jobs....
make[1]: *** [CMakeFiles/Makefile2:279: CMakeFiles/fastllm.dir/all] Error 2
make[1]: *** Waiting for unfinished jobs....
make[1]: *** [CMakeFiles/Makefile2:90: CMakeFiles/fastllm_tools.dir/all] Error 2
make: *** [Makefile:84: all] Error 2

ztxz16 commented 1 month ago

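Usually this means cmake did not detect the CUDA architecture. You need to edit CMAKE_CUDA_ARCHITECTURES in CMakeLists.txt and set it to your GPU's compute capability (typically something like 80 or 90).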
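
A minimal sketch of how one might find the value to use, assuming `nvidia-smi` is on PATH and supports the `compute_cap` query field (older drivers may not). `cap_to_arch` is a hypothetical helper, and the `8.0` fallback is only a placeholder. The script prints a `set(...)` line to put in CMakeLists.txt. Where that line belongs in fastllm's CMakeLists.txt is not stated in this thread.

```shell
# Hypothetical helper: turn a compute capability such as "8.6" into the
# value CMAKE_CUDA_ARCHITECTURES expects ("86").
cap_to_arch() {
    printf '%s\n' "$1" | tr -d '. '
}

# Assumption: nvidia-smi exists and understands --query-gpu=compute_cap.
if command -v nvidia-smi >/dev/null 2>&1; then
    cap=$(nvidia-smi --query-gpu=compute_cap --format=csv,noheader | head -n 1)
else
    cap="8.0"   # placeholder, replace with your GPU's compute capability
fi

arch=$(cap_to_arch "$cap")

# Print the line to place in CMakeLists.txt instead of editing the file.
echo "set(CMAKE_CUDA_ARCHITECTURES $arch)"
```

After changing CMakeLists.txt, it is probably safest to delete the old build directory and rerun `cmake .. -DUSE_CUDA=ON` so no cached settings are reused.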

zhang-xh95 commented 1 week ago

Has this been solved? I ran into the same problem, and I'm also on CUDA 12.1.

zhang-xh95 commented 1 week ago

I upgraded cmake and recompiled, and it was OK.
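
The thread does not say which cmake versions were involved. As a rough check, the sketch below compares the installed cmake against 3.18, the release in which the `CMAKE_CUDA_ARCHITECTURES` variable was introduced. `version_at_least` is a hypothetical helper that relies on GNU `sort -V`, and treating 3.18 as the threshold for this particular failure is an assumption, not something confirmed here.

```shell
# Hypothetical helper: succeed if version $1 is at least version $2.
# Relies on GNU sort's -V (version sort) option.
version_at_least() {
    [ "$(printf '%s\n%s\n' "$2" "$1" | sort -V | head -n 1)" = "$2" ]
}

required="3.18"   # CMAKE_CUDA_ARCHITECTURES was introduced in CMake 3.18

if command -v cmake >/dev/null 2>&1; then
    # First line of the output looks like: "cmake version 3.16.3"
    current=$(cmake --version | head -n 1 | awk '{print $3}')
    if version_at_least "$current" "$required"; then
        echo "cmake $current understands CMAKE_CUDA_ARCHITECTURES"
    else
        echo "cmake $current predates CMAKE_CUDA_ARCHITECTURES; consider upgrading"
    fi
else
    echo "cmake not found on PATH"
fi
```

If the check reports an older cmake, upgrading and rebuilding from an empty build directory matches what worked for the commenter above.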