pcg-mlp / KsanaLLM


Exception during compilation under WSL2: error: ‘mha_varlen_fwd’ was not declared in this scope #21

Closed old-tomato closed 3 months ago

old-tomato commented 3 months ago

An exception occurred while compiling under WSL2:

/mnt/e/KsanaLLM/src/ksana_llm/kernels/nvidia/kernel_wrapper.cpp:243:7: error: there are no arguments to ‘mha_varlen_fwd’ that depend on a template parameter, so a declaration of ‘mha_varlen_fwd’ must be available [-fpermissive]
243 | mha_varlen_fwd(q_tmp_tensor, torch::reshape(k_tensor, {total_tokens, num_kv_heads, head_size}),
| ^~~~~~
/mnt/e/KsanaLLM/src/ksana_llm/kernels/nvidia/kernel_wrapper.cpp:243:7: note: (if you use ‘-fpermissive’, G++ will accept your code, but allowing the use of an undeclared name is deprecated)
/mnt/e/KsanaLLM/src/ksana_llm/kernels/nvidia/kernel_wrapper.cpp: In instantiation of ‘void ksana_llm::AttenVarlen(void, void, void, void, void, llm_kernels::nvidia::RotaryEmbeddingCuda&, int, int, int, int, int, int, int, int, bool, int, int, void, void, void, void, const std::optional<void>&, cudaStream_t) [with T = float; cudaStream_t = CUstream_st]’:
/mnt/e/KsanaLLM/src/ksana_llm/kernels/nvidia/kernel_wrapper.cpp:270:1: required from here
/mnt/e/KsanaLLM/src/ksana_llm/kernels/nvidia/kernel_wrapper.cpp:243:21: error: ‘mha_varlen_fwd’ was not declared in this scope
243 | mha_varlen_fwd(q_tmp_tensor, torch::reshape(k_tensor, {total_tokens, num_kv_heads, head_size}),
| ~~~~^~~~~~~~~~~~~~~~~~~
244 | torch::reshape(tt[2], {total_tokens, num_kv_heads, head_size}), out_tensor,
| ~~~~~~~~~~~~~~~
245 | seqlen_tensor.to(torch::kInt32), seqlen_tensor.to(torch::kInt32), seqused_k, alibi_slopes_tensor,
| ~~~~~~~~~~~~~~~~~~~~~
246 | max_tokens, max_tokens, 0.f, 1.0 / sqrt(head_size), false, is_causal, -1, -1, false, c10::nullopt);
| ~~~~~~~~~~~~~~~~~~~~~~
/mnt/e/KsanaLLM/src/ksana_llm/kernels/nvidia/kernel_wrapper.cpp: In instantiation of ‘void ksana_llm::AttenVarlen(void, void, void, void, void, llm_kernels::nvidia::RotaryEmbeddingCuda&, int, int, int, int, int, int, int, int, bool, int, int, void, void, void, void, const std::optional<void>&, cudaStream_t) [with T = __half; cudaStream_t = CUstream_st]’:
/mnt/e/KsanaLLM/src/ksana_llm/kernels/nvidia/kernel_wrapper.cpp:271:1: required from here
/mnt/e/KsanaLLM/src/ksana_llm/kernels/nvidia/kernel_wrapper.cpp:243:21: error: ‘mha_varlen_fwd’ was not declared in this scope
/mnt/e/KsanaLLM/src/ksana_llm/kernels/nvidia/kernel_wrapper.cpp: In instantiation of ‘void ksana_llm::AttenVarlen(void, void, void, void, void, llm_kernels::nvidia::RotaryEmbeddingCuda&, int, int, int, int, int, int, int, int, bool, int, int, void, void, void, void, const std::optional<void>&, cudaStream_t) [with T = __nv_bfloat16; cudaStream_t = CUstream_st*]’:
/mnt/e/KsanaLLM/src/ksana_llm/kernels/nvidia/kernel_wrapper.cpp:273:1: required from here
/mnt/e/KsanaLLM/src/ksana_llm/kernels/nvidia/kernel_wrapper.cpp:243:21: error: ‘mha_varlen_fwd’ was not declared in this scope
make[2]: [src/ksana_llm/kernels/CMakeFiles/kernels.dir/build.make:76: src/ksana_llm/kernels/CMakeFiles/kernels.dir/nvidia/kernel_wrapper.cpp.o] Error 1
make[1]: [CMakeFiles/Makefile2:2511: src/ksana_llm/kernels/CMakeFiles/kernels.dir/all] Error 2

Environment: python 3.10.13, gcc 11.4.0

Package Version
absl-py 2.1.0
aiohttp 3.9.5
aiosignal 1.3.1
annotated-types 0.7.0
anyio 4.4.0
async-timeout 4.0.3
attrs 23.2.0
certifi 2024.7.4
charset-normalizer 3.3.2
click 8.1.7
einops 0.8.0
exceptiongroup 1.2.2
fastapi 0.110.0
filelock 3.15.4
flash-attn 2.2.1
frozenlist 1.4.1
fsspec 2024.6.1
h11 0.14.0
huggingface-hub 0.24.3
idna 3.7
Jinja2 3.1.4
joblib 1.4.2
MarkupSafe 2.1.5
mpmath 1.3.0
msgpack 1.0.8
multidict 6.0.5
networkx 3.3
ninja 1.11.1.1
nltk 3.8.1
numpy 2.0.1
nvidia-cublas-cu12 12.1.3.1
nvidia-cuda-cupti-cu12 12.1.105
nvidia-cuda-nvrtc-cu12 12.1.105
nvidia-cuda-runtime-cu12 12.1.105
nvidia-cudnn-cu12 9.1.0.70
nvidia-cufft-cu12 11.0.2.54
nvidia-curand-cu12 10.3.2.106
nvidia-cusolver-cu12 11.4.5.107
nvidia-cusparse-cu12 12.1.0.106
nvidia-nccl-cu12 2.20.5
nvidia-nvjitlink-cu12 12.5.82
nvidia-nvtx-cu12 12.1.105
packaging 24.1
pillow 10.4.0
pip 24.0
protobuf 4.24.4
pydantic 2.8.2
pydantic_core 2.20.1
PyYAML 6.0.1
regex 2024.7.24
requests 2.32.3
rouge_score 0.1.2
safetensors 0.4.3
sentencepiece 0.2.0
setuptools 69.5.1
six 1.16.0
sniffio 1.3.1
starlette 0.36.3
sympy 1.13.1
tokenizers 0.15.2
torch 2.4.0
torchaudio 2.4.0
torchvision 0.19.0
tqdm 4.66.4
transformers 4.39.2
triton 3.0.0
typing_extensions 4.12.2
urllib3 2.2.2
uvicorn 0.29.0
wheel 0.43.0
yarl 1.9.4

nvcc -V
nvcc: NVIDIA (R) Cuda compiler driver
Copyright (c) 2005-2023 NVIDIA Corporation
Built on Tue_Feb__7_19:32:13_PST_2023
Cuda compilation tools, release 12.1, V12.1.66
Build cuda_12.1.r12.1/compiler.32415258_0

cmake version 3.22.1

old-tomato commented 3 months ago

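Found the problem: it was caused by the flash_attn installation. I suggest raising an error when the flash_attn check fails during the build; otherwise the failure is easy to overlook.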
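To illustrate the kind of loud failure suggested above, here is a minimal sketch in Python. It is not KsanaLLM's actual build check, whose details are not shown in this thread; the helper name `require_module` is hypothetical. It only assumes that the prerequisite can be tested by importing it with the same Python interpreter the build uses.

```python
import importlib


def require_module(name: str):
    """Import `name` and return the module, or fail with a clear error.

    Hypothetical helper for illustration only; not part of KsanaLLM.
    """
    try:
        return importlib.import_module(name)
    except ImportError as exc:
        # ImportError also covers ModuleNotFoundError and extension
        # modules that are present but fail to load.
        raise RuntimeError(
            f"Build prerequisite '{name}' could not be imported: {exc}. "
            "Install or reinstall it before configuring the build."
        ) from exc
```

A build script could call `require_module("flash_attn")` before compiling. If the import fails, the uncaught `RuntimeError` makes the Python process exit with a non-zero status, which the calling build step can treat as fatal instead of continuing to the compile stage.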