An exception occurred during compilation under WSL2
/mnt/e/KsanaLLM/src/ksana_llm/kernels/nvidia/kernel_wrapper.cpp:243:7: error: there are no arguments to ‘mha_varlen_fwd’ that depend on a template parameter, so a declaration of ‘mha_varlen_fwd’ must be available [-fpermissive]
243 | mha_varlen_fwd(q_tmp_tensor, torch::reshape(k_tensor, {total_tokens, num_kv_heads, head_size}),
| ^~~~~~
/mnt/e/KsanaLLM/src/ksana_llm/kernels/nvidia/kernel_wrapper.cpp:243:7: note: (if you use ‘-fpermissive’, G++ will accept your code, but allowing the use of an undeclared name is deprecated)
/mnt/e/KsanaLLM/src/ksana_llm/kernels/nvidia/kernel_wrapper.cpp: In instantiation of ‘void ksana_llm::AttenVarlen(void*, void*, void*, void*, void*, llm_kernels::nvidia::RotaryEmbeddingCuda<T>&, int, int, int, int, int, int, int, int, bool, int, int, void*, void*, void*, void*, const std::optional<void*>&, cudaStream_t) [with T = float; cudaStream_t = CUstream_st*]’:
/mnt/e/KsanaLLM/src/ksana_llm/kernels/nvidia/kernel_wrapper.cpp:270:1: required from here
/mnt/e/KsanaLLM/src/ksana_llm/kernels/nvidia/kernel_wrapper.cpp:243:21: error: ‘mha_varlen_fwd’ was not declared in this scope
243 | mha_varlen_fwd(q_tmp_tensor, torch::reshape(k_tensor, {total_tokens, num_kv_heads, head_size}),
| ~~~~^~~~~~~~~~~~~~~~~~~
244 | torch::reshape(tt[2], {total_tokens, num_kv_heads, head_size}), out_tensor,
| ~~~~~~~~~~~~~~~
245 | seqlen_tensor.to(torch::kInt32), seqlen_tensor.to(torch::kInt32), seqused_k, alibi_slopes_tensor,
| ~~~~~~~~~~~~~~~~~~~~~
246 | max_tokens, max_tokens, 0.f, 1.0 / sqrt(head_size), false, is_causal, -1, -1, false, c10::nullopt);
| ~~~~~~~~~~~~~~~~~~~~~~
/mnt/e/KsanaLLM/src/ksana_llm/kernels/nvidia/kernel_wrapper.cpp: In instantiation of ‘void ksana_llm::AttenVarlen(void*, void*, void*, void*, void*, llm_kernels::nvidia::RotaryEmbeddingCuda<T>&, int, int, int, int, int, int, int, int, bool, int, int, void*, void*, void*, void*, const std::optional<void*>&, cudaStream_t) [with T = __half; cudaStream_t = CUstream_st*]’:
/mnt/e/KsanaLLM/src/ksana_llm/kernels/nvidia/kernel_wrapper.cpp:271:1: required from here
/mnt/e/KsanaLLM/src/ksana_llm/kernels/nvidia/kernel_wrapper.cpp:243:21: error: ‘mha_varlen_fwd’ was not declared in this scope
/mnt/e/KsanaLLM/src/ksana_llm/kernels/nvidia/kernel_wrapper.cpp: In instantiation of ‘void ksana_llm::AttenVarlen(void*, void*, void*, void*, void*, llm_kernels::nvidia::RotaryEmbeddingCuda<T>&, int, int, int, int, int, int, int, int, bool, int, int, void*, void*, void*, void*, const std::optional<void*>&, cudaStream_t) [with T = __nv_bfloat16; cudaStream_t = CUstream_st*]’:
/mnt/e/KsanaLLM/src/ksana_llm/kernels/nvidia/kernel_wrapper.cpp:273:1: required from here
/mnt/e/KsanaLLM/src/ksana_llm/kernels/nvidia/kernel_wrapper.cpp:243:21: error: ‘mha_varlen_fwd’ was not declared in this scope
make[2]: *** [src/ksana_llm/kernels/CMakeFiles/kernels.dir/build.make:76: src/ksana_llm/kernels/CMakeFiles/kernels.dir/nvidia/kernel_wrapper.cpp.o] Error 1
make[1]: *** [CMakeFiles/Makefile2:2511: src/ksana_llm/kernels/CMakeFiles/kernels.dir/all] Error 2
Environment: python 3.10.13, gcc 11.4.0
Package                    Version
absl-py                    2.1.0
aiohttp                    3.9.5
aiosignal                  1.3.1
annotated-types            0.7.0
anyio                      4.4.0
async-timeout              4.0.3
attrs                      23.2.0
certifi                    2024.7.4
charset-normalizer         3.3.2
click                      8.1.7
einops                     0.8.0
exceptiongroup             1.2.2
fastapi                    0.110.0
filelock                   3.15.4
flash-attn                 2.2.1
frozenlist                 1.4.1
fsspec                     2024.6.1
h11                        0.14.0
huggingface-hub            0.24.3
idna                       3.7
Jinja2                     3.1.4
joblib                     1.4.2
MarkupSafe                 2.1.5
mpmath                     1.3.0
msgpack                    1.0.8
multidict                  6.0.5
networkx                   3.3
ninja                      1.11.1.1
nltk                       3.8.1
numpy                      2.0.1
nvidia-cublas-cu12         12.1.3.1
nvidia-cuda-cupti-cu12     12.1.105
nvidia-cuda-nvrtc-cu12     12.1.105
nvidia-cuda-runtime-cu12   12.1.105
nvidia-cudnn-cu12          9.1.0.70
nvidia-cufft-cu12          11.0.2.54
nvidia-curand-cu12         10.3.2.106
nvidia-cusolver-cu12       11.4.5.107
nvidia-cusparse-cu12       12.1.0.106
nvidia-nccl-cu12           2.20.5
nvidia-nvjitlink-cu12      12.5.82
nvidia-nvtx-cu12           12.1.105
packaging                  24.1
pillow                     10.4.0
pip                        24.0
protobuf                   4.24.4
pydantic                   2.8.2
pydantic_core              2.20.1
PyYAML                     6.0.1
regex                      2024.7.24
requests                   2.32.3
rouge_score                0.1.2
safetensors                0.4.3
sentencepiece              0.2.0
setuptools                 69.5.1
six                        1.16.0
sniffio                    1.3.1
starlette                  0.36.3
sympy                      1.13.1
tokenizers                 0.15.2
torch                      2.4.0
torchaudio                 2.4.0
torchvision                0.19.0
tqdm                       4.66.4
transformers               4.39.2
triton                     3.0.0
typing_extensions          4.12.2
urllib3                    2.2.2
uvicorn                    0.29.0
wheel                      0.43.0
yarl                       1.9.4
nvcc -V
nvcc: NVIDIA (R) Cuda compiler driver
Copyright (c) 2005-2023 NVIDIA Corporation
Built on Tue_Feb__7_19:32:13_PST_2023
Cuda compilation tools, release 12.1, V12.1.66
Build cuda_12.1.r12.1/compiler.32415258_0
cmake version 3.22.1