Closed: taoqinghua closed this issue 5 months ago
If you are using the provided docker image with tag qwenllm/qwen(:latest), it is based on CUDA 11.7 and bundles the layer_norm module from flash attention v2, which is where the invalid device function error is raised (via cudaOccupancyMaxActiveBlocksPerMultiprocessor, a CUDA runtime API). It is likely your NVIDIA driver is too old to support CUDA 11.7 (and later versions). Please run nvidia-smi and provide the result.
The nvidia-smi driver query result is as follows; it looks like the driver should support CUDA 11.7. Could there be some other cause?

Wed Apr 10 06:16:11 2024
+---------------------------------------------------------------------------------------+
| NVIDIA-SMI 545.23.06 Driver Version: 545.23.06 CUDA Version: 12.3 |
|-----------------------------------------+----------------------+----------------------+
| GPU Name Persistence-M | Bus-Id Disp.A | Volatile Uncorr. ECC |
| Fan Temp Perf Pwr:Usage/Cap | Memory-Usage | GPU-Util Compute M. |
| | | MIG M. |
|=========================================+======================+======================|
| 0 Tesla P100-PCIE-16GB Off | 00000000:44:00.0 Off | 0 |
| N/A 27C P0 29W / 250W | 0MiB / 16384MiB | 0% Default |
| | | N/A |
+-----------------------------------------+----------------------+----------------------+
| 1 Tesla P100-PCIE-16GB Off | 00000000:87:00.0 Off | 0 |
| N/A 27C P0 28W / 250W | 0MiB / 16384MiB | 0% Default |
| | | N/A |
+-----------------------------------------+----------------------+----------------------+
| 2 Tesla P100-PCIE-16GB Off | 00000000:C1:00.0 Off | 0 |
| N/A 26C P0 30W / 250W | 0MiB / 16384MiB | 0% Default |
| | | N/A |
+-----------------------------------------+----------------------+----------------------+
| 3 Tesla P100-PCIE-16GB Off | 00000000:C4:00.0 Off | 0 |
| N/A 26C P0 29W / 250W | 0MiB / 16384MiB | 0% Default |
| | | N/A |
+-----------------------------------------+----------------------+----------------------+
+---------------------------------------------------------------------------------------+
| Processes:                                                                             |
|  GPU   GI   CI        PID   Type   Process name                            GPU Memory |
|        ID   ID                                                             Usage      |
|=======================================================================================|
|  No running processes found                                                           |
+---------------------------------------------------------------------------------------+
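As a side note, the driver-vs-toolkit compatibility question above can be checked mechanically: the "CUDA Version" in nvidia-smi's header is the newest CUDA toolkit the installed driver supports. A minimal sketch (the regex and helper names are my own, not from the thread):

```python
import re

def max_cuda_version(nvidia_smi_header: str) -> tuple[int, int]:
    """Extract the maximum CUDA version the driver supports from nvidia-smi's header line."""
    m = re.search(r"CUDA Version:\s*(\d+)\.(\d+)", nvidia_smi_header)
    if m is None:
        raise ValueError("no CUDA version found in nvidia-smi output")
    return int(m.group(1)), int(m.group(2))

def driver_supports(required: tuple[int, int], reported: tuple[int, int]) -> bool:
    """A driver whose reported CUDA version is >= the toolkit's version can run that toolkit."""
    return reported >= required

header = "| NVIDIA-SMI 545.23.06  Driver Version: 545.23.06  CUDA Version: 12.3 |"
print(max_cuda_version(header))                             # (12, 3)
print(driver_supports((11, 7), max_cuda_version(header)))   # True: driver is new enough
```

Against the output above, driver 545.23.06 reports CUDA 12.3, so an 11.7 toolkit is well within range and the driver is not the problem.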
Unfortunately, flash attention v2 does not support the P100 (nor the V100). You may need to uninstall the related packages in the image (pip uninstall flash_attn dropout_layer_norm) or build the image from scratch with the environment variable BUNDLE_FLASH_ATTENTION set to false.
Thank you.
是否已有关于该错误的issue或讨论? | Is there an existing issue / discussion for this?
该问题是否在FAQ中有解答? | Is there an existing answer for this in FAQ?
当前行为 | Current Behavior
Running bash finetune/finetune_qlora_single_gpu.sh -m /data/shared/test_docker/Qwen-7B-Chat-Int4/ -d /data/shared/test_docker/chat.json inside the container fails with CUDA Error: invalid device function /tmp/pip-req-build-5rlg4jgm/ln_fwd_kernels.cuh 236. The CUDA version is 11.7; switching to CUDA 11.8, 12.1, or 12.4 produces the same error. The file /tmp/pip-req-build-5rlg4jgm/ln_fwd_kernels.cuh cannot be found in the container, so I don't know the exact cause. Can anyone advise?
期望行为 | Expected Behavior
What is the actual cause? I tried multiple CUDA versions and all of them show the same error. Any guidance would be appreciated.
复现方法 | Steps To Reproduce
1. Start the container: docker run -itd -v /***://data/shared/test_docker --name test_qwen --gpus all --shm-size 12G qwenllm/qwen /bin/bash
2. Enter the container: docker exec -it test_qwen04 /bin/bash
3. Run: bash finetune/finetune_qlora_single_gpu.sh -m /data/shared/test_docker/Qwen-7B-Chat-Int4/ -d /data/shared/test_docker/chat.json
运行环境 | Environment
备注 | Anything else?
No response