Fizzmy opened this issue 2 years ago (status: Open)
What is your GPU device model and what is your CUDA version? If you compile on your own device, you can try modifying this line (https://github.com/bytedance/lightseq/blob/master/CMakeLists.txt#L88) to "set(CMAKE_CUDA_ARCHITECTURES 70 75 80 86)".
Because compute capabilities 60/61 in CMAKE_CUDA_ARCHITECTURES don't support atomicAdd(__half*, float).
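For context, a native half-precision atomicAdd overload only exists on compute capability 7.0 and above, which is why building for sm_60/sm_61 fails with this overload error. The sketch below is not LightSeq's code; the helper name atomic_add_half and the CAS-based Pascal fallback are illustrative assumptions only.

```cuda
#include <cuda_fp16.h>

// Illustrative helper (hypothetical name, not from LightSeq): atomically add a
// float into a __half accumulator.
__device__ void atomic_add_half(__half *addr, float val) {
#if __CUDA_ARCH__ >= 700
  // Volta and newer expose atomicAdd(__half*, __half) natively.
  atomicAdd(addr, __float2half(val));
#else
  // Assumed fallback for Pascal (sm_60/61): emulate with a 32-bit CAS on the
  // aligned word containing the target __half. LightSeq does not ship this;
  // it instead expects sm_70+ for FP16 builds (or FP16_MODE=OFF), as above.
  size_t p = reinterpret_cast<size_t>(addr);
  unsigned int *base = reinterpret_cast<unsigned int *>(p & ~size_t(2));
  bool upper = (p & 2) != 0;  // does the __half occupy the upper 16 bits?
  unsigned int old = *base, assumed;
  do {
    assumed = old;
    unsigned short h = upper ? (unsigned short)(assumed >> 16)
                             : (unsigned short)(assumed & 0xffffu);
    float sum = __half2float(__ushort_as_half(h)) + val;
    unsigned short hs = __half_as_ushort(__float2half(sum));
    unsigned int updated = upper
        ? ((assumed & 0x0000ffffu) | ((unsigned int)hs << 16))
        : ((assumed & 0xffff0000u) | hs);
    old = atomicCAS(base, assumed, updated);
  } while (assumed != old);
#endif
}
```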
My GPU is a Tesla P100 and my CUDA version is 11.1. I found that my CUDA_ARCH is 60. How can I compile it without FP16?
https://github.com/bytedance/lightseq/blob/master/CMakeLists.txt#L7
-DFP16_MODE=OFF
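Roughly why -DFP16_MODE=OFF helps: assuming the CMake option is forwarded to the compiler as an FP16_MODE define (the exact wiring in LightSeq's CMakeLists may differ), the half-precision kernel instantiations that need atomicAdd(__half*, ...) are simply never compiled when it is OFF. A minimal sketch of that pattern, with hypothetical kernel and launcher names:

```cuda
#include <cuda_fp16.h>
#include <cuda_runtime.h>

// Hypothetical names, for illustration only.
template <typename T>
__global__ void scale_kernel(T *data, float factor, int n) {
  int i = blockIdx.x * blockDim.x + threadIdx.x;
  if (i < n) data[i] = (T)((float)data[i] * factor);
}

void launch_scale(void *data, float factor, int n, bool use_fp16,
                  cudaStream_t stream) {
  int threads = 256, blocks = (n + threads - 1) / threads;
#ifdef FP16_MODE
  // Only compiled when the build defines FP16_MODE; on sm_60/61 this is the
  // kind of path that pulls in unsupported __half atomics.
  if (use_fp16) {
    scale_kernel<__half><<<blocks, threads, 0, stream>>>(
        static_cast<__half *>(data), factor, n);
    return;
  }
#endif
  scale_kernel<float><<<blocks, threads, 0, stream>>>(
      static_cast<float *>(data), factor, n);
}
```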
Thank you, I compiled successfully. What I did was:
1. Modify https://github.com/bytedance/lightseq/blob/master/CMakeLists.txt#L88 to "set(CMAKE_CUDA_ARCHITECTURES 60 61 70 75)".
2. Modify /usr/local/cuda/include/thrust/system/cuda/config.h, changing #ifndef THRUST_IGNORE_CUB_VERSION_CHECK to #ifdef THRUST_IGNORE_CUB_VERSION_CHECK.
3. Compile the code with cmake -DCMAKE_BUILD_TYPE=Release -DFP16_MODE=OFF -DDYNAMIC_API=ON .. && make -j
If I want to run the Python test code (such as test_ls_layers_new.py), what should I do after building all targets?
If you want to run the test code, you can directly run python3 test/xxx.py.
Then I met the first error 😂
I tried to build from the source code following the document docs/inference/build.md, and I realized that this document is out of date (because only CUDA 11 supports the header #include <cooperative_groups/reduce.h>). Then I tried to run LightSeq 3.0 with CUDA 11, but I encountered another error: no instance of overloaded function "atomicAdd" matches the argument list; argument types are: (__half *, float). So I wonder whether this error is related to CUDA_ARCH. How can I run this code on a less powerful GPU?
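For reference on the CUDA 11 requirement mentioned above: <cooperative_groups/reduce.h> provides cg::reduce, which only ships with the CUDA 11 toolkit. The block below is not a LightSeq kernel, just a minimal standalone example of that feature (the kernel name warp_sum is made up):

```cuda
#include <cooperative_groups.h>
#include <cooperative_groups/reduce.h>  // header only available in CUDA 11+

namespace cg = cooperative_groups;

// Minimal example kernel (not from LightSeq): sum an array with cg::reduce
// over a 32-thread tile, then accumulate per-warp results atomically.
__global__ void warp_sum(const float *in, float *out, int n) {
  cg::thread_block block = cg::this_thread_block();
  cg::thread_block_tile<32> warp = cg::tiled_partition<32>(block);
  int i = blockIdx.x * blockDim.x + threadIdx.x;
  float v = (i < n) ? in[i] : 0.0f;
  float sum = cg::reduce(warp, v, cg::plus<float>());  // CUDA 11 feature
  if (warp.thread_rank() == 0) atomicAdd(out, sum);
}
```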