RWKV is an RNN with transformer-level LLM performance. It can be directly trained like a GPT (parallelizable). So it's combining the best of RNN and transformer - great performance, fast inference, saves VRAM, fast training, "infinite" ctx_len, and free sentence embedding.
mitting ninja build file /home/hope/.cache/torch_extensions/py310_cu117/wkv_1024/build.ninja...
Building extension module wkv_1024...
Allowing ninja to set a default number of workers... (overridable by setting the environment variable MAX_JOBS=N)
[1/2] /usr/bin/nvcc -DTORCH_EXTENSION_NAME=wkv_1024 -DTORCH_API_INCLUDE_EXTENSION_H -DPYBIND11_COMPILER_TYPE=\"_gcc\" -DPYBIND11_STDLIB=\"_libstdcpp\" -DPYBIND11_BUILD_ABI=\"_cxxabi1011\" -isystem /home/hope/miniconda3/envs/rwkv/lib/python3.10/site-packages/torch/include -isystem /home/hope/miniconda3/envs/rwkv/lib/python3.10/site-packages/torch/include/torch/csrc/api/include -isystem /home/hope/miniconda3/envs/rwkv/lib/python3.10/site-packages/torch/include/TH -isystem /home/hope/miniconda3/envs/rwkv/lib/python3.10/site-packages/torch/include/THC -isystem /home/hope/miniconda3/envs/rwkv/include/python3.10 -D_GLIBCXX_USE_CXX11_ABI=0 -DCUDA_NO_HALF_OPERATORS -DCUDA_NO_HALF_CONVERSIONS -DCUDA_NO_BFLOAT16_CONVERSIONS -DCUDA_NO_HALF2_OPERATORS --expt-relaxed-constexpr -gencode=arch=compute_86,code=compute_86 -gencode=arch=compute_86,code=sm_86 --compiler-options '-fPIC' -res-usage --maxrregcount 60 --use_fast_math -O3 -Xptxas -O3 --extra-device-vectorization -DTmax=1024 -std=c++14 -c /home/hope/work/RWKV-LM/RWKV-v4neo/cuda/wkv_cuda.cu -o wkv_cuda.cuda.o
FAILED: wkv_cuda.cuda.o
/usr/bin/nvcc -DTORCH_EXTENSION_NAME=wkv_1024 -DTORCH_API_INCLUDE_EXTENSION_H -DPYBIND11_COMPILER_TYPE=\"_gcc\" -DPYBIND11_STDLIB=\"_libstdcpp\" -DPYBIND11_BUILD_ABI=\"_cxxabi1011\" -isystem /home/hope/miniconda3/envs/rwkv/lib/python3.10/site-packages/torch/include -isystem /home/hope/miniconda3/envs/rwkv/lib/python3.10/site-packages/torch/include/torch/csrc/api/include -isystem /home/hope/miniconda3/envs/rwkv/lib/python3.10/site-packages/torch/include/TH -isystem /home/hope/miniconda3/envs/rwkv/lib/python3.10/site-packages/torch/include/THC -isystem /home/hope/miniconda3/envs/rwkv/include/python3.10 -D_GLIBCXX_USE_CXX11_ABI=0 -DCUDA_NO_HALF_OPERATORS -DCUDA_NO_HALF_CONVERSIONS -DCUDA_NO_BFLOAT16_CONVERSIONS -DCUDA_NO_HALF2_OPERATORS --expt-relaxed-constexpr -gencode=arch=compute_86,code=compute_86 -gencode=arch=compute_86,code=sm_86 --compiler-options '-fPIC' -res-usage --maxrregcount 60 --use_fast_math -O3 -Xptxas -O3 --extra-device-vectorization -DTmax=1024 -std=c++14 -c /home/hope/work/RWKV-LM/RWKV-v4neo/cuda/wkv_cuda.cu -o wkv_cuda.cuda.o
In file included from /usr/include/cuda_runtime.h:83,
from :
/usr/include/crt/host_config.h:138:2: error: #error -- unsupported GNU version! gcc versions later than 8 are not supported!
138 | #error -- unsupported GNU version! gcc versions later than 8 are not supported!
| ^~~~~
ninja: build stopped: subcommand failed.
Traceback (most recent call last):
File "/home/hope/miniconda3/envs/rwkv/lib/python3.10/site-packages/torch/utils/cpp_extension.py", line 1900, in _run_ninja_build
subprocess.run(
File "/home/hope/miniconda3/envs/rwkv/lib/python3.10/subprocess.py", line 526, in run
raise CalledProcessError(retcode, process.args,
subprocess.CalledProcessError: Command '['ninja', '-v']' returned non-zero exit status 1.
The above exception was the direct cause of the following exception:
Traceback (most recent call last):
File "/home/hope/work/RWKV-LM/RWKV-v4neo/train.py", line 307, in
from src.model import RWKV
File "/home/hope/work/RWKV-LM/RWKV-v4neo/src/model.py", line 80, in
wkvcuda = load(name=f"wkv{T_MAX}", sources=["cuda/wkv_op.cpp", "cuda/wkv_cuda.cu"], verbose=True, extra_cuda_cflags=["-res-usage", "--maxrregcount 60", "--use_fast_math", "-O3", "-Xptxas -O3", "--extra-device-vectorization", f"-DTmax={T_MAX}"])
File "/home/hope/miniconda3/envs/rwkv/lib/python3.10/site-packages/torch/utils/cpp_extension.py", line 1284, in load
return _jit_compile(
File "/home/hope/miniconda3/envs/rwkv/lib/python3.10/site-packages/torch/utils/cpp_extension.py", line 1508, in _jit_compile
_write_ninja_file_and_build_library(
File "/home/hope/miniconda3/envs/rwkv/lib/python3.10/site-packages/torch/utils/cpp_extension.py", line 1623, in _write_ninja_file_and_build_library
_run_ninja_build(
File "/home/hope/miniconda3/envs/rwkv/lib/python3.10/site-packages/torch/utils/cpp_extension.py", line 1916, in _run_ninja_build
raise RuntimeError(message) from e
RuntimeError: Error building extension 'wkv_1024'
mitting ninja build file /home/hope/.cache/torch_extensions/py310_cu117/wkv_1024/build.ninja... Building extension module wkv_1024... Allowing ninja to set a default number of workers... (overridable by setting the environment variable MAX_JOBS=N) [1/2] /usr/bin/nvcc -DTORCH_EXTENSION_NAME=wkv_1024 -DTORCH_API_INCLUDE_EXTENSION_H -DPYBIND11_COMPILER_TYPE=\"_gcc\" -DPYBIND11_STDLIB=\"_libstdcpp\" -DPYBIND11_BUILD_ABI=\"_cxxabi1011\" -isystem /home/hope/miniconda3/envs/rwkv/lib/python3.10/site-packages/torch/include -isystem /home/hope/miniconda3/envs/rwkv/lib/python3.10/site-packages/torch/include/torch/csrc/api/include -isystem /home/hope/miniconda3/envs/rwkv/lib/python3.10/site-packages/torch/include/TH -isystem /home/hope/miniconda3/envs/rwkv/lib/python3.10/site-packages/torch/include/THC -isystem /home/hope/miniconda3/envs/rwkv/include/python3.10 -D_GLIBCXX_USE_CXX11_ABI=0 -DCUDA_NO_HALF_OPERATORS -DCUDA_NO_HALF_CONVERSIONS -DCUDA_NO_BFLOAT16_CONVERSIONS -DCUDA_NO_HALF2_OPERATORS --expt-relaxed-constexpr -gencode=arch=compute_86,code=compute_86 -gencode=arch=compute_86,code=sm_86 --compiler-options '-fPIC' -res-usage --maxrregcount 60 --use_fast_math -O3 -Xptxas -O3 --extra-device-vectorization -DTmax=1024 -std=c++14 -c /home/hope/work/RWKV-LM/RWKV-v4neo/cuda/wkv_cuda.cu -o wkv_cuda.cuda.o FAILED: wkv_cuda.cuda.o /usr/bin/nvcc -DTORCH_EXTENSION_NAME=wkv_1024 -DTORCH_API_INCLUDE_EXTENSION_H -DPYBIND11_COMPILER_TYPE=\"_gcc\" -DPYBIND11_STDLIB=\"_libstdcpp\" -DPYBIND11_BUILD_ABI=\"_cxxabi1011\" -isystem /home/hope/miniconda3/envs/rwkv/lib/python3.10/site-packages/torch/include -isystem /home/hope/miniconda3/envs/rwkv/lib/python3.10/site-packages/torch/include/torch/csrc/api/include -isystem /home/hope/miniconda3/envs/rwkv/lib/python3.10/site-packages/torch/include/TH -isystem /home/hope/miniconda3/envs/rwkv/lib/python3.10/site-packages/torch/include/THC -isystem /home/hope/miniconda3/envs/rwkv/include/python3.10 -D_GLIBCXX_USE_CXX11_ABI=0 -DCUDA_NO_HALF_OPERATORS -DCUDA_NO_HALF_CONVERSIONS -DCUDA_NO_BFLOAT16_CONVERSIONS -DCUDA_NO_HALF2_OPERATORS --expt-relaxed-constexpr -gencode=arch=compute_86,code=compute_86 -gencode=arch=compute_86,code=sm_86 --compiler-options '-fPIC' -res-usage --maxrregcount 60 --use_fast_math -O3 -Xptxas -O3 --extra-device-vectorization -DTmax=1024 -std=c++14 -c /home/hope/work/RWKV-LM/RWKV-v4neo/cuda/wkv_cuda.cu -o wkv_cuda.cuda.o In file included from /usr/include/cuda_runtime.h:83, from:
/usr/include/crt/host_config.h:138:2: error: #error -- unsupported GNU version! gcc versions later than 8 are not supported!
138 | #error -- unsupported GNU version! gcc versions later than 8 are not supported!
| ^~~~~
ninja: build stopped: subcommand failed.
Traceback (most recent call last):
File "/home/hope/miniconda3/envs/rwkv/lib/python3.10/site-packages/torch/utils/cpp_extension.py", line 1900, in _run_ninja_build
subprocess.run(
File "/home/hope/miniconda3/envs/rwkv/lib/python3.10/subprocess.py", line 526, in run
raise CalledProcessError(retcode, process.args,
subprocess.CalledProcessError: Command '['ninja', '-v']' returned non-zero exit status 1.
The above exception was the direct cause of the following exception:
Traceback (most recent call last): File "/home/hope/work/RWKV-LM/RWKV-v4neo/train.py", line 307, in
from src.model import RWKV
File "/home/hope/work/RWKV-LM/RWKV-v4neo/src/model.py", line 80, in
wkvcuda = load(name=f"wkv{T_MAX}", sources=["cuda/wkv_op.cpp", "cuda/wkv_cuda.cu"], verbose=True, extra_cuda_cflags=["-res-usage", "--maxrregcount 60", "--use_fast_math", "-O3", "-Xptxas -O3", "--extra-device-vectorization", f"-DTmax={T_MAX}"])
File "/home/hope/miniconda3/envs/rwkv/lib/python3.10/site-packages/torch/utils/cpp_extension.py", line 1284, in load
return _jit_compile(
File "/home/hope/miniconda3/envs/rwkv/lib/python3.10/site-packages/torch/utils/cpp_extension.py", line 1508, in _jit_compile
_write_ninja_file_and_build_library(
File "/home/hope/miniconda3/envs/rwkv/lib/python3.10/site-packages/torch/utils/cpp_extension.py", line 1623, in _write_ninja_file_and_build_library
_run_ninja_build(
File "/home/hope/miniconda3/envs/rwkv/lib/python3.10/site-packages/torch/utils/cpp_extension.py", line 1916, in _run_ninja_build
raise RuntimeError(message) from e
RuntimeError: Error building extension 'wkv_1024'