RWKV is an RNN with transformer-level LLM performance. It can be directly trained like a GPT (parallelizable), so it combines the best of RNNs and transformers: great performance, fast inference, VRAM savings, fast training, "infinite" ctx_len, and free sentence embeddings.
```cuda
#include <stdio.h>
#include <assert.h>
#include "ATen/ATen.h"

#define MIN_VALUE (-1e38)

typedef at::BFloat16 bf16;

__global__ void kernel_forward(const int B, const int T, const int C,
                               const float *__restrict__ const _w,
                               const bf16 *__restrict__ const _u,
                               const bf16 *__restrict__ const _k,
                               const bf16 *__restrict__ const _v,
                               bf16 *__restrict__ const _y) {
    const int idx = blockIdx.x * blockDim.x + threadIdx.x;
    const int _b = idx / C;
    const int _c = idx % C;
    const int _offset = _b * T * C + _c;
    // ... (rest of the kernel omitted in the original post)
}
```
IntelliSense reports the following diagnostic on the `__global__` line:

```json
[{
    "resource": "/root/RWKV-LM-main/RWKV-v4neo/cuda/wkv_cuda_bf16.cu",
    "owner": "C/C++: IntelliSense",
    "code": "77",
    "severity": 8,
    "message": "this declaration has no storage class or type specifier",
    "source": "C/C++",
    "startLineNumber": 7,
    "startColumn": 1,
    "endLineNumber": 7,
    "endColumn": 11
}]
```

Previously, all of the `#include` lines also reported errors; I fixed those by adding the include paths to my configuration.
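Error 77 on `__global__` is usually just the C/C++ IntelliSense parser not recognizing the CUDA keywords; `nvcc` still compiles the file fine, so it is an editor-only diagnostic. A common workaround (a sketch, assuming VS Code's C/C++ extension and a default CUDA install location, both of which may differ on your machine) is to extend the same `c_cpp_properties.json` that fixed the `#include` paths so it also sees the CUDA headers:

```json
{
    "configurations": [
        {
            "name": "Linux",
            "includePath": [
                "${workspaceFolder}/**",
                "/usr/local/cuda/include"
            ],
            "defines": ["__CUDACC__"],
            "intelliSenseMode": "linux-gcc-x64"
        }
    ],
    "version": 4
}
```

Defining `__CUDACC__` here is a heuristic that makes the CUDA headers expose the kernel qualifiers to the parser, not a full CUDA-aware parse; some people instead silence the remaining squiggles or use a CUDA-aware editor plugin.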