RWKV is an RNN with transformer-level LLM performance. It can be directly trained like a GPT (parallelizable). So it's combining the best of RNN and transformer - great performance, fast inference, saves VRAM, fast training, "infinite" ctx_len, and free sentence embedding.
Apache License 2.0
Are the experimental results on the homepage based on zero-shot learning? #52
Dear author,

I was wondering if I could ask you a question about your experiment results. Specifically, I'm curious to know whether your results were obtained under zero-shot, one-shot, or few-shot conditions. experiment results link
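For context, the distinction the question draws between zero-, one-, and few-shot evaluation comes down to how many solved examples are prepended to the prompt. A minimal, hypothetical sketch (the `build_prompt` helper below is illustrative only and is not part of the RWKV codebase):

```python
# Illustrative sketch of zero- vs. few-shot prompt construction.
# k = 0 examples -> zero-shot, k = 1 -> one-shot, k > 1 -> few-shot.

def build_prompt(question: str, examples: list[tuple[str, str]]) -> str:
    """Prepend solved (question, answer) examples before the real question."""
    parts = [f"Q: {q}\nA: {a}" for q, a in examples]
    parts.append(f"Q: {question}\nA:")
    return "\n\n".join(parts)

# Zero-shot: the model sees only the question itself.
zero_shot = build_prompt("What is the capital of France?", [])

# Few-shot: the model first sees demonstrations of the task format.
few_shot = build_prompt(
    "What is the capital of France?",
    [("What is the capital of Japan?", "Tokyo"),
     ("What is the capital of Italy?", "Rome")],
)
```

Knowing which of these regimes produced the homepage numbers matters because few-shot scores are usually higher than zero-shot scores on the same benchmark.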