Open: TodayAI opened this issue 1 week ago
Hi @TodayAI, thanks for your suggestions and for sharing.
RWKV is better described as a "linear attention" variant of the Transformer than as an LSTM, although it also uses gating. I tried an early version of RWKV, and its performance was not good in practice; in particular, it was very sensitive to prompts. The newer versions may have improved on this, but I haven't had a chance to try them out.
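For context, the "linear attention" idea behind RWKV can be summarized by a WKV-style recurrence: instead of attending over all previous tokens (quadratic cost in sequence length), each channel keeps a small running state that decays over time. The sketch below is a simplified scalar version in Python, purely illustrative; the function and parameter names (`wkv_recurrent`, `w`, `u`) are hypothetical and this is not RWKV's actual implementation:

```python
import math

def wkv_recurrent(w, u, ks, vs):
    """Simplified scalar WKV-style recurrence (illustrative only).

    w:  time-decay rate (> 0) applied to older state
    u:  extra "bonus" weight given to the current token
    ks: per-timestep keys (floats)
    vs: per-timestep values (floats)

    The state (num, den) is O(1) per step, so processing the whole
    sequence costs O(T) rather than the O(T^2) of full attention.
    """
    num = den = 0.0
    outs = []
    for k, v in zip(ks, vs):
        # Output mixes decayed past state with the current token's bonus term.
        out = (num + math.exp(u + k) * v) / (den + math.exp(u + k))
        outs.append(out)
        # Decay the old state, then absorb the current token into it.
        num = math.exp(-w) * num + math.exp(k) * v
        den = math.exp(-w) * den + math.exp(k)
    return outs
```

At the first step the state is empty, so the output equals the first value; later outputs are decay-weighted averages of past values, which is what makes the per-step cost constant.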
Based on my knowledge, implementing RWKV would be easier starting from the existing Transformer code than from the existing LSTM code. For now, I don't have plans to do so, but anyone who would like to contribute it is always welcome.
Thanks,
Zhongkai Fu
Is your feature request related to a problem? Please describe. Your Seq2SeqSharp project already supports LSTMs. Please consider implementing the RWKV large language model "linear attention" idea in your C# solution. RWKV's linear attention model performs very well at inference time. See: https://www.rwkv.com/
Describe the solution you'd like Maybe it only needs a few functions implemented on top of Seq2SeqSharp's LSTM functionality, such as "token shift" or "time decay". Or perhaps you have another idea for improving LSTM performance in Seq2SeqSharp. I would like to integrate your solution into the Godot Game Engine for training, fine-tuning, and inference in pure C# code.
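For reference, the two operations named above are quite small: "token shift" linearly interpolates each timestep's input with the previous timestep's input, and "time decay" exponentially down-weights older state. A minimal Python sketch of token shift follows; the function name and the scalar mixing coefficient `mu` are illustrative assumptions, not part of any Seq2SeqSharp or RWKV API:

```python
def token_shift(xs, mu):
    """Token shift (illustrative sketch): blend each timestep's input
    with the previous timestep's input. The first step blends with
    zero, since there is no earlier token.

    xs: per-timestep inputs (floats)
    mu: mixing coefficient in [0, 1]; mu = 1 keeps only the current input
    """
    prev = 0.0
    out = []
    for x in xs:
        # Weighted mix of the current input and the previous one.
        out.append(mu * x + (1.0 - mu) * prev)
        prev = x
    return out
```

In RWKV this mixing is done per channel with learned coefficients, which gives each layer cheap access to the previous token without any attention computation.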
Describe alternatives you've considered Use https://github.com/imxcstar/CSharp-RWKV for inference only.