Ahmed-AmineHomman / dibujito

Simple web app enhancing your prompts for genAI image models
MIT License

Use RWKV local model for LLM optimization #9

Open Ahmed-AmineHomman opened 2 months ago

Ahmed-AmineHomman commented 2 months ago

RWKV.cpp makes it possible to run RWKV models, which are LLMs based on an RNN architecture. With this architecture, memory use scales linearly with the input size and, above all, operations are performed sequentially, which makes the models very CPU-friendly. The initial problem with RNN-based LLMs was that they could not match their transformer-based equivalents. The RWKV models (and the team behind them) managed to solve this: the latest version of the RWKV family performs very well and is on par with similar-sized transformer-based LLMs.
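To make the "sequential" point concrete, here is a toy sketch of how a recurrent model consumes its input: one token at a time, folding each token into a state of fixed size. This is not the actual RWKV update rule and does not use RWKV.cpp; the names `step` and `encode`, the `decay` parameter and the bit-based token features are all made up for illustration.

```python
def step(state, token_id, decay=0.9):
    """Toy recurrent update (NOT the RWKV formula): blend one token into the state."""
    # Hypothetical token features: bit i of the token id feeds state slot i.
    return [
        decay * s + (1.0 - decay) * ((token_id >> i) & 1)
        for i, s in enumerate(state)
    ]


def encode(token_ids, state_size=8):
    """Process tokens strictly one after another, carrying a fixed-size state."""
    state = [0.0] * state_size
    for token_id in token_ids:
        state = step(state, token_id)
    return state


short_state = encode(range(10))
long_state = encode(range(1000))
# Both states have the same size; only the number of steps differs.
print(len(short_state), len(long_state))
```

Each step only needs the previous state and the current token, so the work per token stays the same however long the input gets, which is what makes this kind of model a reasonable fit for a CPU.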

The idea of this issue is therefore to implement a wrapper around RWKV.cpp so that the LLM prompt optimization can be carried out by RWKV models. These should run on the CPU and in system RAM, leaving all the VRAM for the diffusion model. This brings two benefits: