CarperAI / DRLX

Diffusion Reinforcement Learning Library

Reward model inference #23

Open shahbuland opened 1 year ago

shahbuland commented 1 year ago

We need to add proper support for reward model inference when the RM is a sizable model. The current implementation places a copy of the RM on every GPU, which is problematic: in many cases the RM is too big to fit alongside the denoiser. In the LLM setting the usual solutions are to serve the RM through a Triton inference server, or to dedicate one GPU to the RM while the main model uses the remaining GPUs. This should be explored further; rough sketches of both options follow.
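A minimal sketch of the dedicated-GPU option in plain PyTorch, not DRLX's actual API. It assumes N visible GPUs, reserves the last one for the RM, and keeps the denoiser (e.g. under DDP) on the rest; `denoiser` and `reward_model` are placeholder `nn.Module`s:

```python
import torch

# Assumption: cuda:0 .. cuda:N-2 train the denoiser, cuda:N-1 is RM-only.
n_gpus = torch.cuda.device_count()
rm_device = torch.device(f"cuda:{n_gpus - 1}")   # last GPU reserved for the RM
train_devices = list(range(n_gpus - 1))          # remaining GPUs for the denoiser

reward_model = reward_model.to(rm_device).eval()

@torch.no_grad()
def score(images: torch.Tensor) -> torch.Tensor:
    """Ship sampled images to the RM GPU, score them, and return
    rewards on the caller's device so the training loop is unchanged."""
    src_device = images.device
    rewards = reward_model(images.to(rm_device))
    return rewards.to(src_device)
```

The cross-device copies add some latency per scoring call, but they free the training GPUs from holding RM weights entirely.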
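For the Triton route, the client side could look like the sketch below. It assumes the RM has already been exported and is being served at localhost:8000 under a hypothetical model name "reward_model" with an input "IMAGES" and an output "REWARDS"; none of these names come from DRLX:

```python
import numpy as np
import tritonclient.http as httpclient

client = httpclient.InferenceServerClient(url="localhost:8000")

def score(images: np.ndarray) -> np.ndarray:
    """Send a batch of images to the Triton server and return rewards."""
    inp = httpclient.InferInput("IMAGES", list(images.shape), "FP32")
    inp.set_data_from_numpy(images.astype(np.float32))
    out = httpclient.InferRequestedOutput("REWARDS")
    result = client.infer(model_name="reward_model", inputs=[inp], outputs=[out])
    return result.as_numpy("REWARDS")
```

This decouples RM memory from the training job completely and lets one RM instance serve multiple trainers, at the cost of running and maintaining a separate server.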