Stable-Baselines-Team / stable-baselines3-contrib

Contrib package for Stable-Baselines3 - Experimental reinforcement learning (RL) code
https://sb3-contrib.readthedocs.io
MIT License

[Feature Request] Implement CrossQ #238

Closed danielpalen closed 1 month ago

danielpalen commented 7 months ago

🚀 Feature

I would like to implement CrossQ (https://openreview.net/pdf?id=PczQtTsTIX) in SB3, as also suggested by @araffin (https://github.com/araffin/sbx/pull/36#issuecomment-2027392759).

Motivation

CrossQ is one of the current state-of-the-art deep reinforcement learning methods in terms of sample efficiency and is substantially more computationally efficient than the previous state of the art (e.g. DroQ or REDQ), as it uses a low update-to-data ratio of 1. It is the first successful application of batch normalization within deep reinforcement learning, which is at the heart of its efficiency. I think a PyTorch-based reference implementation in SB3 would be very valuable for the research community.

Pitch

As one of the first authors on the paper, I want to contribute a PyTorch-based reference implementation of CrossQ to SB3, since the paper's implementation is in JAX.

Alternatives

No response

Additional context

No response

Checklist