tinkoff-ai / CORL

High-quality single-file implementations of SOTA Offline and Offline-to-Online RL algorithms: AWAC, BC, CQL, DT, EDAC, IQL, SAC-N, TD3+BC, LB-SAC, SPOT, Cal-QL, ReBRAC
https://arxiv.org/abs/2210.07105
Apache License 2.0
1.08k stars, 131 forks

LB-SAC implementation #31

Closed: Howuhh closed this 1 year ago

Howuhh commented 1 year ago

Implementation of LB-SAC from the paper "Q-Ensemble for Offline RL: Don't Scale the Ensemble, Scale the Batch Size".

TODO:

vkurenkov commented 1 year ago

original implementation for reference: https://github.com/tinkoff-ai/lb-sac

Howuhh commented 1 year ago

wandb report: https://wandb.ai/tlab/CORL/reports/LB-SAC-D4RL-Results--VmlldzozNjIxMDY1