tinkoff-ai / CORL

High-quality single-file implementations of SOTA Offline and Offline-to-Online RL algorithms: AWAC, BC, CQL, DT, EDAC, IQL, SAC-N, TD3+BC, LB-SAC, SPOT, Cal-QL, ReBRAC
https://arxiv.org/abs/2210.07105
Apache License 2.0
1.08k stars, 131 forks

IQL minor changes #42

Closed (DT6A closed 1 year ago)

DT6A commented 1 year ago

Fixing IQL.

One of the problems was pointed out here: https://github.com/tinkoff-ai/CORL/pull/41

Another problem is that only one of the critics is used during updates, while both must be used.

Reruns on all datasets are needed.

DT6A commented 1 year ago

@odelalleau thanks for your comments.

Yes, you are right: I missed that TwinQ returns the minimum of the two Qs, so these changes can be reverted.
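For readers following along, here is a minimal sketch of the twin-critic pattern in question. It is written in plain Python instead of PyTorch so it runs without dependencies; the names `make_linear_critic`, `TwinQSketch`, and `both` are illustrative assumptions, not the repository's actual code.

```python
from typing import Callable, Sequence, Tuple

# A critic maps (state, action) to a scalar Q-value.
Critic = Callable[[Sequence[float], Sequence[float]], float]


def make_linear_critic(weights: Sequence[float], bias: float) -> Critic:
    """Hypothetical stand-in for a Q-network: a linear map over [state, action]."""
    def critic(state: Sequence[float], action: Sequence[float]) -> float:
        features = list(state) + list(action)
        return sum(w * x for w, x in zip(weights, features)) + bias
    return critic


class TwinQSketch:
    """Illustrative twin critic: calling it returns the minimum of both Q-values."""

    def __init__(self, q1: Critic, q2: Critic) -> None:
        self.q1 = q1
        self.q2 = q2

    def both(self, state: Sequence[float], action: Sequence[float]) -> Tuple[float, float]:
        # Both estimates, e.g. for a loss that trains each critic separately.
        return self.q1(state, action), self.q2(state, action)

    def __call__(self, state: Sequence[float], action: Sequence[float]) -> float:
        # The pessimistic estimate: min over the two critics.
        return min(self.both(state, action))


twin_q = TwinQSketch(
    make_linear_critic([1.0, 2.0], bias=0.0),
    make_linear_critic([0.5, 0.5], bias=1.0),
)
q1_value, q2_value = twin_q.both([1.0], [2.0])
q_min = twin_q([1.0], [2.0])
```

With this layout, a single call to the twin critic already involves both critics, which is why code that appears to query only one Q-function can still be using two.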

As for LOG_STD_MIN, I checked the https://github.com/typoverflow/OfflineRL-Lib implementation and took the value from there; as far as I know, -20 is the more common choice.
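A minimal sketch of what this bound does, assuming the usual pattern of clamping a Gaussian policy's log standard deviation before exponentiating. `LOG_STD_MAX = 2.0` and both helper names are assumptions for illustration; only the -20 lower bound comes from this thread.

```python
import math

LOG_STD_MIN = -20.0  # lower bound discussed in this thread
LOG_STD_MAX = 2.0    # assumption: upper bound is not stated in the thread


def clamp_log_std(log_std: float) -> float:
    """Clip the policy's log standard deviation into [LOG_STD_MIN, LOG_STD_MAX]."""
    return max(LOG_STD_MIN, min(LOG_STD_MAX, log_std))


def std_from_log_std(log_std: float) -> float:
    """Standard deviation after clamping; never below exp(LOG_STD_MIN)."""
    return math.exp(clamp_log_std(log_std))


smallest_std = std_from_log_std(-100.0)  # clamped to exp(-20)
unit_std = std_from_log_std(0.0)         # inside the range, left unchanged
largest_std = std_from_log_std(5.0)      # clamped to exp(2)
```

A lower bound of -20 lets the policy become almost deterministic, since exp(-20) is on the order of 1e-9, whereas a less negative bound keeps more of the action noise.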

set_to_none=True is removed to avoid confusion, as it is rarely used.

vkurenkov commented 1 year ago

@DT6A Here is the finished sweep for the fixed IQL. Can you please update the corresponding report? And then I believe we are good to merge.

DT6A commented 1 year ago

@vkurenkov report is updated

vkurenkov commented 1 year ago

@DT6A we also need to update the readme. Ok, separate PR for the readme.