Farama-Foundation / D4RL

A collection of reference environments for offline reinforcement learning
Apache License 2.0

Comparison between algorithms #111

Open IanWangg opened 3 years ago

IanWangg commented 3 years ago

Hi, I am wondering whether there is a limit on training steps when comparing offline RL algorithms on the D4RL datasets. Many papers mention that they trained for 1e6 gradient steps. Is using 1e6 steps required for a legitimate comparison, or can I let the agent train for more steps than that?
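For context, the convention the question refers to is a fixed gradient-step budget: most D4RL papers train for 1e6 gradient steps with periodic evaluation rollouts, then report the final or best normalized score. Below is a minimal sketch of that protocol; the agent update is a placeholder, and all function names, the evaluation interval, and the batch size are illustrative assumptions, not part of D4RL itself.

```python
import numpy as np

def train_offline(dataset_size, max_gradient_steps=1_000_000,
                  eval_interval=5_000, batch_size=256, seed=0):
    """Sketch of the common D4RL protocol: a fixed budget of gradient
    steps (1e6 in most papers) with periodic evaluation.

    A real implementation would update an offline RL agent (e.g. CQL,
    IQL, TD3+BC) on each sampled minibatch; here the update is elided.
    """
    rng = np.random.default_rng(seed)
    eval_points = []
    for step in range(1, max_gradient_steps + 1):
        # Sample a minibatch of transition indices from the fixed dataset.
        batch_idx = rng.integers(0, dataset_size, size=batch_size)
        # agent.update(dataset[batch_idx])  # one gradient step (placeholder)
        if step % eval_interval == 0:
            # Run evaluation rollouts in the environment here and log
            # the normalized score at this gradient-step count.
            eval_points.append(step)
    return eval_points
```

Reporting scores at the same gradient-step budget is what makes results across papers comparable; training longer is possible, but then the numbers are no longer directly comparable to published 1e6-step results unless the step count is stated.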

gxywy commented 2 years ago

Same question here.