jbuckman / tiopifdpo

GNU General Public License v3.0

Deep learning dataset as benchmark #1

Open j0rd1smit opened 3 years ago

j0rd1smit commented 3 years ago

I was wondering if you still have the datasets you used to create the deep learning graphs in your paper. I think they could make a very interesting benchmark: the deep offline RL space is currently missing a good discrete benchmark, and your experiments could be a great one.

I'm asking because I'm trying to replicate your experiment with an uncertainty-based pessimistic version of DQN. However, testing and benchmarking the algorithm remains a challenge due to the lack of a good offline RL benchmark. If you don't have the datasets, I will try to recreate them myself; for comparison purposes, though, it would be better to use the same ones.

P.S. I loved your paper.

jbuckman commented 3 years ago

Thanks for the interest & positive feedback!

I do not have the specific datasets that I used, and even if I did, I would not share them ;) The most important source of variance in the outcomes is the specific samples drawn from the environment. So, reporting "different" seeds while using the same dataset for each of them is not a good idea. (It's fine to use the same dataset for each of the different algorithms being compared, though.) In my paper, I reported results for 3 seeds, and those actually corresponded to 3 different datasets per experiment.
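The seeding scheme described above can be sketched roughly as follows. This is a minimal illustration of the principle, not code from the repo: `collect_dataset` is a hypothetical stand-in for the environment-sampling step, using a stdlib RNG in place of a real environment.

```python
import random

def collect_dataset(seed, n_transitions=1000):
    """Hypothetical stand-in for data collection: each seed draws
    its own samples, so every seed yields a distinct dataset."""
    rng = random.Random(seed)
    return [rng.gauss(0.0, 1.0) for _ in range(n_transitions)]

# One dataset per reported seed: the dataset-sampling variance is
# then reflected in the spread of results across seeds.
datasets = {seed: collect_dataset(seed) for seed in (0, 1, 2)}

# Reusing one dataset across "seeds" would hide exactly this variance;
# reusing the same datasets across *algorithms* being compared is fine.
```

The point is simply that each seed governs the data collection itself, not just the training run, so the per-seed spread includes the dominant source of variance.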

That said, all the code you need to generate your own datasets is available in this repo; the README gives the steps for doing so.