HYDesmondLiu closed this issue 2 years ago
Hi @HYDesmondLiu, thanks for reaching out. This is indeed an important point, and it could bias the comparison. (For the Adroit datasets we used v0, so I think the comparison is fine there.) We used the v2 MuJoCo datasets in the experiments because the D4RL authors told us there were bugs in the v0 datasets and suggested that we start with v2. That said, we will add v0 experimental results in the revision. Please give us some time to produce them. Thanks.
Hi @chinganc, thanks for noticing this. Could you share what the bugs in D4RL v0 are? That would be quite useful.
I don't know the specifics, only that there were (minor) bugs in v0 and v1. The D4RL GitHub page mentions bugs in hopper. I would suggest reaching out to the authors to learn more.
Hi @HYDesmondLiu, we have updated the paper. You can find the v0 results in Table 3 in the Appendix.
Thanks for sharing the code. I have one question. It seems you are using D4RL v2 (C.2), and in Table 1 you mention that "the baseline results are from the respective papers". However, some previous papers used D4RL v0. I believe the buffer quality varies between v0 and v2 (see the TD3BC paper), so the comparison might be biased.
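To make the version concern concrete, here is a minimal sketch of how one might flag baselines whose reported numbers come from a different dataset version. It relies only on the fact that D4RL dataset ids end in a version suffix such as `-v0` or `-v2`. The helper names and the baseline labels (`baseline_a`, `baseline_b`) are hypothetical and are not taken from the paper or the repository.

```python
import re
from typing import Dict, Optional

# D4RL dataset ids end in a version suffix, e.g. "hopper-medium-v2".
_VERSION_RE = re.compile(r"-v(\d+)$")


def dataset_version(dataset_id: str) -> Optional[int]:
    """Return the integer version suffix of a dataset id, or None if absent."""
    match = _VERSION_RE.search(dataset_id)
    return int(match.group(1)) if match else None


def mismatched_baselines(own_dataset: str, baselines: Dict[str, str]) -> Dict[str, str]:
    """Return the baselines whose dataset version differs from our own."""
    own_version = dataset_version(own_dataset)
    return {
        name: dataset
        for name, dataset in baselines.items()
        if dataset_version(dataset) != own_version
    }


# Hypothetical example: which dataset each baseline's reported numbers used.
baselines = {
    "baseline_a": "hopper-medium-v0",
    "baseline_b": "hopper-medium-v2",
}
flagged = mismatched_baselines("hopper-medium-v2", baselines)
print(flagged)  # {'baseline_a': 'hopper-medium-v0'}
```

Any baseline flagged this way would need to be rerun on the matching version, or reported separately, before a table comparison is meaningful.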