jannerm / trajectory-transformer

Code for the paper "Offline Reinforcement Learning as One Big Sequence Modeling Problem"
https://trajectory-transformer.github.io
MIT License

Some questions about hyperparameters in newer version #2

Closed FallCicada closed 2 years ago

FallCicada commented 2 years ago

Dear Author,

I am very interested in your great work and am trying to reproduce your experimental results.

Previously I almost matched the scores reported in the older version of your paper (approximately 70 on average over the 3 × 3 datasets). But I noticed that you updated the paper on arXiv in November, and the average score for TT (quantile) over the 3 × 3 datasets went up to 78.9.

I also noticed that you listed your beam search hyperparameters in Appendix E, where k_act is 20. The listed hyperparameters have some discrepancies with your config file (config/offline.py), where the default k_act is None and cdf_act is 0.6. I am wondering whether you changed the hyperparameters and obtained the higher score. If so, could you please update your config file so that I can also reproduce your results?
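For context, k_act and cdf_act correspond to two common ways of restricting the token distribution during beam search: top-k filtering versus keeping the smallest set of tokens covering a given amount of probability mass (nucleus-style CDF filtering). The sketch below is not the repository's implementation, just an illustration of what the two settings typically mean:

```python
import numpy as np

def filter_actions(probs, k_act=None, cdf_act=None):
    """Mask a categorical distribution over action tokens.

    k_act:   keep only the k most probable tokens (top-k filtering).
    cdf_act: keep the smallest set of tokens whose cumulative
             probability reaches cdf_act (nucleus-style filtering).
    """
    probs = np.asarray(probs, dtype=float)
    mask = np.ones_like(probs, dtype=bool)
    if k_act is not None:
        # zero out everything outside the k most probable tokens
        top_k = np.argsort(probs)[-k_act:]
        mask = np.zeros_like(probs, dtype=bool)
        mask[top_k] = True
    elif cdf_act is not None:
        # sort descending, keep tokens until cumulative mass reaches cdf_act
        order = np.argsort(probs)[::-1]
        cumulative = np.cumsum(probs[order])
        keep = order[: np.searchsorted(cumulative, cdf_act) + 1]
        mask = np.zeros_like(probs, dtype=bool)
        mask[keep] = True
    filtered = np.where(mask, probs, 0.0)
    return filtered / filtered.sum()  # renormalize surviving mass
```

With probs = [0.5, 0.3, 0.1, 0.1], cdf_act=0.6 keeps the top two tokens (cumulative mass 0.8 first crosses 0.6), while k_act=1 keeps only the most probable one.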

Thanks!

jannerm commented 2 years ago

Yes, there were a few hyperparameter changes to accommodate the updated quantile discretization. The config file in the main branch should be current. In general, I also did a little bit of tuning to speed up planning (e.g., decreasing the horizon and beam width for the halfcheetah environments) in the situations where it didn't hurt performance too much. Those speed-ups are also reflected in the environment-specific overrides in the config file.