jannerm / trajectory-transformer

Code for the paper "Offline Reinforcement Learning as One Big Sequence Modeling Problem"
https://trajectory-transformer.github.io
MIT License

Some questions about hyperparameters in newer version #2

Closed FallCicada closed 2 years ago

FallCicada commented 2 years ago

Dear Author,

I am very interested in your great work and am trying to reproduce your experimental results.

Previously I almost matched the scores reported in the older version of your paper (approximately 70 on average over the 3 × 3 datasets). But I noticed that you updated the paper on arXiv in November, and the average score for TT (quantile) over the 3 × 3 datasets went up to 78.9.

I also noticed that you listed your beam search hyperparameters in Appendix E, where k_act is 20. The listed hyperparameters have some discrepancies with your config file (config/offline.py), where the default k_act is None and cdf_act is 0.6. I am wondering whether you changed the hyperparameters and obtained the higher score. If so, could you please update your config file so that I can also reproduce your results?
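For context, k_act and cdf_act correspond to two common ways of restricting the token distribution during beam search: top-k filtering versus keeping the smallest set of tokens covering a given amount of probability mass (nucleus-style CDF filtering). The sketch below is not the repository's implementation, just an illustration of what the two settings typically mean:

```python
import numpy as np

def filter_actions(probs, k_act=None, cdf_act=None):
    """Mask a categorical distribution over action tokens.

    k_act:   keep only the k most probable tokens (top-k filtering).
    cdf_act: keep the smallest set of tokens whose cumulative
             probability reaches cdf_act (nucleus-style filtering).
    """
    probs = np.asarray(probs, dtype=float)
    mask = np.ones_like(probs, dtype=bool)
    if k_act is not None:
        # zero out everything outside the k most probable tokens
        top_k = np.argsort(probs)[-k_act:]
        mask = np.zeros_like(probs, dtype=bool)
        mask[top_k] = True
    elif cdf_act is not None:
        # sort descending, keep tokens until cumulative mass reaches cdf_act
        order = np.argsort(probs)[::-1]
        cumulative = np.cumsum(probs[order])
        keep = order[: np.searchsorted(cumulative, cdf_act) + 1]
        mask = np.zeros_like(probs, dtype=bool)
        mask[keep] = True
    filtered = np.where(mask, probs, 0.0)
    return filtered / filtered.sum()  # renormalize surviving mass
```

With probs = [0.5, 0.3, 0.1, 0.1], cdf_act=0.6 keeps the top two tokens (cumulative mass 0.8 first crosses 0.6), while k_act=1 keeps only the most probable one.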

Thanks!

jannerm commented 2 years ago

Yes, there were a few hyperparameter changes to accommodate the updated quantile discretization. The config file in the main branch should be current. In general, I also did a little bit of tuning to speed up planning (e.g., decreasing the horizon and beam width for the halfcheetah environments) in the situations where it didn't hurt performance too much. Those speed-ups are also reflected in the environment-specific overrides in the config file.