Closed: Andy-Zhou2 closed this issue 2 months ago.
Hi, thank you for bringing up this issue!
The bug was due to an incorrect order of operations in the refactored release version of the code. Specifically, the `self._compute_hoi_observations()` function generates a new tensor composed of references. Because of this, it's important to first compute the reward from `self._curr_obs` and `self._hist_obs`, and only then update `self._hist_obs`.
I've corrected the order so that the reward is calculated first and then the history is updated. Sorry for the confusion, and thank you again for your feedback!
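To make the intended order concrete, here is a minimal, self-contained sketch (the `EnvSketch` class, the `step_reward` method, and the exponential reward form are illustrative stand-ins, not the repo's actual code; only `_compute_hoi_observations`, `_curr_obs`, and `_hist_obs` come from the discussion above):

```python
import torch

class EnvSketch:
    """Toy stand-in for the environment; only the update order matters here."""

    def __init__(self):
        self._curr_obs = torch.zeros(3)
        self._hist_obs = torch.zeros(3)

    def _compute_hoi_observations(self):
        # Stand-in for the real method, which builds a tensor of
        # references; downstream code must not alias it into the history.
        return torch.rand(3)

    def step_reward(self):
        self._curr_obs = self._compute_hoi_observations()
        # 1) Compute the reward first, while _hist_obs still holds the
        #    previous step's observation.
        reward = torch.exp(-torch.sum((self._curr_obs - self._hist_obs) ** 2))
        # 2) Only then advance the history; clone() keeps _hist_obs from
        #    sharing storage with the reference-backed observation tensor.
        self._hist_obs = self._curr_obs.clone()
        return reward
```

The `clone()` matters here: with a plain assignment the history would alias the current observation, and any later in-place update would reproduce the bug.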
Please let me know if this fix works for you.
Hi, and thanks for the amazing work!
I am trying to test the code for training the skill policy, and it seems that `r_reg` is always 1. In skillmimic.py Line 485, `dof_pos_vel` is the same as `dof_pos_vel_hist`: the reward calculation is called before the current observation is updated, so the current observation is identical to the past observation. I'd like to see if anyone is able to reproduce the same issue. Thank you in advance!
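For anyone trying to reproduce this, here is a minimal demonstration of the symptom (the exponential form and the scale `k` in `r_reg_sketch` are assumptions for illustration; only the `dof_pos_vel` / `dof_pos_vel_hist` names come from skillmimic.py):

```python
import torch

def r_reg_sketch(dof_pos_vel, dof_pos_vel_hist, k=2.0):
    # Hypothetical regularization term of this general shape: it is
    # exactly 1 whenever the two velocity tensors coincide.
    return torch.exp(-k * torch.sum((dof_pos_vel - dof_pos_vel_hist) ** 2))

v = torch.tensor([0.1, -0.2, 0.3])

# With an aliased/stale history (the reported bug): exp(0) == 1, always.
print(r_reg_sketch(v, v).item())              # 1.0

# With a properly lagged history, the term actually responds to change.
print(r_reg_sketch(v, torch.zeros(3)).item()) # ~0.76
```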