Question about TD-MPC2

I just wanted to clarify something about the learning-curve figures in the current version of your manuscript. It seems that TD-MPC2 uses action repeat = 2, so the steps value in the TD-MPC2 config should be halved if you want to run 10M environment steps. However, the default steps value in the TD-MPC2 config for HumanoidBench is set to 10M.

Were actual environment steps taken into account, or has this been overlooked?
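To make the arithmetic concrete, here is a small sketch of what I mean. It assumes that steps in the config counts agent decisions, each of which is repeated action_repeat times in the simulator; that assumption is exactly what I am unsure about. The function and constant names are mine, for illustration only, and are not taken from the TD-MPC2 or humanoid-bench code.

```python
def env_steps_from_agent_steps(agent_steps: int, action_repeat: int) -> int:
    """Environment (simulator) steps consumed when every agent decision
    is repeated `action_repeat` times."""
    return agent_steps * action_repeat


def agent_steps_for_env_budget(env_steps: int, action_repeat: int) -> int:
    """Agent steps needed to spend exactly `env_steps` environment steps."""
    if env_steps % action_repeat != 0:
        raise ValueError("env_steps must be divisible by action_repeat")
    return env_steps // action_repeat


# Hypothetical values mirroring the question (assumed, not read from any config).
ACTION_REPEAT = 2
CONFIG_STEPS = 10_000_000

# If `steps` counts agent decisions, the run consumes twice as many env steps.
print(env_steps_from_agent_steps(CONFIG_STEPS, ACTION_REPEAT))  # 20000000

# To spend exactly 10M environment steps, `steps` would have to be halved.
print(agent_steps_for_env_budget(10_000_000, ACTION_REPEAT))  # 5000000
```

If steps already counts environment steps, then none of this applies and the curves are fine as they are.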

It would be great if you could clarify this.

Best regards, Dongyoon

carlosferrazza / humanoid-bench

Question about TD-MPC2 #20