I have a question regarding the relationship between the x-axis labels in Figures 5 and 6 of your paper and the output from the PPO algorithm in the repository.
What is the relationship between the x-axis label "environment steps" in Figures 5 and 6 of the paper and the "total_timesteps" output shown in the terminal when running the PPO algorithm from this repository?
How are "environment steps" and "total_timesteps" defined and calculated specifically in this context?
Dear Author,
Thank you for your help.