Closed (sanjayyyyyy closed this 5 years ago)
I am not 100% sure, but I think the following is what they did:
avg reward * 10 = % levels solved
avg reward per timestep = avg reward / avg ep length
Unimax is correct that avg reward * 10 = % level solved, since each level of CoinRun has a reward of 10 when successfully completed.
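In case it helps, here is a minimal sketch of the two formulas above. It assumes you have already collected per-episode total rewards and episode lengths (from either training or test levels). The function name `summarize_episodes` and the sample numbers are made up for illustration and are not part of the CoinRun code.

```python
# Hypothetical helper; not taken from the CoinRun repository.
def summarize_episodes(episode_rewards, episode_lengths, reward_per_level=10.0):
    """Return (% levels solved, avg reward per timestep) for a batch of episodes.

    Assumes each solved level yields `reward_per_level` (10 in CoinRun, per
    the thread) and each unsolved level yields 0.
    """
    if not episode_rewards or len(episode_rewards) != len(episode_lengths):
        raise ValueError("need equally sized, non-empty reward and length lists")

    n = len(episode_rewards)
    avg_reward = sum(episode_rewards) / n
    avg_ep_length = sum(episode_lengths) / n

    # With a reward of 10 per solved level this equals avg_reward * 10.
    percent_solved = avg_reward * 100.0 / reward_per_level
    # avg reward / avg ep length, as suggested in the thread.
    reward_per_timestep = avg_reward / avg_ep_length
    return percent_solved, reward_per_timestep


# Made-up example: 3 of 4 episodes solved.
rewards = [10.0, 0.0, 10.0, 10.0]
lengths = [50, 100, 25, 25]
pct, rpt = summarize_episodes(rewards, lengths)
print(pct, rpt)
```

Running the same function separately on episodes from training levels and from held-out test levels should give one point for each curve.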
I am trying to recreate the performance figure in the paper. How do I get the number of levels solved per timestep, as shown in the graph? Can you please tell me this for both training and testing, and also how do I get the average reward per timestep?