RedTachyon / cpr_reputation


fix sustainability metric #47

Closed bengreenberg5 closed 3 years ago

bengreenberg5 commented 3 years ago

The old sustainability metric was computing something like the fraction of agent-timesteps with nonzero reward. What we want is S: the average timestep at which an agent receives a nonzero reward.

[Image: Screen Shot 2021-05-09 at 19.46.15]

Intuitively, if agents are gathering apples later and later into the episode, the environment is becoming more sustainable.
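To make the difference between the two metrics concrete, here is a minimal sketch, not the code from this PR. It assumes rewards are stored as `rewards[t][i]` (timestep `t`, agent `i`), that timesteps are zero-indexed, that S is averaged per agent first and then across agents, and that agents who never receive a reward are skipped. The function names `fraction_rewarded` and `sustainability` are hypothetical.

```python
def fraction_rewarded(rewards):
    """Old-style metric: share of (timestep, agent) cells with nonzero reward."""
    cells = [r for step in rewards for r in step]
    return sum(1 for r in cells if r != 0) / len(cells)


def sustainability(rewards):
    """S: mean timestep of nonzero reward, averaged over agents.

    Assumption: agents with no nonzero reward are skipped; returns 0.0
    if no agent was ever rewarded.
    """
    num_agents = len(rewards[0])
    per_agent = []
    for i in range(num_agents):
        # Timesteps at which agent i received a nonzero reward.
        times = [t for t, step in enumerate(rewards) if step[i] != 0]
        if times:
            per_agent.append(sum(times) / len(times))
    return sum(per_agent) / len(per_agent) if per_agent else 0.0


# Two 10-step episodes with 2 agents: same number of rewards, different timing.
early = [[1, 1] if t < 3 else [0, 0] for t in range(10)]
late = [[1, 1] if t >= 7 else [0, 0] for t in range(10)]

print(fraction_rewarded(early), fraction_rewarded(late))
print(sustainability(early), sustainability(late))
```

In this toy example both episodes have the same fraction of rewarded agent-timesteps, so the old metric cannot tell them apart, while S is higher for the episode where apples are gathered late.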