Closed: zwfightzw closed this issue 4 years ago
Hi,

Sorry for the confusion. The rewards reported in the paper are averaged across agents and summed over timesteps. So if we call the rewards you calculated rews, the values reported in the paper are (rews / nagents) * nsteps.
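For concreteness, a minimal sketch of that conversion in Python; the numeric values below are placeholders for illustration, not values from the paper or the codebase:

```python
# Sketch of the conversion described above; all numbers are placeholders.
nagents = 8      # number of agents in the scenario (placeholder)
nsteps = 100     # episode_length used in the experiments
rews = 0.4       # reward summed across agents, per step (placeholder)

# Paper metric: average across agents, summed over the episode's timesteps.
paper_reward = (rews / nagents) * nsteps
```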
Thanks for sharing the source code of MAAC. This is a very interesting paper. When I reproduce the experiments, the result for Cooperative Treasure Collection is quite different from the paper's. The parameter episode_length is 100, and the source code reports statistics over 100 steps for each agent:

```python
# From the replay buffer in the MAAC repository (as quoted in this issue).
def get_average_rewards(self, N):
    if self.filled_i == self.max_steps:
        inds = np.arange(self.curr_i - N, self.curr_i)  # allow for negative indexing
    else:
        inds = np.arange(max(0, self.curr_i - N), self.curr_i)
    return [self.rew_buffs[i][inds].sum() for i in range(self.num_agents)]
```

Therefore, I sum the values across agents to get the result shown in the figure (a sketch of this summation follows below). So I want to know how the results in the original paper were calculated. Hope for your reply!
Wei Zhou, zhouwei14@nudt.edu.cn
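For clarity, a minimal sketch of the summation described in the question, combined with the conversion from the reply above. The per-agent values here are illustrative placeholders standing in for the output of get_average_rewards, not numbers from an actual run:

```python
# Hypothetical per-agent values, as would be returned by
# replay_buffer.get_average_rewards(episode_length); placeholders only.
per_agent = [0.04, 0.05, 0.03, 0.06]  # one entry per agent (illustrative)
ep_len = 100                          # episode_length from the question

# Summing across agents gives the quantity plotted in the question's figure.
rews = sum(per_agent)

# Per the reply above: average across agents, then sum over timesteps.
paper_reward = (rews / len(per_agent)) * ep_len
print(paper_reward)
```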