YeWR / EfficientZero

Open-source codebase for EfficientZero, from "Mastering Atari Games with Limited Data" at NeurIPS 2021.
GNU General Public License v3.0

Question about the index of pad_child_visits_lst in selfplay_worker.py #34

Open puyuan1996 opened 1 year ago

puyuan1996 commented 1 year ago

Thank you very much for open-sourcing your code.

I am confused by this code segment in the put_last_trajectory method in selfplay_worker.py:

On line 69, why is it pad_child_visits_lst = game_histories[i].child_visits[beg_index:end_index] rather than pad_child_visits_lst = game_histories[i].child_visits[:self.config.num_unroll_steps]?

In my understanding, game_histories[i].child_visits[0] holds the child visits for the stacked observation at game_histories[i].obs_history[beg_index], so child_visits should be sliced starting from index 0.

Is this a bug?

Looking forward to your reply!

YeWR commented 1 year ago

Thank you for your correction.

I think this is a bug. Except for the observation history, all the other statistics (e.g., visits, values, rewards) should be indexed from 0 instead of from self.config.stacked_observations. This bug appears to cause misplaced data at the boundary between trajectory blocks.
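
The offset described above can be illustrated with a small toy sketch. It is not the repository's code: the values of stacked_observations and num_unroll_steps, the trajectory length, and the string placeholders are all made up for illustration. It assumes, following the understanding in this thread, that the observation history carries stacked_observations extra leading frames while child_visits has one entry per step starting at index 0.

```python
# Toy illustration only; names mirror the issue, values are made up.
stacked_observations = 4   # assumed value for illustration
num_unroll_steps = 5       # assumed value for illustration
num_steps = 10             # hypothetical trajectory length

# Assumed layout: the observation history is front-padded with
# `stacked_observations` frames; per-step statistics are not padded.
obs_history = ["pad"] * stacked_observations + [f"obs_{t}" for t in range(num_steps)]
child_visits = [f"visits_{t}" for t in range(num_steps)]

beg_index = stacked_observations
end_index = beg_index + num_unroll_steps

# Observations: skipping the leading padded frames is the intended offset.
pad_obs_lst = obs_history[beg_index:end_index]

# Slicing child_visits with the same offset skips the first
# `stacked_observations` entries, so they no longer line up with pad_obs_lst.
visits_as_written = child_visits[beg_index:end_index]

# Slicing proposed in the issue: per-step statistics start at index 0.
visits_from_zero = child_visits[:num_unroll_steps]

print(pad_obs_lst)        # obs_0 .. obs_4
print(visits_as_written)  # visits_4 .. visits_8
print(visits_from_zero)   # visits_0 .. visits_4
```

Under these assumptions, the slice that starts at 0 pairs each padded observation with the statistics from the same step, while the slice that reuses the observation offset is shifted by stacked_observations steps.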

Thank you for reading the code so carefully. We will fix this in the next few days and check how it affects performance :)

puyuan1996 commented 1 year ago

Thank you very much for your reply.

Looking forward to the experiments analyzing the performance impact of this bug!

Best Wishes.