[x] in reanalyze_worker.py, fix the value_prefix being reset unexpectedly.
[x] in replay_buffer.py, fix zero-length games being saved.
[x] in selfplay_worker.py, fix the seeds failing to vary when rank=0.
[x] in train.py, add masks for the policy/value/reward losses.
Add some features:
[+] add the versions of gym, opencv, and kornia.
[+] record the average number of steps per trajectory during testing.