fix: fill 0 for reward, return, value in make_batch()

DeNA / HandyRL

HandyRL is a handy and simple framework based on Python and PyTorch for distributed reinforcement learning that is applicable to your own environments.

MIT License

282 stars, 39 forks

fix: fill 0 for reward, return, value in make_batch() #345

Closed: YuriCat closed this 9 months ago

YuriCat commented 1 year ago

Return and reward are scalar values in the main code.
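
The diff for this PR is not shown here, so the following is only a minimal sketch of one plausible reading of the title: per-step scalar series (reward, return, value) that are shorter than the batch window get padded with 0. The helper `pad_scalar_series`, the `episode` dict, and the window length of 5 are all hypothetical and are not taken from HandyRL's `make_batch()`.

```python
import numpy as np


def pad_scalar_series(values, length, fill=0.0):
    """Right-pad a 1-D series of per-step scalars to `length` with `fill`.

    Hypothetical helper for illustration; not part of HandyRL.
    """
    arr = np.asarray(values, dtype=np.float32)
    if arr.ndim != 1:
        raise ValueError("expected a 1-D series of scalars")
    if arr.shape[0] > length:
        raise ValueError("series is longer than the target length")
    # Pad only on the right; padded steps carry the fill value (0 by default).
    return np.pad(arr, (0, length - arr.shape[0]), constant_values=fill)


# Hypothetical 3-step episode fragment, padded to a 5-step window.
episode = {
    "reward": [0.0, 0.0, 1.0],
    "return": [1.0, 1.0, 1.0],
    "value": [0.2, 0.5, 0.9],
}
batch = {key: pad_scalar_series(series, 5) for key, series in episode.items()}
```

Filling with 0 keeps the padded steps neutral when they are later multiplied by a mask or summed into a loss, assuming the training code masks padded steps in that way.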