Closed starry-sky6688 closed 3 years ago
Hi @starry-sky6688,
The transition function of the environment is indeed stochastic. For example, cooldown, which is the time that an agent needs to wait before it can shoot again, is probabilistic. The reward function is deterministic.
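To make the distinction concrete, here is a minimal toy sketch in Python. It is not the environment's actual code: the `step` function, the action name `"shoot"`, the cooldown values 1 to 3, and the reward of 1.0 per shot are all made-up assumptions for illustration. It only shows what "stochastic transition, deterministic reward" means: the same state and action can lead to different next states, while the reward is a fixed function of what happened.

```python
import random


def step(cooldown_left, action, rng):
    """Toy single-agent transition (hypothetical, not the real environment)."""
    fired = action == "shoot" and cooldown_left == 0
    if fired:
        # Assumption: the cooldown after a shot is drawn at random,
        # so the next state is not determined by (state, action) alone.
        next_cooldown = rng.choice([1, 2, 3])
    else:
        # Otherwise the remaining cooldown simply counts down to zero.
        next_cooldown = max(0, cooldown_left - 1)
    # The reward depends only on whether a shot was fired: no randomness.
    reward = 1.0 if fired else 0.0
    return next_cooldown, reward


# Repeat the same (state, action) pair with differently seeded generators.
outcomes = {step(0, "shoot", random.Random(seed)) for seed in range(100)}
next_cooldowns = {cooldown for cooldown, _ in outcomes}
rewards = {reward for _, reward in outcomes}
```

Here `next_cooldowns` ends up holding more than one value, while `rewards` holds a single value, which is the pattern described above.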
Hi, I'm wondering about the dynamics of this environment: are the state transition function and the reward function stochastic?
Looking forward to your reply!