AI4Finance-Foundation / ElegantRL

Massively Parallel Deep Reinforcement Learning. 🔥

not deterministic #123

Open Timodhau opened 2 years ago

Timodhau commented 2 years ago

Hello, I ran ElegantRL with the FinRL processor using the function DRLAgent_erl.DRL_prediction, and the results do not seem to be deterministic.

Yonv1943 commented 2 years ago

Perhaps the stochasticity comes from env.reset().

Both stochastic-policy and deterministic-policy algorithms use a deterministic policy by default during the testing phase, so the policy itself should not be the source.
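To illustrate the point about env.reset(), here is a minimal sketch using only the standard library. `ToyEnv`, `deterministic_policy`, and `rollout` are hypothetical stand-ins invented for this example; they are not the FinRL environment or the ElegantRL actor. The sketch assumes the only randomness is the initial state drawn in `reset()`.

```python
import random


class ToyEnv:
    """Hypothetical stand-in environment; not the FinRL or ElegantRL env."""

    def __init__(self, max_step=5, seed=None):
        self.max_step = max_step
        self.rng = random.Random(seed)  # private RNG so seeding is explicit
        self.state = 0.0
        self.t = 0

    def reset(self):
        # The only source of randomness: a random initial state.
        self.state = self.rng.uniform(-1.0, 1.0)
        self.t = 0
        return self.state

    def step(self, action):
        # Deterministic transition given the current state and action.
        self.state = 0.9 * self.state + action
        self.t += 1
        reward = -abs(self.state)
        done = self.t >= self.max_step
        return self.state, reward, done, {}


def deterministic_policy(state):
    # Same state in, same action out (no sampling).
    return -0.5 * state


def rollout(env):
    # Run one episode and collect the rewards.
    state = env.reset()
    rewards = []
    done = False
    while not done:
        state, reward, done, _ = env.step(deterministic_policy(state))
        rewards.append(reward)
    return rewards


# Same seed gives identical episodes; different seeds give different
# episodes, even though the policy itself is deterministic.
same_a = rollout(ToyEnv(seed=0))
same_b = rollout(ToyEnv(seed=0))
other = rollout(ToyEnv(seed=1))
print(same_a == same_b, same_a == other)
```

If the real environment behaves like this, fixing its seed (or its initial state) before each prediction run would be one way to test whether reset() explains the differences.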

Timodhau commented 2 years ago

Well, I took a look just before:

    with _torch.no_grad():
        for i in range(environment.max_step):
            s_tensor = _torch.as_tensor((state,), device=device)
            a_tensor = act(s_tensor)  # action_tanh = act.forward()
            action = (
                a_tensor.detach().cpu().numpy()[0]
            )  # detach() is not needed, because torch.no_grad() is active outside
            state, reward, done, _ = environment.step(action)

in file, my states are similar. Maybe I introduced an error myself, but I don't think so.
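One way to narrow this down would be to record the states from two prediction runs and find the first step at which they differ. The helper below is a hypothetical sketch, not part of ElegantRL or FinRL; it assumes each run was saved as a list of per-step state vectors (plain lists of floats).

```python
import math


def first_divergence(run_a, run_b, tol=1e-9):
    """Return the first step index where two runs differ, or None if they match."""
    for step, (a, b) in enumerate(zip(run_a, run_b)):
        if len(a) != len(b) or any(
            not math.isclose(x, y, rel_tol=0.0, abs_tol=tol) for x, y in zip(a, b)
        ):
            return step
    # All shared steps match; a length mismatch still counts as a divergence.
    if len(run_a) != len(run_b):
        return min(len(run_a), len(run_b))
    return None


# Made-up example data: the runs agree for two steps, then differ.
run_1 = [[0.0, 1.0], [0.5, 1.5], [0.7, 1.2]]
run_2 = [[0.0, 1.0], [0.5, 1.5], [0.7, 1.3]]
print(first_divergence(run_1, run_1))
print(first_divergence(run_1, run_2))
```

A divergence at step 0 would point at the initial state from reset(); a later one would suggest something inside step() or the actor.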