smilesun / rlR

Deep Reinforcement Learning in R (Deep Q Learning, Policy Gradient, Actor-Critic Method, etc)
https://smilesun.github.io/rlR

Predictions from perf$list_models are off #21

Closed SebGGruber closed 6 years ago

SebGGruber commented 6 years ago

Hi, I hope this doesn't annoy you too much already, but perf$list_models is still buggy. This time the predictions do differ between list elements, but only very slightly, and they are completely different from the predictions of the default prediction method.


Code to recreate bug:

    library(rlR)

    env = makeGymEnv()
    agent = makeAgent("AgentDQN", env)
    perf = agent$learn(1)

    # normal prediction
    perf$agent$brain$pred( array(c(1,1,1,1), dim = c(1,4)) )
    # supposed to be the same - but isn't
    perf$list_models[[1]]$pred( array(c(1,1,1,1), dim = c(1,4)) )


Even with more iterations, the list predictions change only slightly from one iteration to the next and never seem to come close to the normal one.
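
One way to put numbers on the discrepancy is sketched below. The helper `max_abs_diff` is hypothetical (it is not part of rlR), and the sketch assumes that `pred()` returns a numeric array of the same shape for the brain and for every element of `perf$list_models`. The comparison only runs if a `perf` object from the code above exists in the session.

```r
# Hypothetical helper (not part of rlR): largest absolute element-wise
# difference between two prediction arrays of the same shape.
max_abs_diff = function(a, b) {
  max(abs(as.numeric(a) - as.numeric(b)))
}

# Only run the comparison when `perf` from the reproduction code is available.
if (exists("perf")) {
  state = array(c(1, 1, 1, 1), dim = c(1, 4))
  p_brain = perf$agent$brain$pred(state)
  # One value per stored model: its distance to the normal prediction.
  diffs = sapply(perf$list_models, function(m) max_abs_diff(m$pred(state), p_brain))
  print(diffs)
}
```

If the stored models matched the brain at some iteration, at least one of the printed values would be (close to) zero.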

Greetings, Sebastian