smilesun / rlR

Deep Reinforcement Learning in R (Deep Q Learning, Policy Gradient, Actor-Critic Method, etc)
https://smilesun.github.io/rlR

Each model of every episode predicts the same values #20

Closed. SebGGruber closed this issue 6 years ago.

SebGGruber commented 6 years ago

Hey!

The problem is basically in the title. I tested with CartPole-v0 + AgentDQN and ran 100 learning iterations, then got the same predictions from perf$list_models[[i]]$pred(...) for all i in 1:100. Am I missing something here, or is this a bug?

Greetings, Sebastian
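
The thread does not state the root cause, so the following is only a sketch of one way this symptom can arise in R, not a description of rlR's internals. It assumes the stored models behave like objects with reference semantics (as environments and R6 objects do), so that saving the same object after every episode yields a list of aliases rather than per-episode snapshots. The names `new_model`, `snapshot`, and `all_preds_identical` are hypothetical helpers made up for this illustration; only base R is used.

```r
# Toy stand-in for a trained model. Environments have reference semantics in R.
new_model <- function() {
  m <- new.env()
  m$w <- 0
  m$pred <- function(x) x * m$w
  m
}

# Copy the current weight into an independent model (a "snapshot").
snapshot <- function(m) {
  s <- new_model()
  s$w <- m$w
  s
}

# TRUE if every model in the list gives the same prediction for input x.
all_preds_identical <- function(models, x) {
  preds <- vapply(models, function(m) m$pred(x), numeric(1))
  length(unique(preds)) == 1
}

model <- new_model()
aliased <- list()
copied <- list()
for (i in 1:5) {
  model$w <- model$w + 1          # pretend one learning iteration updates the weight
  aliased[[i]] <- model           # stores a reference to the same object
  copied[[i]] <- snapshot(model)  # stores an independent copy
}

all_preds_identical(aliased, 2)  # TRUE: every entry points at the final model
all_preds_identical(copied, 2)   # FALSE: each entry kept its own weight
```

A check like `all_preds_identical` applied to a list of saved models is one quick way to tell whether the list holds distinct snapshots or repeated references to a single model.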

smilesun commented 6 years ago

I think this is also a bug in Keras (see the Keras issue: https://github.com/keras-team/keras/issues/1765). I will fix it soon.

smilesun commented 6 years ago

Fixed in the latest commit, d1ab927.