Implementation of Reinforcement Learning Algorithms. Python, OpenAI Gym, TensorFlow. Exercises and Solutions to accompany Sutton's Book and David Silver's course.
Hi, I've recently been working on the function approximation exercises. The Q-learning algorithm with function approximation (I tried SARSA as well) runs fine for the default 100 episodes, but for 1000+ episodes it frequently gets stuck at a -200 reward for quite a long time, often until the end of training. In other cases, however (separate runs of the same algorithm), training progresses without getting stuck at -200. This was the case for the solution version as well. I was wondering if this behavior is to be expected (maybe because we are using SGD?), or whether there is actually something wrong in the code.
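For context, here is a minimal sketch of the kind of setup being described: Q-learning with linear function approximation (SGD-trained regressors over RBF features) on MountainCar-v0. The environment name, hyperparameters, and helper names below are assumptions for illustration, not the repo's exact solution; the -200 floor corresponds to an episode that hits the 200-step limit with -1 reward per step, i.e. the car never reaches the goal. The sketch uses the classic gym API (`reset()` returns an observation, `step()` returns a 4-tuple).

```python
# Sketch only: Q-learning with linear FA on MountainCar-v0 (assumed env).
# Hyperparameters and helper names are illustrative, not the repo's exact code.
import numpy as np
import gym
import sklearn.pipeline
import sklearn.preprocessing
from sklearn.linear_model import SGDRegressor
from sklearn.kernel_approximation import RBFSampler

env = gym.make("MountainCar-v0")

# Fit state scaling and RBF features on a sample of observations.
samples = np.array([env.observation_space.sample() for _ in range(10000)])
scaler = sklearn.preprocessing.StandardScaler().fit(samples)
featurizer = sklearn.pipeline.FeatureUnion([
    ("rbf1", RBFSampler(gamma=5.0, n_components=100)),
    ("rbf2", RBFSampler(gamma=1.0, n_components=100)),
])
featurizer.fit(scaler.transform(samples))

def featurize(state):
    return featurizer.transform(scaler.transform([state]))[0]

# One SGD regressor per action approximates Q(s, a).
models = []
for _ in range(env.action_space.n):
    m = SGDRegressor(learning_rate="constant", eta0=0.01)
    m.partial_fit([featurize(env.reset())], [0.0])  # initialize weights
    models.append(m)

def q_values(state):
    feats = featurize(state)
    return np.array([m.predict([feats])[0] for m in models])

def run(num_episodes=1000, gamma=1.0, epsilon=0.1, epsilon_decay=0.995):
    for i in range(num_episodes):
        eps = epsilon * (epsilon_decay ** i)  # exploration decays over episodes
        state = env.reset()
        total, done = 0.0, False
        while not done:
            # Epsilon-greedy action selection.
            if np.random.rand() < eps:
                action = env.action_space.sample()
            else:
                action = int(np.argmax(q_values(state)))
            next_state, reward, done, _ = env.step(action)
            total += reward
            # Q-learning TD target: bootstrap with the greedy next-state value.
            target = reward + (0.0 if done else gamma * np.max(q_values(next_state)))
            models[action].partial_fit([featurize(state)], [target])
            state = next_state
        print(f"episode {i}: return {total}")

if __name__ == "__main__":
    run()
```

With this kind of setup, runs that hover at -200 are ones where the agent has not yet reached the goal even once, so run-to-run variation (random feature sampling, SGD initialization, and how quickly epsilon decays) can plausibly explain why some instances plateau while others don't, though this is only a hypothesis about the behavior described above.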