mimoralea / gdrl

Grokking Deep Reinforcement Learning
https://www.manning.com/books/grokking-deep-reinforcement-learning
BSD 3-Clause "New" or "Revised" License

A2C algorithm doesn't work #37

Open Acejoy opened 8 months ago

Acejoy commented 8 months ago

Hey, I tried the chapter-11 notebook and ran the A2C algorithm for training. The agent doesn't learn anything. I am pasting the logs below:

el 00:00:36, ep 1427, ts 013537, ar 10 001.0±000.0, 100 001.0±000.0, ex 100 0.0±0.0, ev 009.5±000.7
el 00:01:06, ep 3335, ts 031432, ar 10 001.0±000.0, 100 001.0±000.0, ex 100 0.0±0.0, ev 009.5±000.6
el 00:01:37, ep 5295, ts 049762, ar 10 001.0±000.0, 100 001.0±000.0, ex 100 0.0±0.0, ev 009.3±000.8
el 00:02:07, ep 7101, ts 066745, ar 10 001.0±000.0, 100 001.0±000.0, ex 100 0.0±0.0, ev 009.5±000.7
el 00:02:37, ep 8996, ts 084474, ar 10 001.0±000.0, 100 001.0±000.0, ex 100 0.0±0.0, ev 009.2±000.8
el 00:03:07, ep 10812, ts 101495, ar 10 001.0±000.0, 100 001.0±000.0, ex 100 0.0±0.0, ev 009.3±000.7
el 00:03:37, ep 12534, ts 117600, ar 10 001.0±000.0, 100 001.0±000.0, ex 100 0.0±0.0, ev 009.4±000.9
el 00:04:07, ep 14259, ts 133695, ar 10 001.0±000.0, 100 001.0±000.0, ex 100 0.0±0.0, ev 009.4±000.8
el 00:04:37, ep 15967, ts 149650, ar 10 001.0±000.0, 100 001.0±000.0, ex 100 0.0±0.0, ev 009.4±000.7
el 00:05:07, ep 17651, ts 165403, ar 10 001.0±000.0, 100 001.0±000.0, ex 100 0.0±0.0, ev 009.4±000.7
el 00:05:37, ep 19366, ts 181470, ar 10 001.0±000.0, 100 001.0±000.0, ex 100 0.0±0.0, ev 009.4±000.8
el 00:06:07, ep 21014, ts 197042, ar 10 001.0±000.0, 100 001.0±000.0, ex 100 0.0±0.0, ev 009.3±000.8
el 00:06:37, ep 22672, ts 212565, ar 10 001.0±000.0, 100 001.0±000.0, ex 100 0.0±0.0, ev 009.5±000.8
el 00:07:07, ep 24350, ts 228283, ar 10 001.0±000.0, 100 001.0±000.0, ex 100 0.0±0.0, ev 009.5±000.8
el 00:07:37, ep 25916, ts 242947, ar 10 001.0±000.0, 100 001.0±000.0, ex 100 0.0±0.0, ev 009.3±000.8

....
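
As a quick sanity check on the logs above, here is a minimal Python sketch that parses two of the lines and computes timesteps per episode. The field meanings (`el` as elapsed time, `ep` as episode count, `ts` as total timesteps, `ar` as average training reward over the last 10 and 100 episodes, `ex` as an exploration statistic, `ev` as evaluation score) are only my reading of the labels and are not confirmed against the notebook. `parse_line` is a hypothetical helper, not part of the repository.

```python
import re

# Pattern for one log line. Field meanings are ASSUMED from the labels,
# not taken from the notebook source.
LOG_RE = re.compile(
    r"el (?P<el>\d+:\d+:\d+), ep (?P<ep>\d+), ts (?P<ts>\d+), "
    r"ar 10 (?P<ar10>[\d.]+)±(?P<ar10_std>[\d.]+), "
    r"100 (?P<ar100>[\d.]+)±(?P<ar100_std>[\d.]+), "
    r"ex 100 (?P<ex>[\d.]+)±(?P<ex_std>[\d.]+), "
    r"ev (?P<ev>[\d.]+)±(?P<ev_std>[\d.]+)"
)

FLOAT_FIELDS = ("ar10", "ar10_std", "ar100", "ar100_std",
                "ex", "ex_std", "ev", "ev_std")


def parse_line(line):
    """Hypothetical helper: return a dict of parsed fields, or None if no match."""
    match = LOG_RE.search(line)
    if match is None:
        return None
    fields = match.groupdict()
    record = {"el": fields["el"], "ep": int(fields["ep"]), "ts": int(fields["ts"])}
    for name in FLOAT_FIELDS:
        record[name] = float(fields[name])
    # Average episode length so far, assuming ts and ep are cumulative counts.
    record["steps_per_episode"] = record["ts"] / record["ep"]
    return record


# First and last log lines from this report, copied verbatim.
first = ("el 00:00:36, ep 1427, ts 013537, ar 10 001.0±000.0, "
         "100 001.0±000.0, ex 100 0.0±0.0, ev 009.5±000.7")
last = ("el 00:07:37, ep 25916, ts 242947, ar 10 001.0±000.0, "
        "100 001.0±000.0, ex 100 0.0±0.0, ev 009.3±000.8")

for line in (first, last):
    rec = parse_line(line)
    print(rec["ep"], rec["ar100"], rec["ev"], round(rec["steps_per_episode"], 2))
```

On these two lines, this gives roughly 9.5 and 9.4 timesteps per episode, which is close to the `ev` column, while both `ar` columns stay at 1.0 with zero spread.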

I tried experimenting with entropy_weight, but the result remains the same.

Can anyone point out the mistake? Thanks