DDQN isn't learning as expected

mendezja / Mariox

CSCI 319 Final Project


DDQN isn't learning as expected #28

Closed (wxue24 closed this issue 11 months ago)

wxue24 commented 11 months ago

The reward plot is decreasing rather than increasing over time. This could be due to the hyperparameters chosen or to how the state features are preprocessed.

Some ideas to try:

- Preprocess the data (normalize values like position to between 0 and 1)
- Experiment with different neural network layers
- Experiment with different hyperparameters like "burnin", "learn_every", "sync_every", "save_every", "batch_size", and "exploration_rate"
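For the normalization idea, here is a minimal sketch of min-max scaling a state vector into the 0 to 1 range. The function names and the `BOUNDS` values are hypothetical and only for illustration; the real feature ranges depend on how this project builds its state.

```python
from typing import List, Sequence, Tuple


def normalize(value: float, low: float, high: float) -> float:
    """Min-max scale value into [0, 1], clipping anything outside [low, high]."""
    if high <= low:
        raise ValueError("high must be greater than low")
    scaled = (value - low) / (high - low)
    return min(1.0, max(0.0, scaled))


def normalize_state(
    state: Sequence[float], bounds: Sequence[Tuple[float, float]]
) -> List[float]:
    """Scale each feature with its own (low, high) range."""
    if len(state) != len(bounds):
        raise ValueError("state and bounds must have the same length")
    return [normalize(v, lo, hi) for v, (lo, hi) in zip(state, bounds)]


# Hypothetical ranges for an (x, y) position feature; not taken from the repo.
BOUNDS = [(0.0, 256.0), (0.0, 240.0)]

print(normalize_state([128.0, 60.0], BOUNDS))  # [0.5, 0.25]
```

Keeping every input on a similar scale usually makes the network's early updates less dominated by whichever feature has the largest raw magnitude.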

wxue24 commented 11 months ago

After adding preprocessing to the data and adjusting the hyperparameters, I got a much better reward plot.
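To make the step-based hyperparameters concrete, here is a rough sketch of how "burnin", "learn_every", "sync_every", and "save_every" commonly gate what a DDQN agent does at each environment step. This is an assumption about the usual pattern, not a copy of this repo's code, and the default values are placeholders rather than the settings that produced the improved plot.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Schedule:
    # Placeholder values for illustration only.
    burnin: int = 100       # steps of experience to collect before any learning
    learn_every: int = 3    # steps between updates of the online network
    sync_every: int = 50    # steps between copying online weights to the target
    save_every: int = 500   # steps between checkpoints


def events_at(step: int, cfg: Schedule) -> List[str]:
    """Return which events would fire at a given step under this schedule."""
    events = []
    if step % cfg.sync_every == 0:
        events.append("sync_target")
    if step % cfg.save_every == 0:
        events.append("save")
    # Learning starts only after burnin, and then only every learn_every steps.
    if step >= cfg.burnin and step % cfg.learn_every == 0:
        events.append("learn")
    return events


cfg = Schedule()
print(events_at(99, cfg))   # []
print(events_at(150, cfg))  # ['sync_target', 'learn']
```

Under this pattern, a smaller "learn_every" means more gradient updates per step of experience, and a larger "sync_every" keeps the target network fixed for longer between copies.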

We could make further adjustments to reduce the training time, but the result already looks good: the end reward is > 100.
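Since per-episode reward is noisy, one way to check a threshold like this is to look at a moving average over recent episodes rather than a single episode. The sketch below is illustrative only: `moving_average` is a hypothetical helper and the reward values are made up, not taken from the actual training run.

```python
from typing import List, Sequence


def moving_average(rewards: Sequence[float], window: int) -> List[float]:
    """Trailing mean over the last `window` episodes (fewer at the start)."""
    if window <= 0:
        raise ValueError("window must be positive")
    averages = []
    total = 0.0
    for i, reward in enumerate(rewards):
        total += reward
        if i >= window:
            # Drop the episode that just fell out of the window.
            total -= rewards[i - window]
        averages.append(total / min(i + 1, window))
    return averages


# Made-up episode rewards, for illustration only.
episode_rewards = [1.0, 2.0, 3.0, 4.0]
print(moving_average(episode_rewards, window=2))  # [1.0, 1.5, 2.5, 3.5]
```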