mendezja / Mariox

CSCI 319 Final Project
3 stars 2 forks source link

DDQN isn't learning as expected #28

Closed wxue24 closed 11 months ago

wxue24 commented 11 months ago
image

The reward plot as shown above is decreasing rather than increasing over time. Could be due to hyperparameters chosen, or how the state features are preprocessed.

Some ideas to try:

wxue24 commented 11 months ago

After adding preprocessing to the data and adjusting the hyperparameters I got a much better reward plot.

image

We can try to make more adjustments to decrease the training time but it seems like the result is pretty good as the end reward is > 100.