wilsonteng97 / AI-Planning-Decision-Making

Assignments submitted for CS4246/CS5446 AI Planning & Decision Making

[Mini-project] Improve on DQN #22

Open Derek-Hardy opened 3 years ago

Derek-Hardy commented 3 years ago

Problem:


Direction:

Derek-Hardy commented 3 years ago

Optimisation:

Testing:

[Screenshot 2021-04-07 at 16:50:11]

Average score: 1.216

TODO:

Derek-Hardy commented 3 years ago

Optimisation:

Testing:

[t2_tmax50] 300 run(s) avg rewards : 6.0
[t2_tmax40] 300 run(s) avg rewards : 5.1
Point: 5.550000000000001
Local runtime: 228.86762881278992 seconds --- fast
WARNING: do note that this might not reflect the runtime on the server.
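
For anyone re-checking these numbers: the `Point` line looks like the unweighted mean of the per-test-case averages. That is an inference from the logs in this thread, not something the test script documents here. Below is a minimal sketch that parses the two `avg rewards` lines and recomputes it; the helper names `parse_avg_rewards` and `point_from_averages` are made up for illustration.

```python
import re
from typing import Dict

# Two lines copied verbatim from the test output above.
LOG = """\
[t2_tmax50] 300 run(s) avg rewards : 6.0
[t2_tmax40] 300 run(s) avg rewards : 5.1
"""

# Captures: test-case name, number of runs, printed average reward.
LINE_RE = re.compile(r"\[(\w+)\]\s+(\d+) run\(s\) avg rewards\s*:\s*([\d.]+)")


def parse_avg_rewards(log: str) -> Dict[str, float]:
    """Map each test-case name to its printed average reward."""
    return {m.group(1): float(m.group(3)) for m in LINE_RE.finditer(log)}


def point_from_averages(averages: Dict[str, float]) -> float:
    """ASSUMED scoring rule: unweighted mean across test cases."""
    return sum(averages.values()) / len(averages)


averages = parse_avg_rewards(LOG)
point = point_from_averages(averages)
print(averages)
print(round(point, 2))
```

One caveat: the printed averages seem to be rounded to one decimal place. In the later run that prints 6.7 and 6.9, the reported `Point` is 6.8333..., not 6.8, so recomputing from the printed values can be off by a few hundredths.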
[Screenshot 2021-04-08 at 15:18:35]

BIG PROGRESS

Derek-Hardy commented 3 years ago

Optimisation:

Testing:

[t2_tmax50] 300 run(s) avg rewards : 6.7
[t2_tmax40] 300 run(s) avg rewards : 6.9
Point: 6.833333333333334
Local runtime: 261.7694010734558 seconds --- safe
WARNING: do note that this might not reflect the runtime on the server.
[Screenshot 2021-04-09 at 23:44:11]

Some more improvements are needed; let's aim for >= 8.5.

Derek-Hardy commented 3 years ago

Optimisation:

Testing:

[t2_tmax50] 300 run(s) avg rewards : 7.3
[t2_tmax40] 300 run(s) avg rewards : 6.9
Point: 7.1
Local runtime: 280.41088366508484 seconds --- safe
WARNING: do note that this might not reflect the runtime on the server.
[Screenshot 2021-04-17 at 21:45:47]

Derek-Hardy commented 3 years ago

Optimisation:

Testing:

[t2_tmax50] 300 run(s) avg rewards : 5.4
[t2_tmax40] 300 run(s) avg rewards : 5.3
Point: 5.35
Local runtime: 256.49775671958923 seconds --- safe

Over-fitting: Point dropped to 5.35, down from 7.1 in the previous run.
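
One common guard against this kind of regression is to keep whichever checkpoint scored best on the local test script instead of the most recent one. A rough sketch using the `Point` values posted in this thread; the `run-N` labels and the `EvalResult` / `best_checkpoint` names are hypothetical, and nothing here reflects how the checkpoints were actually stored.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class EvalResult:
    label: str    # hypothetical checkpoint label, in posting order
    point: float  # "Point" value printed by the local test script


def best_checkpoint(results: List[EvalResult]) -> EvalResult:
    """Return the result with the highest Point (first one wins on ties)."""
    return max(results, key=lambda r: r.point)


# Point values reported in the comments above, oldest first.
results = [
    EvalResult("run-1", 5.550000000000001),
    EvalResult("run-2", 6.833333333333334),
    EvalResult("run-3", 7.1),
    EvalResult("run-4", 5.35),
]

best = best_checkpoint(results)
print(best.label, best.point)
```

This only compares local scores over 300 runs per test case, so small differences between checkpoints may be noise rather than real improvement.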

wilsonteng97 commented 3 years ago

Optimisation:

Testing:

[t2_tmax50] 300 run(s) avg rewards : 7.3
[t2_tmax40] 300 run(s) avg rewards : 6.9
Point: 7.1
Local runtime: 280.41088366508484 seconds --- safe
WARNING: do note that this might not reflect the runtime on the server.


AVLE score decreased from 6.816 to 6.683 (-0.133).