Open Derek-Hardy opened 3 years ago
Optimisation:
Testing:
Average score: 1.216
TODO:
step(self, state, *args, **kwargs)
in __init__.py
may need improvement (e.g. MCTS)
Optimisation:
Testing:
[t2_tmax50] 300 run(s) avg rewards : 6.0
[t2_tmax40] 300 run(s) avg rewards : 5.1
Point: 5.550000000000001
Local runtime: 228.86762881278992 seconds --- fast
WARNING: do note that this might not reflect the runtime on the server.
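For reference, a minimal sketch of how numbers like the ones above could be collected locally: average the reward over 300 runs and time the whole batch. This is not the actual test script. `evaluate`, `run_episode`, and `dummy_episode` are hypothetical names, and the dummy episode just returns a random reward in place of a real solver run. How "Point" is derived from the per-test averages is not shown here, since the formula is not stated in this thread.

```python
import random
import time
from statistics import mean


def evaluate(run_episode, n_runs=300, seed=0):
    """Run `run_episode` n_runs times; return (average reward, elapsed seconds)."""
    # Fixed seed so two solver versions are compared on the same random draws.
    rng = random.Random(seed)
    start = time.perf_counter()
    rewards = [run_episode(rng) for _ in range(n_runs)]
    elapsed = time.perf_counter() - start
    return mean(rewards), elapsed


def dummy_episode(rng):
    # Hypothetical stand-in for one simulated run of the solver; returns a reward.
    return rng.choice([0, 5, 10])


if __name__ == "__main__":
    avg, secs = evaluate(dummy_episode, n_runs=300)
    print(f"[dummy] 300 run(s) avg rewards : {avg:.1f}")
    print(f"Local runtime: {secs} seconds")
```

Keeping the seed fixed between versions makes it easier to tell a real improvement from noise, though it does nothing about the local vs. server runtime difference mentioned in the warning.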
BIG PROGRESS ❗
Optimisation:
Testing:
[t2_tmax50] 300 run(s) avg rewards : 6.7
[t2_tmax40] 300 run(s) avg rewards : 6.9
Point: 6.833333333333334
Local runtime: 261.7694010734558 seconds --- safe
WARNING: do note that this might not reflect the runtime on the server.
Some more improvements are needed; let's aim for >= 8.5.
Optimisation:
Testing:
[t2_tmax50] 300 run(s) avg rewards : 7.3
[t2_tmax40] 300 run(s) avg rewards : 6.9
Point: 7.1
Local runtime: 280.41088366508484 seconds --- safe
WARNING: do note that this might not reflect the runtime on the server.
Optimisation:
Testing:
[t2_tmax50] 300 run(s) avg rewards : 5.4
[t2_tmax40] 300 run(s) avg rewards : 5.3
Point: 5.35
Local runtime: 256.49775671958923 seconds --- safe
❗ Over-fitting
Optimisation:
Testing:
[t2_tmax50] 300 run(s) avg rewards : 7.3
[t2_tmax40] 300 run(s) avg rewards : 6.9
Point: 7.1
Local runtime: 280.41088366508484 seconds --- safe
WARNING: do note that this might not reflect the runtime on the server.
AVLE Score decreased from 6.816 to 6.683 (-0.133).
Problem:
Direction: