goktug97 closed 3 years ago
In the SGD environment, the log rewards should be negated, because the log gets smaller as its argument gets smaller:

    log(1e-5) = -11.512925464970229
    log(1e-6) = -13.815510557964274  # Should be bigger not smaller
Log Diff:
    >>> np.log(0.00000001) - np.log(0.1)
    -16.11809565095832
    >>> np.log(0.0000000000000000001) - np.log(0.1)  # Should be bigger not smaller
    -41.44653167389283
    >>> np.log(0.0000000000000000001) - np.log(0.0000001)
    -27.63102111592855
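A minimal sketch of what the proposed negation would do, assuming the reward is computed from a positive value where lower is better. The helper names `negated_log_reward` and `negated_log_diff_reward` are hypothetical, chosen for illustration, and are not DACBench functions.

```python
import numpy as np


def negated_log_reward(value):
    # Hypothetical helper, not part of DACBench.
    # Negating the log makes a smaller positive value give a larger reward.
    return -np.log(value)


def negated_log_diff_reward(previous, current):
    # Hypothetical helper, not part of DACBench.
    # Negated difference of logs: a larger drop from `previous` to `current`
    # gives a larger reward.
    return -(np.log(current) - np.log(previous))


if __name__ == "__main__":
    for value in (1e-5, 1e-6):
        print(value, negated_log_reward(value))
    for current in (1e-8, 1e-19):
        print(current, negated_log_diff_reward(0.1, current))
```

With the sign flipped, 1e-6 scores higher than 1e-5, and the drop from 0.1 to 1e-19 scores higher than the drop from 0.1 to 1e-8, which is the ordering asked for above.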
Fixed: https://github.com/automl/DACBench/commit/f511030cf4381cfa772f91405a2e1bf89e50db0f