Open vinerich opened 3 years ago
Hmm, it seems this should not happen in the latest version of scipy, as discussed in https://github.com/scipy/scipy/issues/11403 and https://github.com/scipy/scipy/issues/13277.
Please tell me the following: which versions of scipy and numpy are you using? In my case, scipy==1.6.3 and numpy==1.20.3. Does it also occur with the LunarLanderContinuous-v2 example? https://github.com/vinerich/mpo
I think if you want to clamp, this should be like np.clip(self.η, -1e-6, 1e-6), not np.max([self.η,1e-6]).
I was wrong, np.max([self.η,1e-6]) makes sense.
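To make the difference between the two expressions concrete, here is a minimal sketch. The helper names clamp_lower and clamp_symmetric are made up for illustration and are not part of the mpo repository; the value 1e-6 is the one used in this thread.

```python
import numpy as np

LOWER = 1e-6  # lower bound for η used in this thread


def clamp_lower(eta, lower=LOWER):
    """Clamp only from below; same result as np.max([eta, lower])."""
    return float(np.maximum(eta, lower))


def clamp_symmetric(eta, limit=LOWER):
    """Clamp into [-limit, limit]; same result as np.clip(eta, -limit, limit)."""
    return float(np.clip(eta, -limit, limit))


if __name__ == "__main__":
    for eta in (-3.0, 0.0, 5.0):
        # The one-sided clamp keeps large values; the symmetric clip does not.
        print(eta, clamp_lower(eta), clamp_symmetric(eta))
```

The symmetric clip squeezes every value into a tiny interval around zero, which is why the one-sided np.max form is the one that matches a lower bound of 1e-6.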
scipy=1.6.3 numpy=1.20.2
I looked over both issues mentioned, and I frequently see the warning described in https://github.com/scipy/scipy/issues/13277, so that part seems to be working.
But sometimes it still gives me the above error. Sadly, I can't reproduce it, as I didn't have proper logging set up and it only occurs roughly once every ~4 million timesteps.
I will check LunarLanderContinuous-v2, let it run for a day or so, and report back.
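For a failure this rare, one option is to wrap the optimisation call so that the exact inputs are logged when the ValueError is raised. The following is only a sketch: solve_with_logging, the logger name, and the solve callable are hypothetical and stand in for whatever closure wraps the scipy.optimize.minimize call in the training loop.

```python
import logging

logger = logging.getLogger("mpo.dual")  # hypothetical logger name


def solve_with_logging(solve, eta, lower=1e-6):
    """Call solve(eta); on ValueError, log the inputs and re-raise.

    `solve` is assumed to be a callable that runs the dual optimisation
    starting from `eta`. %r is used so the full float precision is logged,
    which matters when a value is off by only a rounding error.
    """
    try:
        return solve(eta)
    except ValueError:
        logger.exception(
            "dual optimisation failed: eta=%r lower=%r", eta, lower
        )
        raise
```

Re-raising keeps the training behaviour unchanged; the only difference is that the failing start value ends up in the log together with the traceback.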
Hey again.
While evaluating MPO I got some strange raise ValueError("`x0` violates bound constraints.") errors. They originate in this line. However, I have now implemented a "clamping" with np.max([self.η,1e-6]). According to their code to check for the bound constraints, this should be totally fine. But I keep getting this error from time to time, and the training for the algorithm completely stops as it errors out.
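The reasoning above can be checked with a small sketch. It assumes the bound check is a plain elementwise comparison of x0 against the lower and upper bounds; violates_bounds is a hypothetical helper written for this illustration, not scipy's own function.

```python
import numpy as np


def violates_bounds(x0, lb, ub):
    """Return True if any component of x0 lies strictly outside [lb, ub].

    Assumption: this mirrors the kind of check that raises the
    "`x0` violates bound constraints." error; it is not scipy's code.
    """
    x0 = np.atleast_1d(np.asarray(x0, dtype=float))
    return bool(np.any((x0 < lb) | (x0 > ub)))


if __name__ == "__main__":
    lb, ub = 1e-6, np.inf
    clamped = np.max([-0.5, 1e-6])            # clamped start value
    print(violates_bounds(clamped, lb, ub))   # clamped value sits on the bound
    print(violates_bounds(1e-6 - 1e-12, lb, ub))  # value just below the bound
```

Under this assumption a clamped start value never trips the check, so the error would have to come from a value that ends up slightly below the bound at some other point of the optimisation. That is a guess and not something confirmed here.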
Lines from the corresponding file scipy/optimize/_numdiff.py:
Bounds are prepared like this:
Any idea about this? It is not really an algorithm-related question, but it seems strange to me.