Closed bryanbocao closed 4 years ago
Hello,
According to the python reference, the precedence of the "not" operator is higher than the precedence of the "and" operator which is higher than the precedence of the "or" operator.
Which means that:
done = (not np.isfinite(ob).all()) or (np.abs(ob[1]) > .2)
is equivalent to:
done = not np.isfinite(ob).all() or np.abs(ob[1]) > .2
ie the line of code in the repo.
Also,
notdone = np.isfinite(ob).all() and (np.abs(ob[1]) <= .2)
done = not notdone
is equivalent to:
done = not (np.isfinite(ob).all() and (np.abs(ob[1]) <= .2))
not:
done = not np.isfinite(ob).all() and (np.abs(ob[1]) <= .2)
due to the precedence of "not" over "and", the parenthesis are required.
Hi, thanks for pointing out.
Yes, I agree on both arguments
Which means that:
done = (not np.isfinite(ob).all()) or (np.abs(ob[1]) > .2)
is equivalent to:done = not np.isfinite(ob).all() or np.abs(ob[1]) > .2
The reason why I retain the parenthesis in (not np.isfinite(ob).all())
and (np.abs(ob[1]) > .2)
is to make the logic clear with less confusion.
2.
notdone = np.isfinite(ob).all() and (np.abs(ob[1]) <= .2) done = not notdone
is equivalent to:
done = not (np.isfinite(ob).all() and (np.abs(ob[1]) <= .2))
not:
done = not np.isfinite(ob).all() and (np.abs(ob[1]) <= .2)
due to the precedence of "not" over "and", the parenthesis are required.
Thanks for pointing out my previous mistake in step1 but steps from step2 onwards are correct.
done = not (np.isfinite(ob).all() and (np.abs(ob[1]) <= .2))
is the same as my step2.
Thanks!
To conclude, the logic of https://github.com/benelot/pybullet-gym/blob/79396fccc1d0e170a092b50137148da48ee2edab/pybulletgym/envs/mujoco/envs/pendulum/inverted_pendulum_env.py#L32
is equivalent to that in the original OpenAI Gym.
Thanks for the efforts in making this repo!
Looks like the code was trying to combine these two lines of code from the original OpenAI Gym:
code0:
into one line of code
code1:
https://github.com/benelot/pybullet-gym/blob/79396fccc1d0e170a092b50137148da48ee2edab/pybulletgym/envs/mujoco/envs/pendulum/inverted_pendulum_env.py#L32
But if we follow the logic to convert code0 to code1-like version step by step, shouldn't it be:
=> step1
done = not np.isfinite(ob).all() and (np.abs(ob[1]) <= .2)=> step2
=> step3
=> step4
code2:
Note the difference between the precedences of bracket and 'not' in code1 and code2.Update on Y2020M06D08Mon, code1 is equivalent to code2.