Closed Kallinteris-Andreas closed 6 months ago
@pseudo-rnd-thoughts regarding Critique 1
1) is it a good idea to change the default exact
argument of data_equivilance
to True
?
2) or change the make function call to include exact=True
(at least inside env_checker.py
)?
Sorry I didn't respond when this was originally posted.
I'm worried about using allequal
rather than allclose
due to float point numbers particularly for robotics environments.
An alternative solution for check_env
is to use both allclose
and allequal
.
For a given observation, we first check if allequal
, if False then run allclose
, if all close fails then we raise an error about that otherwise we just log to the users the non-equalness as an environment can be deterministic to e-8 but not equivalent.
I know from talking to the Mujoco devs that due to physics changes / optimisations that mujoco versions are not deterministic and can produce different observations however within e-10 or so I believe.
1) there is no point in running allequal
and then if it fails allclose
, this is effectively the same as running allclose
2) for a given version of the mujuco
library, the behavior will be 100% exact (unless we or the mujoco
team make a mistake)
Keep in mind checking floats for equality is always asking for trouble. Float operations are not associative, and strictly speaking, it's not even true that a == a
(nan) or a + b == a + b
(precision fuckery).
If there's any GPU involved (which might be the case even in mujoco due to the brax stuff?), nothing is deterministic at that point due to the parallel execution + nonassociativity.
Some opt-in flag to enable "strict" mode might be alright, but it's still a bit of a footgun threat.
You make a good point we should not change the default Behaviour of data_eqivilence
but for check_env I believe we should use exact=true.
there is no point in running allequal and then if it fails allclose, this is effectively the same as running allclose
My suggestion would run allequal
first then allclose
so there are cases where the first would fail but the second won't
@pseudo-rnd-thoughts
do we agree that:
allclose(DATA) == allequal(DATA) or allclose(DATA)
Yes, but my suggestion is to run allequal
first and raise only a warning if false, only then to run allclose
as a way of detecting issues
I do not see the benefit to that. since check_env
makes operations deterministically.
Proposal
Hey, one issue in Gymansium-Robotics was discovered, which could have been avoided if
check_env
was improvedCritique 1
utils.check_env.data_equivilance
makesall_close
comparisons, instead of equality comparison, Line 57 at: https://github.com/Farama-Foundation/Gymnasium/blob/806d8afe9358d9ff2b092f4a4d7ccf0b320690a8/gymnasium/utils/env_checker.py#L33-L63proposed solution
add an
exact
argument todata_equivilance
and it should be used asTrue
whenenv.spec.nondeterministic=False
.Note: all
gymnasium
&gymnasium-robotics
environments pass withexact=True
Critique 2
It does not test that
env.reset()
properly resets the internal state, it just checks that initial observation is the sameproposed solution
for now, we can not have a solution that works across environments
Critique 3
It does test
env.step()
proposed solution
add a simple
check_step
test function, which assumes thatreset(seed)
is deterministic (tested before running this function) and asserts that step is deterministic.Motivation
No response
Pitch
No response
Alternatives
No response
Additional context
No response
Checklist