Farama-Foundation / Gymnasium

An API standard for single-agent reinforcement learning environments, with popular reference environments and related utilities (formerly Gym)
https://gymnasium.farama.org
MIT License
6.34k stars 731 forks source link

[Bug Report] Box casting during init can cause invalid sampled values #768

Closed Jammf closed 4 months ago

Jammf commented 8 months ago

Describe the bug

During init, Box bounds are captured before casting to the destination dtype, but aren't checked again after the cast.

When Box casts to a lower-precision float with a range smaller than the source dtype, for example Box(0, 80000, dtype=np.float16), this can cause high and low to contain np.inf values without the corresponding bounded_above or bounded_below being False. This leads to an error when calling sample().

When Box casts np.inf from a float to an int, for example Box(255, np.inf, dtype=np.uint8), samples can under/overflow and be outside the Box bounds -- #328 seems to be related to this one.

Code example

>>> from gymnasium.spaces import Box
>>> import numpy as np
>>> np.finfo(np.float16).max
65500.0
>>> b = Box(0, 80000, dtype=np.float16)
<snip>/python3.11/site-packages/numpy/core/numeric.py:330: RuntimeWarning: overflow encountered in cast
  multiarray.copyto(a, fill_value, casting='unsafe')
>>> b
Box(0.0, inf, (1,), float16)
>>> b.bounded_below
array([ True])
>>> b.bounded_above
array([ True])
>>> b.sample()
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "<snip>/python3.11/site-packages/gymnasium/spaces/box.py", line 228, in sample
    sample[bounded] = self.np_random.uniform(
                      ^^^^^^^^^^^^^^^^^^^^^^^
  File "_generator.pyx", line 1043, in numpy.random._generator.Generator.uniform
OverflowError: Range exceeds valid bounds
>>>
>>>
>>> ### New session
>>> 
>>> from gymnasium.spaces import Box
>>> import numpy as np
>>> b = Box(255, np.inf, dtype=np.uint8)
<snip>/python3.11/site-packages/numpy/core/numeric.py:330: RuntimeWarning: overflow encountered in cast
  multiarray.copyto(a, fill_value, casting='unsafe')
>>> b
Box(255, 255, (1,), uint8)
>>> b.bounded_below
array([ True])
>>> b.bounded_above
array([False])
>>> b.sample()
array([2], dtype=uint8)  # overflow!

System info

Additional context

No response

Checklist

Kallinteris-Andreas commented 8 months ago

@Jammf can you make a PR https://github.com/Farama-Foundation/Gymnasium/blob/main/gymnasium/spaces/box.py#L55

pseudo-rnd-thoughts commented 8 months ago

My intuition is to raise an error within __init__ if either the low or high are outside of the dtype's possible values. @RedTachyon @Jammf Do you agree? If so, could we make a PR that adds this?

Jammf commented 8 months ago

Sure, I can make a PR. I think raising an error makes sense, but it'd also mean that integer boxes could no longer be unbounded (since ints can't be inf). Is that what we'd want?

Also, is that case, it seems like it'd have a lot of overlap with a MultiDiscrete space, so I'd wonder what the benefit of having both integer Box and MultiDiscrete would be (and the same with binary Box and MultiBinary). But, Box does also have a nice way to specify identical limits for every axis, so it has that going for it.

pseudo-rnd-thoughts commented 8 months ago

We could treat inf as a special case to ignore as the max value to test are within the dtype's bound I think is the simplest approach to perverse backward compatibility

pseudo-rnd-thoughts commented 4 months ago

Fixed in #774