Open ronTohar1 opened 1 year ago
I just committed some changes, please let me know if it has resolved the issue, if not, please give me a minimal minimal reproducible example of the issue :)
It did fix my problem, but I did have to do some changes to the step function (regardless of the current new commit) to make the code work.
I changed the step()
function
from
def step(self, raw_action: Union[int, tuple]) -> Tuple[np.ndarray, int, bool, dict]:
if isinstance(raw_action, int):
To
def step(self, raw_action: Union[int, tuple]) -> Tuple[np.ndarray, int, bool, dict]:
if isinstance(raw_action, int) or isinstance(raw_action, np.int32) or isinstance(raw_action, np.int64):
This change had been made way before the aforementioned problem arose.
For example, the gym environment checker didn't pass before the change to the step()
function, as well as the following piece of code:
import gymnasium
import gym_battleship
env = gymnasium.make("GymV21Environment-v0", env_id="Battleship-v0")
obs,_ = env.reset()
done = False
while not done:
action = env.action_space.sample()
obs, reward, terminated, truncated, info = env.step(action)
done = terminated or truncated
env.render()
This happened because for some reason, sample()
of the action apce gave an numpy.int64 value. (This error is also the reason the env_checker didn't 'approve' the environment)
P.S I just can't understand why you'd put an np.float32 in the dtype as it is only integers, but I guess it doesn't make any difference.
Ok, I'll take a closer look for fixing the code when I have time
Please let me know if there are any other thing not so right that you notice in the code :)
For the observation space, I remember that there were a clear reason why I did that and I was quite upset to have to switch to float instead of int. However I can't remember now. I may switch back to int and see if someone experience an issue.
Thank you for your message :)
hey, I noticed there is a problem in battleships.py where the observation space is of type
int
and the returned observation in reset or step are of type np.float32. This causes a problem with env_checker of gym of course and therefore also with the use of RL libraries such as SB3 as they use this information and it creates conflicts.