edbeeching / godot_rl_agents

An Open Source package that allows video game creators, AI researchers and hobbyists the opportunity to learn complex behaviors for their Non Player Characters or agents
MIT License

fix dtype of image data #158

Closed Ivan-267 closed 7 months ago

Ivan-267 commented 9 months ago

Fixes the handling of image data received from Godot, which was causing errors such as: https://github.com/edbeeching/godot_rl_agents_examples/pull/14#issuecomment-1837311171

Tested with Godot 4.2, training with gdrl.

With this and the plugin-side fix, training should work.

Exporting to ONNX is currently not supported. We could look into whether and how this can be addressed in future updates (it may need the correct preprocessing). Mixing image and vector data (multiple spaces) is untested.

Ivan-267 commented 9 months ago

I've also removed the conversion to float, since keeping the data as uint8 may activate sb3's image normalization, which could result in better training (I haven't checked yet whether the space dtype needs to change elsewhere):

https://github.com/DLR-RM/stable-baselines3/blob/master/stable_baselines3/common/preprocessing.py#L45

This works with sb3 using gdrl, but I haven't tested the other frameworks.
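As a minimal sketch of the idea (not the code from this PR), the snippet below decodes raw pixel bytes straight into a uint8 array instead of casting them to float. The helper name `decode_image_obs` and the hex-string input are assumptions made for illustration; the actual wire format used between Godot and Python is not described in this thread.

```python
import numpy as np


def decode_image_obs(hex_string: str, shape: tuple) -> np.ndarray:
    """Hypothetical helper: turn hex-encoded pixel bytes into a uint8 image array."""
    raw = bytes.fromhex(hex_string)
    # Keep the pixels as uint8 rather than converting to float, so that
    # downstream code can still recognise the observation as image data.
    return np.frombuffer(raw, dtype=np.uint8).reshape(shape)


# A 2x2 single-channel image with pixel values 0, 127, 128 and 255.
obs = decode_image_obs("007f80ff", (2, 2, 1))
print(obs.dtype, obs.shape)
```

Casting with `obs.astype(np.float32)` would keep the same values but drop the uint8 dtype that the linked sb3 preprocessing code looks at.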

Ivan-267 commented 8 months ago

While this seems to work, I'll look into it a bit more. It would be helpful if sb3's image normalization also gets applied (if it isn't currently).

This is possibly / probably why it was float initially:

https://github.com/edbeeching/godot_rl_agents/blob/55e375be0d8a2f2c90f371a5347b285ec6f2abf8/godot_rl/core/godot_env.py#L336-L342

Maybe we need to add an exception for the images.

Gymnasium box does support integers (from https://gymnasium.farama.org/api/spaces/fundamental/#box):

dtype – The dtype of the elements of the space. If this is an integer type, the Box is essentially a discrete space.

So maybe we could use a Box from 0 to 255 if the name of the space includes "2d" (as is currently done for parsing the image obs)? This is undocumented for now, but we can document this behavior at some point along with other tips for using sensors.

The sb3 code I linked before appears to check first for a Box space and then for a dtype of np.uint8.

However, I'm not sure if this would break something for any of the frameworks that we're using.
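The proposal above could be sketched roughly as follows. This is an illustration under stated assumptions, not the library's implementation: `box_params_for` and `looks_like_image` are hypothetical names, the float32 default for non-image observations is assumed, and `looks_like_image` is only a simplified stand-in for the kind of check described, not sb3's actual function.

```python
import numpy as np


def box_params_for(key: str, shape: tuple) -> dict:
    """Hypothetical helper: pick Box-style parameters from the observation key."""
    if "2d" in key:
        # Image observation: integer pixel values in [0, 255].
        return {"low": 0, "high": 255, "shape": shape, "dtype": np.uint8}
    # Assumed default for everything else: an unbounded float vector.
    return {"low": -np.inf, "high": np.inf, "shape": shape, "dtype": np.float32}


def looks_like_image(params: dict) -> bool:
    """Simplified stand-in for an image-space check (dtype and bounds only)."""
    return (
        params["dtype"] == np.uint8
        and params["low"] == 0
        and params["high"] == 255
    )


camera = box_params_for("camera_2d", (36, 36, 3))
vector = box_params_for("obs", (4,))
print(looks_like_image(camera), looks_like_image(vector))
```

The returned dictionaries use the same keyword names as `gymnasium.spaces.Box` (`low`, `high`, `shape`, `dtype`), so they could in principle be passed on as `Box(**params)`.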

Ivan-267 commented 8 months ago

I did some testing, and it seems that after the last commit, SB3 will use a CNN for images (and since it accepts the input as an image, it will likely normalize it). This now works both with MultiInputPolicy and with CnnPolicy, if SBGSingleObsEnv with the obs_key set to "camera_2d" is used for the latter.

The space follows the requirement from sb3 custom environment docs:

If you are using images as input, the observation must be of type np.uint8 and be contained in [0, 255]. By default, the observation is normalized by SB3 pre-processing (dividing by 255 to have values in [0, 1]) when using CNN policies. Images can be either channel-first or channel-last.
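The normalization mentioned in that quote amounts to dividing the uint8 pixel values by 255. The function below is a hand-written illustration of that arithmetic, assuming a hypothetical name `normalize_image`; it is not SB3's own preprocessing code.

```python
import numpy as np


def normalize_image(obs: np.ndarray) -> np.ndarray:
    """Illustrative only: scale uint8 pixels from [0, 255] to floats in [0, 1]."""
    if obs.dtype != np.uint8:
        raise TypeError(f"expected uint8 image data, got {obs.dtype}")
    return obs.astype(np.float32) / 255.0


pixels = np.array([[0, 51, 255]], dtype=np.uint8)
scaled = normalize_image(pixels)
print(scaled)
```

If the observation were already float, there would be no reliable way to tell whether it still needs scaling, which is one reason to keep the image data as uint8 until this step.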

However, there was an error with the virtual camera example, due to the size of the image. 36 x 36 seems to be the minimum supported size by default. Sb3 has a useful env checker for these things; however, we can't directly apply it to our env since it doesn't inherit from the gymnasium.Env class that the checker expects.

Relevant quotes from the code:

    if observation_space.shape[non_channel_idx] < 36 or observation_space.shape[1] < 36:
        warnings.warn(
            "The minimal resolution for an image is 36x36 for the default `CnnPolicy`. "
            "You might need to use a custom features extractor "
            "cf. https://stable-baselines3.readthedocs.io/en/master/guide/custom_policy.html"
        )

    assert isinstance(
        env, gym.Env
    ), "Your environment must inherit from the gymnasium.Env class cf. https://gymnasium.farama.org/api/env/"

Since we don't have a warning here, I experimented with adding some changes to the plugin instead (to assert that the resolution is at or above the minimum expected by the default CNN).

Here is the relevant plugin PR: https://github.com/edbeeching/godot_rl_agents_plugin/pull/31
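The kind of resolution check described above can be sketched as follows. The plugin itself is written in GDScript, so this Python version is only an illustration of the logic; `check_image_resolution` and `MIN_CNN_RESOLUTION` are hypothetical names, and the value 36 is taken from the warning message quoted earlier.

```python
MIN_CNN_RESOLUTION = 36  # minimum from the quoted warning for the default CnnPolicy


def check_image_resolution(width: int, height: int,
                           minimum: int = MIN_CNN_RESOLUTION) -> None:
    """Illustrative check: fail early if the image is too small for the default CNN."""
    if width < minimum or height < minimum:
        raise ValueError(
            f"Image resolution {width}x{height} is below the minimum "
            f"{minimum}x{minimum} expected by the default CNN."
        )


check_image_resolution(36, 36)  # passes silently

try:
    check_image_resolution(32, 32)
except ValueError as err:
    print(err)
```

Failing at setup time gives a clearer message than the shape error that would otherwise surface inside the network.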

Ivan-267 commented 7 months ago

I will merge this now that I've merged the plugin update, so that the CNN works. If any issues appear, we can address them in a future PR.