[feature] add SKRL example

edbeeching / godot_rl_agents

An Open Source package that allows video game creators, AI researchers and hobbyists the opportunity to learn complex behaviors for their Non Player Characters or agents

MIT License

974 stars, 70 forks

[feature] add SKRL example #138

Open edbeeching opened 1 year ago

edbeeching commented 1 year ago

The skrl lib looks interesting, particularly as they just added multi-agent support. It would be good to add support and an example.

https://skrl.readthedocs.io/en/latest/

Ivan-267 commented 1 year ago

Sounds interesting, especially the multi-agent support. It might need some plugin and env modifications to differentiate between agent types and, if we make an example using MAPPO, a suitable cooperative task idea.

Ivan-267 commented 1 year ago

https://github.com/edbeeching/godot_rl_agents/assets/61947090/bc4146c6-8e78-435b-b9a3-900d453f1b9f

Sharing a small test case from yesterday, based on the SKRL examples. It uses PPO with num_agents set to 1 and only a single observation space. It is not a proper or full implementation by any means, nor a test of learning performance, just a first attempt to get the framework to start training with a Godot-RL environment.

For a full implementation with multi-agents with separate policies, observations, etc., more changes are needed.
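To illustrate what "separate policies, observations, etc." could involve, here is a minimal sketch in plain Python. It does not use the SKRL or Godot-RL APIs. The policy names, the `act_per_policy` helper, and the toy policies are all hypothetical, and it assumes each agent reports which policy it belongs to.

```python
from typing import Callable, Dict, List, Sequence

Observation = List[float]
Action = int
Policy = Callable[[Observation], Action]


def act_per_policy(
    policy_names: Sequence[str],
    observations: Sequence[Observation],
    policies: Dict[str, Policy],
) -> List[Action]:
    """Route each agent's observation to the policy named for that agent.

    Actions are returned in the same order as the agents, so they can be
    sent back to the environment without reordering.
    """
    if len(policy_names) != len(observations):
        raise ValueError("need exactly one observation per agent")
    return [policies[name](obs) for name, obs in zip(policy_names, observations)]


# Hypothetical toy policies with different observation sizes,
# standing in for separately trained networks.
policies: Dict[str, Policy] = {
    "chaser": lambda obs: int(obs[0] > 0),  # expects 2 values
    "runner": lambda obs: len(obs),         # expects 3 values
}

policy_names = ["chaser", "runner", "chaser"]
observations = [[0.5, -1.0], [0.1, 0.2, 0.3], [-0.5, 2.0]]

actions = act_per_policy(policy_names, observations, policies)
print(actions)
```

In the single-policy test case above, this routing step is unnecessary because every agent shares one observation space and one policy.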

edbeeching commented 1 year ago

Cool. It seems their MARL support is quite limited at the moment. Perhaps we can focus on other things until they have better support?

Ivan-267 commented 1 year ago

Sure, we could work on a more complete implementation at any point. I'm asking because I'm not yet familiar with any MARL implementation and not sure what to expect: are there useful features that are not supported yet?

Regarding support on our side, I was only just starting to consider what might be needed. On the Godot side, the furthest I got was assigning a name to each AIController and then, in sync.gd, keeping a dictionary with each unique name as a key and all agent instances with that name as the value. Whether that's a good approach, and the other details, can be considered further when/if we start working on a more complete implementation.
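The name-to-instances dictionary described above could look roughly like this. It is written in Python for illustration rather than the GDScript used in sync.gd. `AgentStub`, `policy_name`, and `group_agents_by_name` are hypothetical names, not part of the existing plugin.

```python
from collections import defaultdict
from dataclasses import dataclass
from typing import Dict, Iterable, List


@dataclass
class AgentStub:
    """Hypothetical stand-in for one AIController instance."""

    policy_name: str  # assumed per-controller name, e.g. "striker"
    instance_id: int


def group_agents_by_name(agents: Iterable[AgentStub]) -> Dict[str, List[AgentStub]]:
    """Map each unique name to the agents sharing it, preserving input order."""
    groups: Dict[str, List[AgentStub]] = defaultdict(list)
    for agent in agents:
        groups[agent.policy_name].append(agent)
    return dict(groups)


agents = [
    AgentStub("striker", 0),
    AgentStub("goalie", 1),
    AgentStub("striker", 2),
]

groups = group_agents_by_name(agents)
for name, members in groups.items():
    print(name, [a.instance_id for a in members])
```

Keying by name would let the Python side report one observation and action space per group, with the number of instances in each group giving the batch size for that policy.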