Feature request: RL libraries integration

elliottower commented 1 year ago

Hi, I was wondering if it would be possible to integrate ChatArena with something like Gymnasium or PettingZoo (the multi-agent version of Gym). It would be very interesting to see LLMs used both for language envs and for regular envs like poker, Connect Four, etc.

yuxiang-wu commented 1 year ago

Thank you for the feature request. At the moment (Apr 2023), the ChatArena dev team is focused on providing a diverse set of language game environments and LLM backends. Since our abstractions and APIs are similar to those of gymnasium and pettingzoo, users could in theory use their favourite RL libraries to train their models in ChatArena (for example, by using the Arena.save_history method to export the game rollout for offline RL).

We welcome community contributions to support RL library integration :)
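
As a rough illustration of the offline-RL idea above, here is a minimal sketch that turns an exported rollout into (observation, action) pairs for one player. The record layout (`agent_name`, `content`) and the `to_transitions` helper are assumptions made for this example; the actual output of `Arena.save_history` may look different and should be checked before relying on this. Rewards are left out because nothing in this thread says how they would be recorded.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class Transition:
    observation: str  # transcript the player could see before acting
    action: str       # the message the player sent


def to_transitions(history: List[Dict[str, str]], player: str) -> List[Transition]:
    """Convert a list of message records into per-player transitions.

    Assumes each record has 'agent_name' and 'content' keys (hypothetical layout).
    """
    transitions: List[Transition] = []
    transcript: List[str] = []
    for record in history:
        if record["agent_name"] == player:
            # The observation is everything said before this message.
            transitions.append(Transition("\n".join(transcript), record["content"]))
        transcript.append(f"{record['agent_name']}: {record['content']}")
    return transitions


history = [
    {"agent_name": "Alice", "content": "hi"},
    {"agent_name": "Bob", "content": "hello"},
    {"agent_name": "Alice", "content": "bye"},
]
alice_data = to_transitions(history, "Alice")
```

A dataset like `alice_data` could then be fed to whichever offline RL or imitation-learning library the user prefers, which keeps the environment decoupled from any one framework.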

elliottower commented 1 year ago

Thanks for the response. I’d love to help out and adapt the code to use pettingzoo, or at least write a wrapper class to make it compatible. I think it would be really powerful to be able to use RL frameworks like RLlib directly with the environments.

Would one of those options be acceptable? My thinking is the lower level API integration is better than a specific training library, but I’m not sure if changing the underlying logic would be an issue for future compatibility, so maybe it’s best to do a wrapper?

yuxiang-wu commented 1 year ago

Thanks for offering to contribute! I agree that it would be really powerful if users could do RL training with ChatArena. But I think it is also a good idea to decouple the environment from RL frameworks, so that everyone can use their favourite RL library (CleanRL, Tianshou, RLlib, etc.) or their own RL implementation.

Maybe a first step is to provide a notebook or tutorial (like this one from pettingzoo) to showcase how to use the environment with other RL packages. For further integration, a wrapper may be a good idea.

elliottower commented 1 year ago

Makes sense. Creating a tutorial like that would require a wrapper class to adapt the API, but I can see how it would be nice to show a full example so users can get the most out of it. Would there be any interest in a conversation wrapper for this library being included in Shimmy? We have made wrappers for popular external RL envs like DeepMind Control, and would love to have another multi-agent env with such a unique use case. But maybe it would be better to integrate directly into this codebase with a PR, so that the integration and tutorials can be tested and backwards compatibility can be preserved. We don’t currently offer training tutorials in Shimmy either, so those would need to live in this repository.

yuxiang-wu commented 1 year ago

Both having a wrapper in Shimmy and integrating into this codebase with a PR make perfect sense. The latter would offer better backward compatibility, and since we are planning to add more environments to the library, we have more interest in the latter. But if you find writing a wrapper for this library in Shimmy easier, we encourage you to go for it. :)

elliottower commented 1 year ago

Is this already being worked on internally? I saw an example using pettingzoo chess but don’t see pettingzoo or gymnasium appearing anywhere in the repo.

If it’s not being done already I’d love to help out, I was wondering if there’s a development discord or slack? Otherwise I can put questions here. Would like to make sure I’m implementing it in a way that makes sense to you guys.

Apologies for the delay, I’ve been working on releases for other libraries the last few weeks.

yuxiang-wu commented 1 year ago

Thank you for getting back!

Regarding the first question: We ported the chess environment from Pettingzoo here: https://github.com/chatarena/chatarena/blob/main/chatarena/environments/pettingzoo_chess.py
We created a slack channel here: https://join.slack.com/t/chatarena/shared_invite/zt-1t5fpbiep-CbKucEHdJ5YeDLEpKWxDOg

I am happy to discuss here or in Slack to push forward development in this direction.

elliottower commented 1 year ago

Finally got a chance to look through the code to see how difficult it would be to make a PettingZoo wrapper around an entire ChatArena environment. (A general utility for loading PettingZoo or, for example, OpenSpiel text board game environments into ChatArena would be very fruitful as well, letting LLMs play battleship or simple poker, etc.)

It looks like the step() calls are very similar, the messages could be included in the observations, and backends and configs could be environment parameters. The parallel ticker on the Hugging Face demo seems very similar to the parallel environment type we have in PettingZoo, and the handling of agents could definitely be done with PettingZoo as well. Gymnasium has a Text action/observation space type which can be used for variable-length strings such as the messages, so I think everything should be possible.

The codebase could be refactored to use PettingZoo as a framework to handle the different agents, but I don’t want to start working on that before getting confirmation, as obviously it would be a sizable code modification. Alternatively, a wrapper could be written first which makes it compatible using the same underlying codebase as is currently used.
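
To make the wrapper option a bit more concrete, here is a minimal sketch of what an adapter could look like. It mirrors the method names of PettingZoo's AEC API (`reset`, `step`, `observe`, `last`, `agent_selection`) but deliberately does not subclass the real `AECEnv` or touch any ChatArena class, so it stays self-contained. The class name `TurnBasedChatWrapper`, the `max_turns` cutoff and the `reward_fn` hook are all hypothetical, chosen only for illustration.

```python
from typing import Callable, Dict, List, Optional, Tuple

Message = Tuple[str, str]  # (speaker name, message text)


class TurnBasedChatWrapper:
    """AEC-style adapter sketch: observations and actions are plain strings."""

    def __init__(
        self,
        agents: List[str],
        max_turns: int = 4,
        reward_fn: Optional[Callable[[str, List[Message]], float]] = None,
    ) -> None:
        self.possible_agents = list(agents)
        self.max_turns = max_turns
        # Default reward is always zero; a real game would supply its own.
        self.reward_fn = reward_fn or (lambda agent, messages: 0.0)
        self.reset()

    def reset(self) -> None:
        self.agents = list(self.possible_agents)
        self.messages: List[Message] = []
        self._turn = 0
        self.agent_selection = self.agents[0]

    def observe(self, agent: str) -> str:
        # Every agent sees the full transcript in this simplified sketch.
        return "\n".join(f"{name}: {text}" for name, text in self.messages)

    def last(self) -> Tuple[str, float, bool, bool, Dict]:
        # Same tuple order as PettingZoo's AEC `last()`:
        # observation, reward, termination, truncation, info.
        agent = self.agent_selection
        terminated = self._turn >= self.max_turns
        return self.observe(agent), self.reward_fn(agent, self.messages), terminated, False, {}

    def step(self, action: str) -> None:
        if self._turn >= self.max_turns:
            return  # game is over; ignore further actions
        self.messages.append((self.agent_selection, action))
        self._turn += 1
        self.agent_selection = self.agents[self._turn % len(self.agents)]


env = TurnBasedChatWrapper(["player_0", "player_1"], max_turns=2)
env.step("Hello there")
env.step("Hi, shall we start?")
observation, reward, terminated, truncated, info = env.last()
```

In an actual integration the message list, turn order and termination logic would come from the underlying ChatArena environment rather than being tracked in the wrapper itself.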

elliottower commented 1 year ago

I’m also curious if there are any plans to add langchain support as a backend, to allow users to build on top of models and test more complex chat bots for example.

yuxiang-wu commented 1 year ago

We plan to support a langchain backend. We will release our dev roadmap soon.

It would be great if you could make a PettingZoo wrapper so that PettingZoo environments can be loaded as ChatArena environments. One challenge we found is that we have to manually translate the game state and action space into natural language for the LLM backends in ChatArena, and vice versa. I am not sure whether there is a more elegant way to do this, so that we could create a general "translation" layer between the two.
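
One possible shape for such a "translation" layer is a small per-game interface with two methods: one that renders the state as text for the LLM, and one that parses the LLM's reply back into a discrete action. The sketch below is only an illustration under assumptions: the `Translator` protocol, the `ConnectFourTranslator` class and the board encoding (0 = empty, 1 = X, 2 = O) are hypothetical and not taken from ChatArena or PettingZoo.

```python
import re
from typing import List, Optional, Protocol


class Translator(Protocol):
    """Hypothetical interface between a board-game env and an LLM backend."""

    def state_to_text(self, state: List[List[int]]) -> str: ...

    def text_to_action(self, text: str, legal_actions: List[int]) -> Optional[int]: ...


class ConnectFourTranslator:
    SYMBOLS = {0: ".", 1: "X", 2: "O"}  # assumed cell encoding

    def state_to_text(self, state: List[List[int]]) -> str:
        # First line labels the columns, then one line per board row.
        header = " ".join(str(i) for i in range(len(state[0])))
        rows = [" ".join(self.SYMBOLS[cell] for cell in row) for row in state]
        return "\n".join([header] + rows)

    def text_to_action(self, text: str, legal_actions: List[int]) -> Optional[int]:
        # Pick the first number in the reply that is a legal column.
        for match in re.finditer(r"\d+", text):
            column = int(match.group())
            if column in legal_actions:
                return column
        return None  # caller decides how to handle an unparseable reply


translator: Translator = ConnectFourTranslator()
board = [[0] * 7 for _ in range(6)]
board[5][3] = 1  # X has played the middle column
prompt = translator.state_to_text(board)
action = translator.text_to_action("I'll drop my piece in column 3.", list(range(7)))
```

The per-game rendering still has to be written by hand, so this does not remove the manual work described above; it only isolates it behind one interface so that the rest of the wrapper can stay generic.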

Farama-Foundation / chatarena

Feature request: RL libraries integration #14