shariqiqbal2810 / maddpg-pytorch

PyTorch Implementation of MADDPG (Lowe et al., 2017)
MIT License

Implementation request: simple_reference & move-and-communicate ability? #37

Open tessavdheiden opened 3 years ago

tessavdheiden commented 3 years ago

Hi Shariq,

You've (unintentionally?) shut off shared_reward, which is why simple_reference, in which agents both move and speak, will not work:

self.shared_reward = world.collaborative if hasattr(world, 'collaborative') else False

Moreover, it requires actions to be an instance of the MultiDiscrete class from the gym.spaces library. Maybe you need to use this as well (in environment.py):

if all([isinstance(act_space, spaces.Discrete) for act_space in total_action_space]):
    act_space = spaces.MultiDiscrete([act_space.n for act_space in total_action_space])

Shall I make a pull request with the changes?
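For context, a minimal sketch of the gym.spaces.MultiDiscrete API being proposed here (the 5-way movement / 10-symbol communication sizes are only illustrative, not taken from the repo):

from gym import spaces

move_space = spaces.Discrete(5)    # e.g. no-op + 4 movement directions
comm_space = spaces.Discrete(10)   # e.g. 10 discrete communication symbols

# one MultiDiscrete space with a sub-action per component
act_space = spaces.MultiDiscrete([move_space.n, comm_space.n])
print(act_space.sample())          # e.g. array([3, 7])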

shariqiqbal2810 commented 3 years ago

Hi Tessa,

There is a bit more work that would go into supporting MultiDiscrete action spaces. Specifically, the policy architecture would need a separate head for each discrete sub-action, and all the places in the training code that deal with actions would likely have to be modified to support it. If you'd like to do so, then I'm happy to accept a pull request!
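For illustration, a minimal sketch of the multi-head policy idea described above (hypothetical code, not from this repo; the class name, layer sizes, and MLP layout are assumptions):

import torch
import torch.nn as nn

class MultiHeadDiscretePolicy(nn.Module):
    # One output head per discrete sub-action of a MultiDiscrete space.
    def __init__(self, obs_dim, sub_action_dims, hidden_dim=64):
        super().__init__()
        self.trunk = nn.Sequential(
            nn.Linear(obs_dim, hidden_dim), nn.ReLU(),
            nn.Linear(hidden_dim, hidden_dim), nn.ReLU(),
        )
        # e.g. sub_action_dims = [5, 10] for movement + communication
        self.heads = nn.ModuleList(
            [nn.Linear(hidden_dim, n) for n in sub_action_dims]
        )

    def forward(self, obs):
        h = self.trunk(obs)
        # one logit vector per sub-action; the training code would then
        # sample from each head (e.g. via Gumbel-Softmax, as MADDPG does
        # for discrete actions) and concatenate the results
        return [head(h) for head in self.heads]

Usage would look like logits_move, logits_comm = MultiHeadDiscretePolicy(obs_dim=21, sub_action_dims=[5, 10])(torch.randn(1, 21)), where obs_dim=21 is likewise just a placeholder.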

tessavdheiden commented 3 years ago

Hi Shariq,

The easiest solution is to allow either a communication action or a movement action at each time step, but not both. That's how I've implemented it now; both are discrete. A rough sketch of this workaround follows below.
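Sketched out, the workaround amounts to one flat Discrete space covering both action types, decoded at step time (hypothetical code; the sizes and the decode helper are illustrative):

from gym import spaces

n_move, n_comm = 5, 10
# indices [0, n_move) select a movement action,
# indices [n_move, n_move + n_comm) select a communication symbol
act_space = spaces.Discrete(n_move + n_comm)

def decode(a):
    # returns (move_action, comm_action); the unused slot is None
    return (a, None) if a < n_move else (None, a - n_move)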

I think it would be better to have separate heads; that would also support continuous and discrete actions.

But that would require a proper software design decision. If you want, we can brainstorm about it?

I also understand that you are busy and not working on the project anymore ;-).

I'm now working with Wendelin; I think you two know each other?

I'd like to have a video call. Would you?

Best, Tessa


shariqiqbal2810 commented 3 years ago

Unfortunately, I don't really have any bandwidth to work on this anymore, as I wrote this code over 3 years ago haha. Glad to hear you're working with Wendelin! He's a great collaborator :)