facebookresearch / hanabi_SAD

Simplified Action Decoder for Deep Multi-Agent Reinforcement Learning

greedy_extra argument #2

Closed alexis-jacq closed 4 years ago

alexis-jacq commented 4 years ago

Hi, I am trying to understand the difference between VDN and SAD.

From the paper, I understand that SAD shares the same architecture as VDN, but during centralized training it is given the greedy actions of the other agents as additional inputs, instead of the actions actually taken in the environment (as in VDN).

I deduced that in this code, greedy_extra is the key argument that decides whether training follows the VDN or the SAD approach. Is this right?

hengyuan-hu commented 4 years ago

Yes, you are right.
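
In other words, the only difference is which teammate action gets appended to each agent's network input. A minimal conceptual sketch of that distinction (this is not the repo's actual code; the tensor names and the helper function here are hypothetical placeholders):

```python
import torch
import torch.nn.functional as F

def build_agent_input(obs, env_action, greedy_action, num_actions, greedy_extra):
    """Append a one-hot encoding of a teammate's action to an observation.

    VDN (greedy_extra=False): append the action actually executed in the
        environment, which may be an exploratory (epsilon-greedy) action.
    SAD (greedy_extra=True):  append the teammate's greedy action instead,
        so the extra input carries an unambiguous signal even when the
        executed action was exploratory.
    """
    action = greedy_action if greedy_extra else env_action
    action_onehot = F.one_hot(action, num_classes=num_actions).float()
    return torch.cat([obs, action_onehot], dim=-1)
```

So flipping greedy_extra leaves the architecture and the VDN value decomposition untouched; it only changes which action is encoded into the input during centralized training.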
