facebookresearch / hanabi_SAD

Simplified Action Decoder for Deep Multi-Agent Reinforcement Learning

Why is the last action added twice to the observation? #33

Closed ravihammond closed 1 year ago

ravihammond commented 1 year ago

Hi Hengyuan!

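When you encode the observation for a SAD agent, I noticed that the last action is being added twice. It's first added here https://github.com/hengyuan-hu/hanabi-learning-environment/blob/273c5ca7f583f178c63693ab58161ca7887f3a89/hanabi_lib/canonical_encoders.cc#L678 when Encode() is initially called, and added a second time here https://github.com/facebookresearch/hanabi_SAD/blob/415804b531447bb4b8adb12100f994d588589cd8/cpp/hanabi_env.cc#L157 if the sad_ flag is set.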

Why are you adding the last action to the encoding twice? Many thanks in advance for your advice!

hengyuan-hu commented 1 year ago

That’s how SAD works: the second copy encodes the greedy action, so the partner is always told the agent's greedy action even when the agent is exploring.
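To make the answer concrete, here is a minimal Python sketch of the idea, not the repository's actual C++ code. It assumes epsilon-greedy exploration and a plain one-hot action encoding (the real encoder in `canonical_encoders.cc` encodes the last action with more features). The names `select_actions`, `encode_partner_observation`, and `num_actions` are hypothetical and chosen only for illustration.

```python
import random


def one_hot(index, size):
    """Return a one-hot list of length `size` with a 1.0 at `index`."""
    vec = [0.0] * size
    vec[index] = 1.0
    return vec


def select_actions(q_values, epsilon, rng):
    """Epsilon-greedy selection that also reports the greedy action.

    Returns (executed, greedy): `executed` is what is played in the
    environment and may be exploratory; `greedy` is the argmax action.
    """
    greedy = max(range(len(q_values)), key=lambda a: q_values[a])
    if rng.random() < epsilon:
        executed = rng.randrange(len(q_values))  # exploratory move
    else:
        executed = greedy
    return executed, greedy


def encode_partner_observation(base_obs, executed, greedy, num_actions, sad):
    """Append the last action to the observation (simplified encoding).

    The executed action is always appended. When `sad` is set, the greedy
    action is appended as a second action slot, which is why the last
    action appears to be encoded twice.
    """
    obs = list(base_obs) + one_hot(executed, num_actions)
    if sad:
        obs += one_hot(greedy, num_actions)
    return obs
```

In this sketch the two action slots differ only when the agent explored. With `epsilon=0.0` the executed and greedy actions coincide, so both slots carry the same encoding.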

ravihammond commented 1 year ago

Ah yes, makes sense. Thanks for that!