rahafaljundi / MAS-Memory-Aware-Synapses

Memory Aware Synapses method implementation code

Weight difference not squared #3

Closed: gyglim closed this issue 5 years ago

gyglim commented 5 years ago

According to the paper, the weight difference is squared when computing the loss, cf. Eq. (3).

In the code, however, it looks like it's just the difference: https://github.com/rahafaljundi/MAS-Memory-Aware-Synapses/blob/c3e6a855cdde588fb74aeb876f84340eb6090ad5/MAS_to_be_published/MAS_utils/MAS_based_Training.py#L80

That would mean that a negative difference would lead to a negative penalty! Is that a bug or am I missing something?

rahafaljundi commented 5 years ago

Hi Michael,

The difference in my code is the gradient of the L2 penalty, not the penalty itself.

Best, Rahaf
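For readers following along, the point can be written out as a short derivation. This is a sketch that assumes the penalty has the form of Eq. (3) in the paper, with \lambda the regularization weight, \Omega_{ij} the importance of parameter \theta_{ij}, and \theta^{*}_{ij} its value after the previous task. Whether the constant factor 2 appears explicitly in the code or is folded into \lambda is not checked here.

```latex
% Loss with the squared penalty, as in Eq. (3):
L(\theta) = L_n(\theta) + \lambda \sum_{i,j} \Omega_{ij} \left(\theta_{ij} - \theta^{*}_{ij}\right)^{2}

% Gradient of the penalty term with respect to a single parameter:
\frac{\partial}{\partial \theta_{ij}} \, \lambda \sum_{k,l} \Omega_{kl} \left(\theta_{kl} - \theta^{*}_{kl}\right)^{2}
  = 2 \lambda \, \Omega_{ij} \left(\theta_{ij} - \theta^{*}_{ij}\right)
```

The gradient is linear in the unsquared difference, so it can be negative for a given parameter while the penalty itself stays non-negative.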

(Sent by email in reply to Michael Gygli's comment of Tue, 15 Oct 2019, 22:04.)

gyglim commented 5 years ago

Hi Rahaf

Ah, I see: you directly compute the gradient of the loss, not the loss itself. Got it :). Thanks for the clarification.

Cheers, Michael
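The resolution above can be checked numerically. The following is a minimal pure-Python sketch, not the repository's code: the function names (`mas_penalty`, `mas_penalty_grad`, `numerical_grad`) and the toy values are hypothetical, and the penalty is assumed to be the squared form from Eq. (3). It compares the analytic gradient, which uses the unsquared difference, against a finite-difference gradient of the squared penalty.

```python
def mas_penalty(theta, theta_star, omega, lam):
    """Squared penalty: lam * sum_i omega_i * (theta_i - theta_star_i)^2."""
    return lam * sum(o * (t - s) ** 2 for t, s, o in zip(theta, theta_star, omega))


def mas_penalty_grad(theta, theta_star, omega, lam):
    """Analytic gradient of the penalty: 2 * lam * omega_i * (theta_i - theta_star_i)."""
    return [2.0 * lam * o * (t - s) for t, s, o in zip(theta, theta_star, omega)]


def numerical_grad(f, theta, eps=1e-6):
    """Central finite-difference gradient of f at theta."""
    grads = []
    for i in range(len(theta)):
        up = list(theta)
        down = list(theta)
        up[i] += eps
        down[i] -= eps
        grads.append((f(up) - f(down)) / (2 * eps))
    return grads


# Toy values chosen for illustration only (hypothetical, not from the repository).
theta = [0.5, -1.0, 2.0]       # current weights
theta_star = [1.0, -1.5, 2.0]  # weights after the previous task
omega = [0.3, 1.2, 0.7]        # importance weights
lam = 1.0                      # regularization strength

penalty = mas_penalty(theta, theta_star, omega, lam)
analytic = mas_penalty_grad(theta, theta_star, omega, lam)
numeric = numerical_grad(lambda th: mas_penalty(th, theta_star, omega, lam), theta)

print("penalty:", penalty)
print("analytic gradient:", analytic)
print("numeric gradient: ", numeric)
```

The first gradient component is negative because that weight sits below its old value, yet the penalty stays non-negative. A training loop that adds this gradient directly to each parameter's gradient, rather than adding the penalty to the loss and backpropagating, would therefore only ever see the unsquared difference.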