imbue-ai / slot_attention

Apache License 2.0

Is it a bug that `slot_mu` and `slot_log_sigma` are not updated in training? #8

Closed — Wuziyi616 closed this issue 3 years ago

Wuziyi616 commented 3 years ago

Hi. Thank you for open-sourcing this wonderful implementation! I have a small question about the code and think it might be a bug.

In these lines, you define slot_mu and slot_log_sigma using register_buffer. If I understand correctly, tensors created via register_buffer won't be updated during training (see here for reference). I also checked my trained checkpoints, and these two values are indeed identical throughout the training process.
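To illustrate the distinction, here is a minimal sketch (not code from this repo; the names just mirror the ones in question) showing that a buffer is invisible to the optimizer while a parameter is not:

```python
import torch
import torch.nn as nn

class Demo(nn.Module):
    def __init__(self, slot_dim=64):
        super().__init__()
        # A buffer is saved in the state_dict and moved with .to(device),
        # but it is NOT returned by parameters(), so the optimizer never updates it.
        self.register_buffer("slot_mu", torch.randn(1, 1, slot_dim))
        # A Parameter IS returned by parameters() and receives gradient updates.
        self.slot_log_sigma = nn.Parameter(torch.zeros(1, 1, slot_dim))

m = Demo()
print([name for name, _ in m.named_parameters()])  # ['slot_log_sigma'] only
print(list(dict(m.named_buffers()).keys()))        # ['slot_mu']
```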

Also, other slot-attention implementations define them as trainable parameters (see the PyTorch one and the official one). So I was wondering: is this a bug or intentional behavior?
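For comparison, the trainable variant in those implementations looks roughly like this (a sketch under assumed shapes and initialization, not copied from either codebase):

```python
import torch
import torch.nn as nn

class SlotInit(nn.Module):
    """Learnable Gaussian from which initial slots are sampled."""

    def __init__(self, slot_dim):
        super().__init__()
        # Registered as Parameters, so both are updated by the optimizer.
        self.slot_mu = nn.Parameter(torch.randn(1, 1, slot_dim))
        self.slot_log_sigma = nn.Parameter(torch.zeros(1, 1, slot_dim))

    def forward(self, batch_size, num_slots):
        # slots ~ N(mu, sigma^2), broadcast over batch and slot dimensions.
        noise = torch.randn(
            batch_size, num_slots, self.slot_mu.shape[-1],
            device=self.slot_mu.device,
        )
        return self.slot_mu + self.slot_log_sigma.exp() * noise
```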

Update: I didn't observe much of a performance difference between trainable and fixed mu+sigma. That's very interesting.

joshalbrecht commented 3 years ago

I think you are correct that this was a bug, and not an intentional change. Thanks for flagging!

Wuziyi616 commented 3 years ago

Indeed you're right. I actually ran experiments after fixing it, and the performance difference is very small (<5%). So I think the learned slot initialization distribution is not very important.