Cadene / murel.bootstrap.pytorch

MUREL (CVPR 2019), a multimodal relational reasoning module for VQA
https://arxiv.org/abs/1902.09487
BSD 3-Clause "New" or "Revised" License

Why is it important to detach the tensor and stop gradient propagation in pairwise.py #7

Closed by yuweihao 5 years ago

yuweihao commented 5 years ago

Thanks for sharing the nice code!

While reading the code, I was confused by this line: https://github.com/Cadene/murel.bootstrap.pytorch/blob/0e6cfc2415416c420bf0e6fe1614eec5692c26c1/murel/models/networks/pairwise.py#L68

My question: in the MuRel cell, why is it important to detach the tensor and stop gradient propagation here?

Cadene commented 5 years ago

@yuweihao

We developed the code on PyTorch 0.3. We tried to port it to PyTorch 0.4/1.1, but it was 3 times slower because of an issue with the indexing. We didn't have much time, so we deactivated the gradients (detach)... Unfortunately, it was a really bad idea. We just fixed it: https://github.com/Cadene/murel.bootstrap.pytorch/commit/7c9eaebfa6b0fe2565d97dac01001ea9e6ddae7b

See https://github.com/Cadene/murel.bootstrap.pytorch/issues/15 for more info
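
For readers unfamiliar with what detaching does to training, here is a minimal sketch of the general effect. It is not the code from pairwise.py; the tensor `x` and the toy loss are made up for illustration. It only shows that a detached tensor is treated as a constant, so any gradient that would have flowed through it is dropped.

```python
import torch

# Hypothetical toy input, not the region features used in pairwise.py.
x = torch.tensor([1.0, 2.0, 3.0], requires_grad=True)

# Case 1: gradients flow through both factors of x * x.
loss_full = (x * x).sum()
loss_full.backward()
grad_full = x.grad.clone()  # d/dx sum(x^2) = 2x

# Reset the accumulated gradient before the second backward pass.
x.grad = None

# Case 2: one factor is detached, so autograd treats it as a constant.
loss_detached = (x * x.detach()).sum()
loss_detached.backward()
grad_detached = x.grad.clone()  # only the non-detached factor contributes: x

print(grad_full)
print(grad_detached)
```

In the detached case the gradient is half of what it should be, which is why a stray `detach()` can quietly change what a model learns even though the forward pass is unchanged.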

yuweihao commented 5 years ago

Hi @Cadene

Thank you very much for your reply.