QUVA-Lab / e2cnn

E(2)-Equivariant CNNs Library for Pytorch
https://quva-lab.github.io/e2cnn/

Larger memory consumption and slower training speed #38

Closed · EBGU closed this issue 3 years ago

EBGU commented 3 years ago

Hi!

First, I really appreciate your project; it has been very helpful in my own work. However, I was a bit confused when I tried to compare a vanilla resnet18 with e2resnet18: e2resnet18 consumed about 4 times more GPU memory and took about 10 times longer to train. I wonder whether I did something wrong, or whether it is simply designed that way. Thank you very much!
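For reference, my comparison was along the lines of the sketch below (the layer widths, batch shape, and group size here are placeholders, not my exact setup):

```python
import time
import torch
from e2cnn import gspaces, nn as enn

device = 'cuda'  # requires a CUDA device

# conventional conv layer: 32 -> 256 channels
conv = torch.nn.Conv2d(32, 256, kernel_size=3, padding=1).to(device)

# comparable equivariant layer: 32 trivial fields -> 32 regular fields of C8
gspace = gspaces.Rot2dOnR2(N=8)
in_type = enn.FieldType(gspace, 32 * [gspace.trivial_repr])
out_type = enn.FieldType(gspace, 32 * [gspace.regular_repr])  # 32 * 8 = 256 channels
e2conv = enn.R2Conv(in_type, out_type, kernel_size=3, padding=1).to(device)

x = torch.randn(16, 32, 64, 64, device=device)

for name, layer, inp in [('conv2d', conv, x),
                         ('r2conv', e2conv, enn.GeometricTensor(x, in_type))]:
    torch.cuda.reset_peak_memory_stats()
    torch.cuda.synchronize()
    t0 = time.time()
    layer(inp)
    torch.cuda.synchronize()
    print(name, f'{time.time() - t0:.4f}s',
          f'{torch.cuda.max_memory_allocated() / 2**20:.1f} MiB')
```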

Best, Harold

Gabri95 commented 3 years ago

Hi @EBGU

Are you referring to this model? e2wrn.py

This is the case if fixparams=True. Because equivariance induces stronger weight sharing, an equivariant model usually has fewer parameters than an equivalent conventional model. For this reason, it is common to compare against a scaled-up version of the equivariant model which has the same number of parameters as the conventional model. This results in a wider model (roughly by a factor of sqrt(N), where N is the group size).
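As a rough back-of-the-envelope check (the factor-of-N parameter saving is only approximate, since it ignores biases and non-regular fields):

```python
import math

N = 8   # group size, e.g. the C8 rotation group
w = 64  # width (number of channels) of a conventional layer

# Weight sharing over the group cuts the parameter count of a w-channel
# equivariant layer by roughly a factor of N: ~w^2/N instead of ~w^2.
# Widening by sqrt(N) restores the conventional parameter count:
w_scaled = round(w * math.sqrt(N))  # ~181 channels instead of 64
print(math.isclose((w * math.sqrt(N)) ** 2 / N, w ** 2))  # True: counts match
```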

You can build a non-scaled-up model by setting fixparams=False. This results in a model of roughly the same size as the conventional architecture.
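For example (a sketch only; the argument names below are assumptions, so check e2wrn.py for the exact constructor name and signature):

```python
# hypothetical usage of the Wide ResNet defined in e2wrn.py
from e2wrn import Wide_ResNet

# fixparams=False: roughly the same parameter count as the conventional net
model = Wide_ResNet(depth=28, widen_factor=10, dropout_rate=0.3,
                    num_classes=10, N=8, fixparams=False)

n_params = sum(p.numel() for p in model.parameters())
print(f'{n_params / 1e6:.1f}M parameters')
```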

Note that in our paper we compare the conventional model with both equivariant models above (fixparams=False and fixparams=True). The README reports a summary of the results here.

I hope this answers your question

Best, Gabriele Cesa