Open durach opened 11 months ago
Custom ResNets in easyfsl come from this implementation, which is by now a dated fork of PyTorch's ResNet, so it is quite possible that its memory usage is suboptimal.
The quick response to this problem would be to make this clear in our custom ResNet's docstring.
A better response would be to study in depth the differences between this implementation and PyTorch's, and find out how to improve our memory usage.
The best response would probably be to reimplement our custom ResNet to extend PyTorch's and reduce the differences between the two to a minimum.
Note that the last two options could cause unexpected shifts between the results obtained with easyfsl and other works that use FiveAI's implementation, but it's probably worth it: easyfsl is meant to improve best practices in the field.
Thank you. I'll look into both architectures and come up with improvements if I find any. For now, I've switched to the native PyTorch implementation; it uses less memory and is much faster, at least in my case.
Should I close the issue?
No, it's definitely an active issue and deserves to be addressed.
Description
The issue could be related to #116. I am adapting Prototypical Networks to my use case. I noticed that you use an adjusted version of ResNet in your examples. Based on my experiments, this version is not a drop-in replacement for the standard torch ResNet implementation: at the very least, it consumes more GPU memory under the same conditions.
How To Reproduce
Output:
The same code with SIZE=5, or with the "native" ResNet module, doesn't cause the issue:
Additional context
The code above was executed on
I tried Google Colab, and with a T4 (16 GB) I reached a batch size of 64 with the easyfsl version of ResNet18 and 512 with the torch version.
I am a complete beginner in ML and may be misusing the framework.