Closed by bokveizen 3 years ago
It was very much not optimised for memory. You may want to play around with the tensors and in-place operations. Note that flattened_orthogonal performs quite a few operations that create copies (flatten sometimes makes copies) on a large tensor, even if it's just once per forward pass.
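To illustrate the point about flatten, here is a minimal sketch (not geotorch code) of when `torch.flatten` returns a view and when it has to copy. The helper `shares_storage` is a hypothetical name introduced only for this example; it compares the address of the first element of two tensors.

```python
import torch


def shares_storage(a: torch.Tensor, b: torch.Tensor) -> bool:
    # Hypothetical helper: two tensors that start at the same memory
    # address are backed by the same data, so no copy was made.
    return a.data_ptr() == b.data_ptr()


x = torch.arange(12.0).reshape(3, 4)

# Contiguous input: flatten can return a view of the same memory.
flat_view = x.flatten()

# Transposed input is non-contiguous, so flatten must allocate a copy.
flat_copy = x.t().flatten()

print(shares_storage(x, flat_view))
print(shares_storage(x, flat_copy))
```

On a large weight tensor the second case allocates a second full-size buffer, which is the kind of hidden cost to look for.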
I am afraid that I cannot help you any more on this one. You'll have to profile the memory usage of the operations and try to avoid copies.
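As a starting point for that profiling, something along these lines may help. It is only a sketch using PyTorch's CUDA memory statistics; `peak_cuda_bytes` is a hypothetical helper name, not part of geotorch, and it returns `None` when no GPU is available.

```python
import torch


def peak_cuda_bytes(fn, *args, **kwargs):
    """Run fn and return the extra peak CUDA memory (in bytes) it needed.

    Hypothetical helper for illustration. Returns None without CUDA.
    """
    if not torch.cuda.is_available():
        fn(*args, **kwargs)
        return None
    torch.cuda.synchronize()
    torch.cuda.reset_peak_memory_stats()
    # Memory already held by tensors before the call.
    baseline = torch.cuda.memory_allocated()
    fn(*args, **kwargs)
    torch.cuda.synchronize()
    # Peak during the call, minus what was allocated beforehand.
    return torch.cuda.max_memory_allocated() - baseline


def toy_step():
    # Stand-in for one forward pass; replace with your own model call.
    device = "cuda" if torch.cuda.is_available() else "cpu"
    w = torch.randn(256, 256, device=device)
    return (w.t().flatten() ** 2).sum()


print(peak_cuda_bytes(toy_step))
```

Calling it once with the plain model and once with the parametrised one should show roughly how much of the extra memory comes from the parametrisation.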
In general, I have not profiled or optimised geotorch for a small memory footprint. You may want to play around with the implementation a bit if you encounter these problems in the future. If you manage to reduce the footprint of a particular parametrisation, feel free to submit a PR!
Thanks so much!
I just tried this on WideResNet 28-10, which originally needs ~4.5 GB of GPU memory, but after using flattened_orthogonal,