Open rytisss opened 3 years ago
Yes! This is a known issue, at least for me. For example, for a feature map of size 512x512, a kernel size of 3, and 128 features, you have to maintain 512*512*128*3*3*2 = 603979776 indices and collect the convolution inputs. I couldn't find an obvious way to reduce the memory footprint.
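To make the arithmetic above concrete, here is a minimal sketch that recomputes the index count for a single layer and a single batch element. The function name `deformable_index_count` is hypothetical, and the 4 bytes per index (int32 or float32) is an assumption; the actual dtype and any extra intermediate buffers in the implementation may differ.

```python
def deformable_index_count(height, width, channels, kernel_size):
    """Count sampling indices: one (y, x) pair per kernel tap, per channel, per pixel."""
    taps = kernel_size * kernel_size
    return height * width * channels * taps * 2  # factor 2 = y and x coordinate


count = deformable_index_count(512, 512, 128, 3)
bytes_needed = count * 4  # assumption: 4 bytes per index

print(count)                 # 603979776
print(bytes_needed / 2**30)  # 2.25 (GiB, for one layer and batch size 1)
```

Under these assumptions, the indices alone take roughly 2.25 GiB per layer, before the gathered convolution inputs are counted, so several such layers would exhaust an 8 GB GPU quickly.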
Setting num_deformable_group to a small value (1, 2, 4, ...) should reduce memory consumption. It is also uncommon to replace every Conv2D layer in an architecture with DeformableConv2D. In the code accompanying the papers, the authors typically add or replace only one or two layers with DeformableConv2D.
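As a rough illustration of why a smaller num_deformable_group should help, the sketch below sweeps a few group counts. It assumes that the number of sampling indices scales linearly with the number of deformable groups, so the factor of 128 in the estimate corresponds to one offset set per feature; this is an assumption about the implementation, not something confirmed in this thread. The helper `index_count` and the 4 bytes per index are likewise hypothetical.

```python
def index_count(height, width, groups, kernel_size):
    """Indices for one layer and one batch element, assuming one offset set per group."""
    return height * width * groups * kernel_size * kernel_size * 2


for groups in (1, 2, 4, 128):
    n = index_count(512, 512, groups, 3)
    mib = n * 4 / 2**20  # assumption: 4 bytes per index
    print(f"groups={groups:>3}: {n:>11,d} indices, about {mib:.1f} MiB")
```

If the linear-scaling assumption holds, going from 128 groups to 1 group cuts the index memory by a factor of 128 for the same feature map and kernel size.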
The implementation goes back to https://github.com/DHZS/tf-deformable-conv-layer and is not a native CUDA kernel (https://github.com/tensorflow/addons/issues/179); a native kernel may be more memory efficient.
Thank you for the quick response! I found a few research papers (https://arxiv.org/abs/1811.01206, https://arxiv.org/pdf/2007.01001.pdf) whose authors state that Conv2D was replaced by DeformableConv2D, which probably misled me :) (also considering my hardware). I will try reducing num_deformable_group. Thank you again for your help!
Hi, thank you for your implementation. I tried replacing 8 Conv2D layers with DeformableConv2D, and during memory allocation (model build) the GPU quickly runs out of memory (8 GB). With a small number of filters, though, everything seems fine. Is the DeformableConv2D operation not very memory efficient?