Open IwakuraRein opened 3 years ago
I wrapped the structures behind the featureEncoder in a with tf.control_dependencies([feature]): block, and now the timeline result seems fine. It's nearly the same as sbnet_module.cuda_timer's result.
However, the time cost of the featureEncoder increases heavily. My input is (720, 1280, 7). The original network takes roughly 38ms, of which the featureEncoder takes about 10ms. I want to reduce the inference time to less than 33ms. After wrapping the featureEncoder with SparseScatter and SparseGather, the network's inference time rises to 44ms with all '1' in the mask.
When I feed a mask that contains zeros, something strange happens. When the sparsity is nearly 0.1, the time rises to 150ms. Convolutions under the featureEncoder become discrete pieces in the timeline chart. The time is 90ms when the sparsity is 0.5 and 64ms at 0.8.
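To put the numbers above side by side, here is a small sketch that compares each reported time with the dense baseline and the 33ms target. It uses only the figures quoted in this comment. Treating the all-'1' mask as sparsity 0.0 and "nearly 0.1" as exactly 0.1 are simplifications, and the names are made up for illustration.

```python
# Timings reported above, in milliseconds.
DENSE_MS = 38.0    # original network without the sparse ops
TARGET_MS = 33.0   # desired inference time

# Sparsity (fraction of zeros in the mask) -> measured time with the sparse ops.
# 0.0 stands for the all-'1' mask; 0.1 stands for "nearly 0.1".
sparse_ms = {0.0: 44.0, 0.1: 150.0, 0.5: 90.0, 0.8: 64.0}


def slowdown(ms, baseline=DENSE_MS):
    """Return how many times slower `ms` is than the dense baseline."""
    return ms / baseline


for sparsity, ms in sorted(sparse_ms.items()):
    print(f"sparsity {sparsity:.1f}: {ms:5.1f} ms, "
          f"{slowdown(ms):.2f}x dense, {ms - TARGET_MS:+.1f} ms vs target")
```

Every sparse configuration measured so far is slower than the dense network, so the gap to the target grows rather than shrinks.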
I checked the issue. I've tried many block sizes and sparsity levels but still see no improvement. At first I guessed that the sparse convolution reduces the GPU memory usage and leaves the GPU underutilized. But since your experiments used a GTX 1080 Ti, I think the method should work well on powerful GPUs.
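One thing that may be worth checking when trying block sizes is how many blocks stay active for a given mask, since block-sparse convolution can only skip blocks that are entirely zero. The sketch below is plain Python and is not SBNet's own mask reduction: the helper names are hypothetical, blocks are assumed to be non-overlapping 16x16 tiles, and whether the masks used here resemble the ones in this issue is unknown.

```python
import random


def active_block_fraction(mask, block_h, block_w):
    """Fraction of non-overlapping blocks containing at least one nonzero."""
    height, width = len(mask), len(mask[0])
    active = total = 0
    for top in range(0, height, block_h):
        rows = mask[top:top + block_h]
        for left in range(0, width, block_w):
            total += 1
            if any(any(row[left:left + block_w]) for row in rows):
                active += 1
    return active / total


def random_pixel_mask(height, width, sparsity, seed=0):
    """Each pixel is zero with probability `sparsity`, independently."""
    rng = random.Random(seed)
    return [[0 if rng.random() < sparsity else 1 for _ in range(width)]
            for _ in range(height)]


def left_region_mask(height, width, sparsity):
    """Keep a contiguous left-hand strip and zero out the rest."""
    keep = int(round(width * (1 - sparsity)))
    return [[1 if x < keep else 0 for x in range(width)]
            for _ in range(height)]


H, W, SPARSITY = 720, 1280, 0.8
scattered = active_block_fraction(random_pixel_mask(H, W, SPARSITY), 16, 16)
contiguous = active_block_fraction(left_region_mask(H, W, SPARSITY), 16, 16)
print(f"scattered zeros:  {scattered:.2f} of blocks active")
print(f"contiguous zeros: {contiguous:.2f} of blocks active")
```

With zeros scattered per pixel, a 16x16 block is empty only if all 256 of its pixels are zero, so pixel-level sparsity and block-level sparsity can differ a lot.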
I must have misunderstood something or made a serious mistake. I hope to get some answers. Thanks.
Hi. Thanks for the code and the detailed instructions.
I implemented sparse convolution in my encoder:
self.training is set to False when training and True when testing. The variable mask is generated outside the network and fed in via tf.placeholder. So is self.lastFeature.
I tried to measure the inference time with timeline:
However, I can't find time records for the layers under 'featureEncoder'. There are also two bars captioned unknown, the second of which is strangely long. The times of some Pooling and LeakyRelu ops are also strange, costing nearly 2ms.
I wonder how I can get the proper time measurement. Thanks.
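One way to cross-check the timeline numbers is a plain wall-clock measurement with a few warm-up runs. This is a minimal sketch, not the repository's timing method: `benchmark` and `run_once` are hypothetical names, and the workload below is a stand-in so the snippet runs on its own. In the real script `run_once` would be something like `lambda: sess.run(output, feed_dict=...)`, which returns only after the fetched values are available.

```python
import time


def benchmark(run_once, warmup=5, iters=20):
    """Return (mean_ms, min_ms) over `iters` timed calls of `run_once`."""
    # Warm-up calls are excluded so one-time setup cost does not skew the result.
    for _ in range(warmup):
        run_once()
    samples = []
    for _ in range(iters):
        start = time.perf_counter()
        run_once()
        samples.append((time.perf_counter() - start) * 1000.0)
    return sum(samples) / len(samples), min(samples)


# Stand-in workload (an assumption for illustration only).
mean_ms, min_ms = benchmark(lambda: sum(range(10000)))
print(f"mean {mean_ms:.3f} ms, min {min_ms:.3f} ms")
```

This gives only an end-to-end time per run, so it complements the per-op timeline rather than replacing it.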
My Environment
TensorFlow Version: 1.15.0
Operating System: Ubuntu 16.04
Python Version: 3.6.13
CUDA Version: 10.0
CUDNN Version: 7.6.4
GPU Type: RTX 2080ti
Nvidia Driver Version: 460.67