PingoLH / FCHarDNet

Fully Convolutional HarDNet for Segmentation in Pytorch
MIT License

What is the difference between HarDBlock and HarDBlock_v2? #33

Open DonghweeYoon opened 4 years ago

DonghweeYoon commented 4 years ago

In the source code, only `validate.py` uses HarDBlock_v2, and the model with HarDBlock_v2 is faster than the one with HarDBlock. I want to know the difference between v1 and v2.

PingoLH commented 4 years ago

Hi, thank you for the feedback. They are mathematically equivalent, but HarDBlock_v2 reduces the use of concat, so it is a little faster. It first decomposes convolutions from "many to one" into "one by one", then merges the convolutions that share the same input tensor. For example:

```
X = Conv( Concat([A, B]) )
Y = Conv( Concat([A, C]) )
```
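To see why the "many to one" → "one by one" decomposition is exact: convolution is linear in its input channels, so convolving a concatenation equals summing convolutions of each part, with the weight split along the input-channel axis. A minimal pure-Python sketch for a 1×1 convolution (illustration only, not the repository code):

```python
def conv1x1(w, x):
    """1x1 convolution: w is [out_ch][in_ch] weights, x is [in_ch][pixels]."""
    return [[sum(w[o][i] * x[i][p] for i in range(len(x)))
             for p in range(len(x[0]))]
            for o in range(len(w))]

A = [[1.0, 2.0], [3.0, 4.0]]      # 2 channels, 2 pixels
B = [[5.0, 6.0]]                  # 1 channel, 2 pixels

W = [[0.1, 0.2, 0.3],             # 2 output channels over
     [0.4, 0.5, 0.6]]             # 3 input channels (A's 2 + B's 1)

# Original form: convolve the concatenation
full = conv1x1(W, A + B)          # Conv( Concat([A, B]) )

# Decomposed form: split W along the input-channel axis and sum the results
W_A = [row[:2] for row in W]
W_B = [row[2:] for row in W]
partA = conv1x1(W_A, A)
partB = conv1x1(W_B, B)
decomposed = [[a + b for a, b in zip(ra, rb)] for ra, rb in zip(partA, partB)]

# The two forms agree elementwise (up to float rounding)
assert all(abs(f - d) < 1e-9
           for rf, rd in zip(full, decomposed) for f, d in zip(rf, rd))
```

The decomposition alone saves nothing; the gain comes from the next step, where the per-input convolutions reading the same tensor are merged.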

In HarDBlock_v2, this becomes:

```
Z = Conv( A )   // the shape of Z is concat([X, Y])
X = Conv( B )
Y = Conv( C )
X += Z[ 0 : X.shape(1) ]
Y += Z[ X.shape(1) : ]
```
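The rewrite above can be checked numerically. A self-contained pure-Python sketch for 1×1 convolutions (assumed toy shapes and weights, not the repository code): the two convolutions that both read `A` are merged into a single conv `Z` whose output channels line up with `concat([X, Y])`, then `Z` is sliced and added to the per-input convs on `B` and `C`:

```python
def conv1x1(w, x):
    """1x1 convolution: w is [out_ch][in_ch] weights, x is [in_ch][pixels]."""
    return [[sum(w[o][i] * x[i][p] for i in range(len(x)))
             for p in range(len(x[0]))]
            for o in range(len(w))]

def add(u, v):
    """Elementwise sum of two [ch][pixels] feature maps."""
    return [[a + b for a, b in zip(ru, rv)] for ru, rv in zip(u, v)]

A = [[1.0, 2.0], [3.0, 4.0]]          # 2 channels, 2 pixels
B = [[5.0, 6.0]]                      # 1 channel
C = [[7.0, 8.0]]                      # 1 channel

W_X = [[0.1, 0.2, 0.3],               # X has 2 output channels and
       [0.4, 0.5, 0.6]]               # reads Concat([A, B]) (3 in-channels)
W_Y = [[0.7, 0.8, 0.9]]               # Y has 1 output channel, reads Concat([A, C])

# v1 form: convolve the concatenations
X = conv1x1(W_X, A + B)               # X = Conv( Concat([A, B]) )
Y = conv1x1(W_Y, A + C)               # Y = Conv( Concat([A, C]) )

# v2 form: one conv on A with the A-weights of X and Y stacked,
# then slice Z and add it to the per-input convs on B and C
W_Z = [row[:2] for row in W_X] + [row[:2] for row in W_Y]
Z = conv1x1(W_Z, A)                   # Z's channels line up with concat([X, Y])
out_x = len(W_X)
X2 = add(conv1x1([row[2:] for row in W_X], B), Z[:out_x])   # X += Z[0:X.shape(1)]
Y2 = add(conv1x1([row[2:] for row in W_Y], C), Z[out_x:])   # Y += Z[X.shape(1):]

# Both formulations produce the same outputs (up to float rounding)
assert all(abs(a - b) < 1e-9 for ra, rb in zip(X, X2) for a, b in zip(ra, rb))
assert all(abs(a - b) < 1e-9 for ra, rb in zip(Y, Y2) for a, b in zip(ra, rb))
```

Note that the input-side `Concat` is gone: `A`, `B`, and `C` are convolved separately, at the cost of the extra slice-and-add on `Z`.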

The concatenation inside the block can be removed entirely, although the block-level output concat is still required. It can be faster than the original because concatenation involves a memory copy, which is time-consuming. However, this transformation is not free: the "+=" part also requires additional memory accesses, just fewer than the concatenation does. Also, training with HarDBlock_v2 is slower than with the original. So we still hope that PyTorch and TensorRT will support convolutions on "discontinuous tensors", so that the concat can become a pointer-wise operation without any memory copy.