PingoLH / FCHarDNet

Fully Convolutional HarDNet for Segmentation in Pytorch
MIT License

What is the difference between HarDBlock and HarDBlock_v2? #33

Open DonghweeYoon opened 4 years ago

DonghweeYoon commented 4 years ago

In the source code, only `validate.py` uses HarDBlock_v2, and the model with HarDBlock_v2 is faster than the one with HarDBlock. I want to know the difference between v1 and v2.

PingoLH commented 4 years ago

Hi, thank you for the feedback. They are mathematically equivalent, but HarDBlock_v2 reduces the use of concat, so it is a little faster. It first decomposes convolutions from "many to one" into "one by one", then merges the convolutions that share the same input tensor. For example:

```
X = Conv( Concat([A, B]) )
Y = Conv( Concat([A, C]) )
```
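To see why the "many to one" → "one by one" decomposition is exact: convolution is linear in its input channels, so convolving a concatenation equals summing convolutions of each part, with the weight split along the input-channel axis. A minimal pure-Python sketch for a 1×1 convolution (illustration only, not the repository code):

```python
def conv1x1(w, x):
    """1x1 convolution: w is [out_ch][in_ch] weights, x is [in_ch][pixels]."""
    return [[sum(w[o][i] * x[i][p] for i in range(len(x)))
             for p in range(len(x[0]))]
            for o in range(len(w))]

A = [[1.0, 2.0], [3.0, 4.0]]      # 2 channels, 2 pixels
B = [[5.0, 6.0]]                  # 1 channel, 2 pixels

W = [[0.1, 0.2, 0.3],             # 2 output channels over
     [0.4, 0.5, 0.6]]             # 3 input channels (A's 2 + B's 1)

# Original form: convolve the concatenation
full = conv1x1(W, A + B)          # Conv( Concat([A, B]) )

# Decomposed form: split W along the input-channel axis and sum the results
W_A = [row[:2] for row in W]
W_B = [row[2:] for row in W]
partA = conv1x1(W_A, A)
partB = conv1x1(W_B, B)
decomposed = [[a + b for a, b in zip(ra, rb)] for ra, rb in zip(partA, partB)]

# The two forms agree elementwise (up to float rounding)
assert all(abs(f - d) < 1e-9
           for rf, rd in zip(full, decomposed) for f, d in zip(rf, rd))
```

The decomposition alone saves nothing; the gain comes from the next step, where the per-input convolutions reading the same tensor are merged.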

In HarDBlock_v2, this becomes:

```
Z = Conv( A )   // the shape of Z is concat([X, Y])
X = Conv( B )
Y = Conv( C )
X += Z[ 0 : X.shape(1) ]
Y += Z[ X.shape(1) : ]
```
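The rewrite above can be checked numerically. A self-contained pure-Python sketch for 1×1 convolutions (assumed toy shapes and weights, not the repository code): the two convolutions that both read `A` are merged into a single conv `Z` whose output channels line up with `concat([X, Y])`, then `Z` is sliced and added to the per-input convs on `B` and `C`:

```python
def conv1x1(w, x):
    """1x1 convolution: w is [out_ch][in_ch] weights, x is [in_ch][pixels]."""
    return [[sum(w[o][i] * x[i][p] for i in range(len(x)))
             for p in range(len(x[0]))]
            for o in range(len(w))]

def add(u, v):
    """Elementwise sum of two [ch][pixels] feature maps."""
    return [[a + b for a, b in zip(ru, rv)] for ru, rv in zip(u, v)]

A = [[1.0, 2.0], [3.0, 4.0]]          # 2 channels, 2 pixels
B = [[5.0, 6.0]]                      # 1 channel
C = [[7.0, 8.0]]                      # 1 channel

W_X = [[0.1, 0.2, 0.3],               # X has 2 output channels and
       [0.4, 0.5, 0.6]]               # reads Concat([A, B]) (3 in-channels)
W_Y = [[0.7, 0.8, 0.9]]               # Y has 1 output channel, reads Concat([A, C])

# v1 form: convolve the concatenations
X = conv1x1(W_X, A + B)               # X = Conv( Concat([A, B]) )
Y = conv1x1(W_Y, A + C)               # Y = Conv( Concat([A, C]) )

# v2 form: one conv on A with the A-weights of X and Y stacked,
# then slice Z and add it to the per-input convs on B and C
W_Z = [row[:2] for row in W_X] + [row[:2] for row in W_Y]
Z = conv1x1(W_Z, A)                   # Z's channels line up with concat([X, Y])
out_x = len(W_X)
X2 = add(conv1x1([row[2:] for row in W_X], B), Z[:out_x])   # X += Z[0:X.shape(1)]
Y2 = add(conv1x1([row[2:] for row in W_Y], C), Z[out_x:])   # Y += Z[X.shape(1):]

# Both formulations produce the same outputs (up to float rounding)
assert all(abs(a - b) < 1e-9 for ra, rb in zip(X, X2) for a, b in zip(ra, rb))
assert all(abs(a - b) < 1e-9 for ra, rb in zip(Y, Y2) for a, b in zip(ra, rb))
```

Note that the input-side `Concat` is gone: `A`, `B`, and `C` are convolved separately, at the cost of the extra slice-and-add on `Z`.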

The concatenation inside the block can be removed entirely, although the block-level output concat is still required. It can be faster than the original because concatenation involves a memory copy, which is time-consuming. However, this transformation is not free: the "+=" part also requires additional memory accesses, just fewer than the concatenation does. Also, training with HarDBlock_v2 is slower than with the original. So we still hope that PyTorch and TensorRT will support convolutions on "discontinuous tensors", so that the concat can become a pointer-wise operation without any memory copy.