VXallset / deep-high-resolution-net.TensorFlow

A TensorFlow implementation of HRNet
MIT License

Performance issues in the definition of conv_2d, src/util.py #25

Open JamesCao2048 opened 3 years ago

JamesCao2048 commented 3 years ago

Hello, I found that in the definition of conv_2d in src/util.py, a kernel_initializer is created whenever no initializer is given explicitly. Moreover, conv_2d is called repeatedly inside the loop in down_sampling. To avoid re-creating redundant nodes in the TF computation graph, I think the kernel_initializer should be created once before the loop in down_sampling and passed to conv_2d.

Looking forward to your reply. By the way, I would be glad to open a PR to fix this if you are too busy.
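
For concreteness, here is a minimal TF1-style sketch of the change I mean; the conv_2d signature, the truncated-normal initializer, and the loop shape are hypothetical stand-ins rather than the actual code in src/util.py:

```python
import tensorflow as tf

def conv_2d(inputs, filters, kernel_size, strides,
            kernel_initializer=None, name=None):
    # Simplified stand-in for conv_2d in src/util.py: when no initializer
    # is passed in, a new one is created on every call.
    if kernel_initializer is None:
        kernel_initializer = tf.initializers.truncated_normal(stddev=0.01)
    return tf.layers.conv2d(inputs, filters, kernel_size, strides=strides,
                            padding='same',
                            kernel_initializer=kernel_initializer, name=name)

def down_sampling(inputs, filters, times):
    # Proposed change: build the initializer once, before the loop,
    # and reuse it for every conv_2d call inside the loop.
    kernel_initializer = tf.initializers.truncated_normal(stddev=0.01)
    x = inputs
    for i in range(times):
        x = conv_2d(x, filters, kernel_size=3, strides=2,
                    kernel_initializer=kernel_initializer,
                    name='down_sample_{}'.format(i))
    return x
```

This way the initializer object is constructed once per down_sampling call instead of once per conv layer.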

VXallset commented 3 years ago


Hi @JamesCao2048,

Sorry for the late reply; I was busy recently and didn't notice this issue.

Regarding the problem you mentioned, I think the loop in down_sampling is necessary. In HRNet there are x2, x4, and x8 down-sampling branches, and you would need to write them out separately if you didn't use the while-loop. What's more, the kernel_initializer in each conv_2d layer should be different, which is why I didn't create one beforehand and pass it to conv_2d.
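
As a rough sketch of what I mean (the function body and names here are hypothetical, not the actual src/util.py code), the single while-loop covers all the down-sampling factors, and each conv layer builds its own initializer:

```python
import tensorflow as tf

def down_sampling(inputs, filters, times):
    # times = 1, 2, 3 gives the x2, x4, x8 down-sampling branches of
    # HRNet with a single loop; each iteration halves the resolution.
    x = inputs
    for i in range(times):
        # No initializer is passed in, so each conv layer gets a
        # freshly created one of its own.
        x = tf.layers.conv2d(
            x, filters, kernel_size=3, strides=2, padding='same',
            kernel_initializer=tf.initializers.truncated_normal(stddev=0.01),
            name='down_sample_x{}'.format(2 ** (i + 1)))
    return x
```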

That's just my personal opinion; I'm not sure whether I am correct.

Thanks again.