Can anybody explain this code to me??

tsinghua-rll / VoxelNet-tensorflow

A 3D object detection system for autonomous driving.

MIT License

453 stars, 123 forks

Can anybody explain this code to me?? #52

Open sainisanjay opened 5 years ago

sainisanjay commented 5 years ago


with tf.variable_scope('MiddleAndRPN_' + name):
    # convolutional middle layers
    temp_conv = ConvMD(3, 64, 64, 3, (2, 1, 1), (1, 1, 1), temp_conv, name='conv3')
    temp_conv = tf.transpose(temp_conv, perm=[0, 2, 3, 4, 1])
    temp_conv = tf.reshape(temp_conv, [-1, cfg.INPUT_HEIGHT, cfg.INPUT_WIDTH, 128])

1) After the VFE layer we get a 4D feature map. Then how are we reshaping it to 3D? @lengly @ring00 @abhigoku10 @jeasinema

gdicker1 commented 4 years ago

A 2-d convolution operates on a 3-d tensor per sample (e.g. a color image with a given width and height has a depth of 3 for the 3 color channels). Likewise, a 3-d convolution expects a 4-d tensor per sample and would fail on anything smaller, the same way a 2-d convolution would fail if it only got a 2-d tensor. The batch dimension adds one more axis in both cases.
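To make the transpose and reshape from the question concrete, here is a minimal NumPy sketch of the same two steps. It assumes the tensor coming out of the last 3-d convolution is laid out as [batch, depth, height, width, channels] with depth 2 and 64 channels; that layout and the small height and width are assumptions for illustration, not values read from the repo's config. NumPy stands in for TensorFlow only so the snippet runs without a graph; `np.transpose` and `reshape` reorder and merge axes the same way as `tf.transpose` and `tf.reshape`.

```python
import numpy as np

# Hypothetical sizes, chosen small for illustration (not the repo's cfg values).
batch, depth, height, width, channels = 2, 2, 4, 3, 64

# Stand-in for the 3-d convolution output: [batch, depth, height, width, channels].
temp_conv = np.arange(
    batch * depth * height * width * channels, dtype=np.float32
).reshape(batch, depth, height, width, channels)

# Same permutation as in the question: move the depth axis to the end,
# giving [batch, height, width, channels, depth].
moved = np.transpose(temp_conv, (0, 2, 3, 4, 1))

# Merge the last two axes (channels, depth) into one channel axis of size
# channels * depth = 128, giving [batch, height, width, 128].
bev = moved.reshape(-1, height, width, channels * depth)

print(moved.shape)
print(bev.shape)

# Each merged channel index is c * depth + d, so no values are lost or mixed.
n, d, h, w, c = 1, 1, 2, 1, 5
same_value = bev[n, h, w, c * depth + d] == temp_conv[n, d, h, w, c]
print(same_value)
```

Under these assumptions, nothing is dropped: the depth slices are stacked into the channel axis, so each height/width position ends up with 128 features and the result can go into ordinary 2-d convolutions.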