charlesq34 / pointnet

PointNet: Deep Learning on Point Sets for 3D Classification and Segmentation

[Part segmentation] wrong concatenation with out5 #274

Open hyunjinku opened 3 years ago

hyunjinku commented 3 years ago

Dear authors and all,

I would like to discuss the feature-concatenation code for part segmentation, specifically where local features and the global feature are concatenated in pointnet/part_seg/pointnet_part_seg.py.

According to the detailed network architecture described in the supplementary material of the paper, the concatenated feature dimension should be 3024, i.e. the sum of the following feature sizes: [64, 128, 128, 128, 512, 2048, 16]. However, in lines #86-122 you concatenate out5 (2048 channels) instead of net_transformed (128 channels), which yields a dimension of 4944: [64, 128, 128, 512, 2048, 2048, 16].

In short, to match the paper, the concatenation [expand, out1, out2, out3, out4, out5] should in my opinion be changed to [expand, out1, out2, out3, net_transformed, out4]. Could you tell us which version was used in your experimental setting?
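For reference, here is the channel arithmetic behind the two versions as a plain-Python sanity check; the widths are taken from the layer definitions quoted below:

```python
# The pooled global feature (2048) is concatenated with the 16-d one-hot
# category label before tiling, so the tiled "expand" tensor has 2064 channels.
expand = 2048 + 16

# Supplementary-material version: uses net_transformed (128) -> 3024 channels
paper_dims = [expand, 64, 128, 128, 128, 512]
# Released code: uses out5 (2048) instead -> 4944 channels
code_dims = [expand, 64, 128, 128, 512, 2048]

print(sum(paper_dims), sum(code_dims))  # 3024 4944
```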

out1 = tf_util.conv2d(input_image, 64, [1,K], padding='VALID', stride=[1,1],
                      bn=True, is_training=is_training, scope='conv1', bn_decay=bn_decay)  # 64 channels
out2 = tf_util.conv2d(out1, 128, [1,1], padding='VALID', stride=[1,1],
                      bn=True, is_training=is_training, scope='conv2', bn_decay=bn_decay)  # 128 channels
out3 = tf_util.conv2d(out2, 128, [1,1], padding='VALID', stride=[1,1],
                      bn=True, is_training=is_training, scope='conv3', bn_decay=bn_decay)  # 128 channels

with tf.variable_scope('transform_net2') as sc:
    K = 128
    transform = get_transform_K(out3, is_training, bn_decay, K)

end_points['transform'] = transform

# apply the predicted 128x128 feature transform; net_transformed keeps 128 channels
squeezed_out3 = tf.reshape(out3, [batch_size, num_point, 128])
net_transformed = tf.matmul(squeezed_out3, transform)
net_transformed = tf.expand_dims(net_transformed, [2])

out4 = tf_util.conv2d(net_transformed, 512, [1,1], padding='VALID', stride=[1,1],
                      bn=True, is_training=is_training, scope='conv4', bn_decay=bn_decay)  # 512 channels
out5 = tf_util.conv2d(out4, 2048, [1,1], padding='VALID', stride=[1,1],
                      bn=True, is_training=is_training, scope='conv5', bn_decay=bn_decay)  # 2048 channels
out_max = tf_util.max_pool2d(out5, [num_point,1], padding='VALID', scope='maxpool')  # global feature: (B, 1, 1, 2048)
...
# segmentation network
one_hot_label_expand = tf.reshape(input_label, [batch_size, 1, 1, cat_num])  # 16-d one-hot category label
out_max = tf.concat(axis=3, values=[out_max, one_hot_label_expand])  # (B, 1, 1, 2064)

expand = tf.tile(out_max, [1, num_point, 1, 1])  # tiled per point: (B, N, 1, 2064)
concat = tf.concat(axis=3, values=[expand, out1, out2, out3, out4, out5])  # 2064+64+128+128+512+2048 = 4944 channels
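To make the shape difference concrete, here is a minimal NumPy sketch of the two concatenations. The batch size and point count are hypothetical stand-ins, and the "proposed" line is my suggested fix, not the authors' code:

```python
import numpy as np

B, N = 2, 1024  # hypothetical batch size and point count

def feat(c):
    # stand-in for a (B, N, 1, C) per-point activation tensor
    return np.zeros((B, N, 1, c), dtype=np.float32)

expand = feat(2048 + 16)                      # tiled global feature + one-hot label
out1, out2, out3 = feat(64), feat(128), feat(128)
net_transformed, out4, out5 = feat(128), feat(512), feat(2048)

# As released: includes out5, giving 4944 channels
released = np.concatenate([expand, out1, out2, out3, out4, out5], axis=3)
# As in the supplementary material: includes net_transformed, giving 3024 channels
proposed = np.concatenate([expand, out1, out2, out3, net_transformed, out4], axis=3)

print(released.shape[3], proposed.shape[3])  # 4944 3024
```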