yangyanli / PointCNN

PointCNN: Convolution On X-Transformed Points (NeurIPS 2018)
https://arxiv.org/abs/1801.07791

Model zoo hyperparameters ? #53

Closed andvidal closed 6 years ago

andvidal commented 6 years ago

Hi,

Thanks for sharing this great work!

I just need a small clarification on your code, specifically when you set the hyperparameters for your model.

Classification Problems

Looking at the Model Zoo for ModelNet40 (Figure 8-a in the paper), I see:

Input
X-Conv( N = 1024,  C = 32,   K = 8,   D = 1 )
X-Conv( N = 384,   C = 64,   K = 12,  D = 2 )
X-Conv( N = 128,   C = 128,  K = 16,  D = 2 )
X-Conv( N = 128,   C = 256,  K = 16,  D = 3 )
FC(256)
FC(128)
FC(40)

Code (settings for ModelNet40, from modelnet_x3_l4.py):

x=3
xconv_param_name = ('K', 'D', 'P', 'C', 'links')
xconv_params = [dict(zip(xconv_param_name, xconv_param)) for xconv_param in
                [(8, 1, -1, 16 * x, []),
                 (12, 2, 384, 32 * x, []),
                 (16, 2, 128, 64 * x, []),
                 (16, 3, 128, 128 * x, [])]]

fc_param_name = ('C', 'dropout_rate')
fc_params = [dict(zip(fc_param_name, fc_param)) for fc_param in
             [(128 * x, 0.0),
              (64 * x, 0.8)]]
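To make the comparison with the paper easier, here is the comprehension above expanded into the concrete per-layer settings it produces: a small sketch that just re-evaluates the repository's own snippet with x = 3.

```python
# Expand the zip/dict comprehension from modelnet_x3_l4.py to see the
# concrete per-layer settings that x = 3 yields.
x = 3
xconv_param_name = ('K', 'D', 'P', 'C', 'links')
xconv_params = [dict(zip(xconv_param_name, xconv_param)) for xconv_param in
                [(8, 1, -1, 16 * x, []),
                 (12, 2, 384, 32 * x, []),
                 (16, 2, 128, 64 * x, []),
                 (16, 3, 128, 128 * x, [])]]

# Actual channel widths after the x multiplier, and the point counts P
# (-1 means "use all input points"):
print([p['C'] for p in xconv_params])   # [48, 96, 192, 384]
print([p['P'] for p in xconv_params])   # [-1, 384, 128, 128]
```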

Looking at these two dictionaries, almost all the values match the paper, except that I don't understand the following parts:

  1. From xconv_params, there's a multiplication by 3 (x = 3) for C. Why? In other words: what is the meaning of this variable?

  2. Related to this x: where do 16, 32, 64, 128 (in this order) for C come from? Each layer seems to use half of what the paper describes, but then multiplied by 3, so I'm a bit puzzled.

  3. From fc_params, the final FC layers have 128 and 64 units respectively (I know they are then followed by an FC layer with 40 units; it's just not part of the settings explicitly). Again, this seems to be half of what's shown in the paper.

Segmentation Problems

Looking at the Model Zoo for ShapeNet Parts (Figure 8-e in the paper), I see:

Input
#Downsampling
X-Conv( N = 2048,  C = 256,   K = 8,   D = 1 )  #Layer 0 
X-Conv( N = 765,   C = 256,   K = 12,  D = 2 )  #Layer 1
X-Conv( N = 384,   C = 512,   K = 16,  D = 2 )  #Layer 2

# connection
X-Conv( N = 128,   C = 1024,  K = 16,  D = 6 )  #Layer 3

#Upsampling
X-Conv( N = 384,   C = 512,   K = 16,  D = 6 )  #Connected to #2
X-Conv( N = 765,   C = 256,   K = 12,  D = 6 )  #Connected to #1
X-Conv( N = 2048,  C = 256,   K = 8,   D = 6 )  #Connected to #0
X-Conv( N = 2048,  C = 256,   K = 8,   D = 4 )  #Connected to #0

FC(256)
FC(256)
FC(#classes)

Code (from scannet_x8_2048_fps.py):

x = 8

xconv_param_name = ('K', 'D', 'P', 'C', 'links')
xconv_params = [dict(zip(xconv_param_name, xconv_param)) for xconv_param in
                [(8, 1, -1, 32 * x, []),      #0
                 (12, 2, 768, 32 * x, []),    #1
                 (16, 2, 384, 64 * x, []),    #2
                 (16, 6, 128, 128 * x, [])]]  #3

xdconv_param_name = ('K', 'D', 'pts_layer_idx', 'qrs_layer_idx')
xdconv_params = [dict(zip(xdconv_param_name, xdconv_param)) for xdconv_param in
                 [(16, 6, 3, 3),
                  (16, 6, 3, 2),
                  (12, 6, 2, 1),
                  (8, 6, 1, 0),
                  (8, 4, 0, 0)]]

fc_param_name = ('C', 'dropout_rate')
fc_params = [dict(zip(fc_param_name, fc_param)) for fc_param in
             [(32 * x, 0.0),
              (32 * x, 0.5)]]

Questions:

  1. Why is x set to 8 for segmentation problems? Unlike ModelNet40, multiplying x * [32, 32, 64, 128] yields the same values as shown in the paper, which makes perfect sense to me, so is it maybe a coincidence? The same holds for fc_params.

  2. xconv_params makes sense to me; however, xdconv_params seems to add an additional layer, resulting in 9 X-Conv layers (4 + 5) instead of the 8 shown in the paper. Is this related to 'pts_layer_idx' and 'qrs_layer_idx'?
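To illustrate what I mean, here is a hedged sketch of how I currently read the (pts_layer_idx, qrs_layer_idx) pairs: each XdConv entry seems to take features from one encoder layer and upsample them to the point count of another. This is only my interpretation from the ReadMe, not the actual model code (which also concatenates skip features).

```python
# Hedged sketch: wiring implied by the (pts_layer_idx, qrs_layer_idx) pairs
# in scannet_x8_2048_fps.py, assuming pts_layer_idx names the layer whose
# features feed the XdConv and qrs_layer_idx names the layer whose point
# set it upsamples to.
xconv_P = [-1, 768, 384, 128]   # point counts per encoder layer (-1 = all input points)
xdconv = [(16, 6, 3, 3), (16, 6, 3, 2), (12, 6, 2, 1), (8, 6, 1, 0), (8, 4, 0, 0)]

for K, D, pts_idx, qrs_idx in xdconv:
    print(f"XdConv(K={K}, D={D}): features of layer {pts_idx} "
          f"-> point set of layer {qrs_idx} (P={xconv_P[qrs_idx]})")
# Five decoder entries plus four encoder layers = nine X-Conv layers in total.
```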

Sorry for the long post, but I hope this will be beneficial to others as well :)

Many thanks in advance, and thanks once again for the great work!

yangyanli commented 6 years ago

Hi @andvidal ,

Many thanks for the post!

  1. The actual channel number of each layer is C * x, where C differs per layer. The reason we make it C * x rather than a plain C is that this way we can change just x to tune different settings, instead of changing the Cs in all the layers.
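A minimal sketch of that design choice (the function name and the base widths here are illustrative, taken from the classification snippet in the question; only x is meant to change between experiments):

```python
# One knob, x, scales every layer's channel width together.
def channel_widths(x, base=(16, 32, 64, 128)):
    return [c * x for c in base]

print(channel_widths(3))  # [48, 96, 192, 384]  -> the modelnet_x3_l4.py widths
print(channel_widths(4))  # a wider model from the same base, no per-layer edits
```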

  2. The parameters may differ a bit from the paper, as the code has been evolving frequently. However, the results of the different parameter settings only differ slightly.

  3. Understanding the channel (parameter) numbers of the segmentation network requires understanding the 'pts_layer_idx' and 'qrs_layer_idx' settings, whose meanings are documented in the ReadMe on the homepage.

Hope this answers your questions ;-) Please let us know if there are any further questions.

andvidal commented 6 years ago

Hi @yangyanli

Thank you so much for your quick reply! All clear now!