loicland / superpoint_graph

Large-scale Point Cloud Semantic Segmentation with Superpoint Graphs
MIT License

Some Questions about superpoint_graph #168

Closed XuHan-CN closed 3 years ago

XuHan-CN commented 4 years ago

The following is the GraphConvModule

(ecc): GraphNetwork(
  (0): RNNGraphConvModule(
    (_cell): GRUCellEx(
      32, 32
      (ini): InstanceNorm1d(1, eps=1e-05, momentum=0.1, affine=False, track_running_stats=False)
      (inh): InstanceNorm1d(1, eps=1e-05, momentum=0.1, affine=False, track_running_stats=False)
      (ig): Linear(in_features=32, out_features=32, bias=True)
    )(ingate layernorm)
    (_fnet): Sequential(
      (0): Linear(in_features=13, out_features=32, bias=True)
      (1): ReLU(inplace)
      (2): Linear(in_features=32, out_features=128, bias=True)
      (3): ReLU(inplace)
      (4): Linear(in_features=128, out_features=64, bias=True)
      (5): BatchNorm1d(64, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
      (6): ReLU(inplace)
      (7): Linear(in_features=64, out_features=1024, bias=False)
    )
  )
  (1): Linear(in_features=352, out_features=13, bias=True)
)

In the GRU cell, the output dimension of the Linear layer is 32, but in the Sequential, the input dimension of the first Linear layer is 13. If these two subnetworks are connected, why does the output dimension of the former not match the input dimension of the latter?

loicland commented 4 years ago

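The two subnetworks are not chained one after the other. The MLP 13->32->128->64->1024 is the superedge filter-generating network: its input is the 13 handcrafted superedge features, not the output of the GRU. Its 1024-dimensional output is reshaped into a 32x32 matrix F_ij, which encapsulates the nature of the adjacency between two neighboring superpoints i and j.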
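To make the shapes concrete, here is a minimal pure-Python sketch of the reshape and the filter-times-state product. It only does shape bookkeeping: random numbers stand in for the learned MLP output, and the row-major layout used by the hypothetical helper `reshape_filter` is an assumption, not necessarily how the repository lays out the matrix.

```python
import random

HIDDEN = 32  # size of a superpoint hidden state h_i


def reshape_filter(flat, n=HIDDEN):
    """Turn a flat vector of n*n values into an n x n matrix (list of rows).

    Row-major layout is an assumption made for this sketch.
    """
    assert len(flat) == n * n
    return [flat[r * n:(r + 1) * n] for r in range(n)]


def matvec(matrix, vec):
    """Plain matrix-vector product."""
    return [sum(w * x for w, x in zip(row, vec)) for row in matrix]


random.seed(0)
# Stand-in for the 1024-d output of the filter-generating MLP for one superedge.
fnet_output = [random.gauss(0, 1) for _ in range(HIDDEN * HIDDEN)]
F_ij = reshape_filter(fnet_output)

# Stand-in for the hidden state of superpoint i.
h_i = [random.gauss(0, 1) for _ in range(HIDDEN)]
message = matvec(F_ij, h_i)

print(len(F_ij), len(F_ij[0]), len(message))  # 32 32 32
```

The point is only that 1024 = 32 x 32, so the filter maps a 32-dimensional state to a 32-dimensional message, whatever the number of edge features is.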

In ECC, the message sent by superpoint i to superpoint j is the hidden state h_i^t (size 32) multiplied by the corresponding superedge filter: F_ij h_i^t (size 32).

Once all messages are sent, the incoming messages are averaged, and this value (size 32) serves as input to the GRU to update the hidden state: h_i^{t+1} = GRU(h_i^t, mean_{j in N_i} F_ji h_j^t)
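The aggregation step can be sketched on a toy graph as follows. This is an illustration under stated assumptions, not the repository's code: the hidden size is 2 instead of 32 for readability, the filters are hand-picked instead of generated by the MLP, the helper `mean_incoming` is hypothetical, and the GRU update itself is left out (the averaged vector is what would be passed to it).

```python
def matvec(matrix, vec):
    """Plain matrix-vector product."""
    return [sum(w * x for w, x in zip(row, vec)) for row in matrix]


def mean_incoming(h, filters, in_neighbors, i):
    """Average the messages F_ji h_j arriving at superpoint i."""
    msgs = [matvec(filters[(j, i)], h[j]) for j in in_neighbors[i]]
    dim = len(h[i])
    return [sum(m[k] for m in msgs) / len(msgs) for k in range(dim)]


# Toy graph: 3 superpoints, hidden size 2 (32 in the real network).
h = {0: [1.0, 0.0], 1: [0.0, 2.0], 2: [4.0, 0.0]}
in_neighbors = {0: [1, 2], 1: [0], 2: [0]}

# Hand-picked filters keyed by (sender, receiver); in the real network
# each one would come from the filter-generating MLP.
identity = [[1.0, 0.0], [0.0, 1.0]]
swap = [[0.0, 1.0], [1.0, 0.0]]
filters = {(1, 0): identity, (2, 0): swap, (0, 1): identity, (0, 2): swap}

# Messages to superpoint 0: identity @ h[1] = [0, 2], swap @ h[2] = [0, 4].
agg_0 = mean_incoming(h, filters, in_neighbors, 0)
print(agg_0)  # [0.0, 3.0]
```

The aggregated vector has the same size as the hidden state, which is why the GRU cell is printed as `32, 32`.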

Finally, the hidden states are concatenated over the 10 message-passing iterations (plus the initial state) into a 352 = 11×32 vector, which the last Linear layer maps to class logits for the superpoint (13 classes here). Does this clear things up?
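The arithmetic behind the final `Linear(in_features=352, out_features=13)` can be checked with a short sketch. The hidden states and the classifier weights below are placeholders (constants and zeros), so only the sizes are meaningful.

```python
HIDDEN = 32        # hidden state size
N_ITERATIONS = 10  # message-passing iterations
N_CLASSES = 13     # number of classes in this configuration

# Placeholder hidden states h^0 .. h^10 for one superpoint.
states = [[float(t)] * HIDDEN for t in range(N_ITERATIONS + 1)]

# Concatenate the initial state and the 10 updated states.
concatenated = [x for state in states for x in state]

# Placeholder linear classifier 352 -> 13 (zero weights, zero bias).
weights = [[0.0] * len(concatenated) for _ in range(N_CLASSES)]
bias = [0.0] * N_CLASSES
logits = [
    sum(w * x for w, x in zip(row, concatenated)) + b
    for row, b in zip(weights, bias)
]

print(len(concatenated), len(logits))  # 352 13
```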