itailang / SampleNet

Differentiable Point Cloud Sampling (CVPR 2020 Oral)
https://arxiv.org/abs/1912.03663

Train two SampleNet simultaneously #7

Closed: NCTU-VRDL closed this 4 years ago

NCTU-VRDL commented 4 years ago

Hi! Thanks for sharing the great work. I am wondering whether two SampleNets can be trained on the same point cloud simultaneously. Say I have one object point cloud containing two shapes with very different features; the first SampleNet should sample points only from shape1, and the second SampleNet should sample points only from shape2. The task could be trained with some contrastive loss. Does that make sense? I have tried a toy example, but both SampleNets just sample the same points. Any comments are very welcome!

import torch.nn as nn

from samplenet import SampleNet  # adjust the import to wherever SampleNet lives in your setup


class FCN_sampler(nn.Module):
    def __init__(self, shape1_num_out_points=512, shape2_num_out_points=512):
        super(FCN_sampler, self).__init__()
        # One SampleNet per shape.
        self.sampler1 = SampleNet(
            num_out_points=shape1_num_out_points,
            bottleneck_size=128,
            group_size=8,
            initial_temperature=1.0,
            input_shape="bnc",
            output_shape="bnc")

        self.sampler2 = SampleNet(
            num_out_points=shape2_num_out_points,
            bottleneck_size=128,
            group_size=8,
            initial_temperature=1.0,
            input_shape="bnc",
            output_shape="bnc")

    def forward(self, x, shape1=True):
        if shape1:
            simp_pc, proj_pc = self.sampler1(x)
        else:
            simp_pc, proj_pc = self.sampler2(x)
        return simp_pc, proj_pc

# Sample points
sampler = FCN_sampler()
simp_pc1, coord1 = sampler(coord)
simp_pc2, coord2 = sampler(coord, shape1=False)

# Compute losses
simplification_loss = sampler.sampler1.get_simplification_loss(
    coord, simp_pc1, 512
)
projection_loss = sampler.sampler1.get_projection_loss()
loss1 = 0.01 * simplification_loss + 0.01 * projection_loss

simplification_loss = sampler.sampler2.get_simplification_loss(
    coord, simp_pc2, 512
)
projection_loss = sampler.sampler2.get_projection_loss()
loss2 = 0.01 * simplification_loss + 0.01 * projection_loss

samplenet_loss = loss1 + loss2
itailang commented 4 years ago

Hello NCTU-VRDL,

Thank you for your interest in our work!

Your scenario is very interesting. I have a suggestion: remove the third term of the simplification loss (see Equation 5 in the paper). This term spreads the sample over the entire point cloud, which you do not want in your case. If the task needs points from either shape1 or shape2, the relevant SampleNet might pick up on that and sample accordingly. Another thing I would try is to lower the weight of the projection loss, to give the network some more time to explore the input point cloud before converging.
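
For illustration, a two-term version of the simplification loss could look roughly like this (a standalone sketch using torch.cdist, not the repository's get_simplification_loss):

import torch

def two_term_simplification_loss(ref_pc, samp_pc):
    # ref_pc: (B, N, 3) input cloud, samp_pc: (B, M, 3) simplified points.
    dists = torch.cdist(samp_pc, ref_pc) ** 2   # (B, M, N) squared pairwise distances
    nn_dist = dists.min(dim=2)[0]               # (B, M) distance to the nearest input point
    # Average + maximal nearest-neighbor distance; the coverage term that
    # spreads the sample over the whole input cloud is omitted.
    return nn_dist.mean() + nn_dist.max(dim=1)[0].mean()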

I am closing the issue for now. You are welcome to share your findings or further questions if any come up. Good luck!

NCTU-VRDL commented 4 years ago

Hi @itailang! I am very grateful for your useful suggestion. Actually, I am trying to use SampleNet as a segmentation model, which should segment a particular shape based on the features that distinguish it from the other shapes. I noticed there is a complete_fps argument for inference. I set it to False to let my sampler output an arbitrary number of points, since the number of points per shape is not the same in each sample. Does that make sense?

Also, the points sampled by the two samplers should be totally different (mutually exclusive). However, the existing losses do not enforce this condition, so I have added a "same point" loss to maximize the distance between the points sampled by the two samplers. Do you have any suggestions or comments? Thanks so much for your help! :)

simp_pc, shape1_coord = sampler(coord)
simp_pc, shape2_coord = sampler(coord, shape1=False)
same_point_loss1, same_point_loss2 = ChamferDistance()(shape1_coord, shape2_coord)
itailang commented 4 years ago

It makes perfect sense, as you do not need a specific sample size in your case. Note that your task model needs to distinguish the desired shape's features from those of the other shapes, in order to hint SampleNet about it, just as a classifier model knows how to classify and "teaches" SampleNet to sample for classification.

You are correct, an additional loss is needed. I suggest using a repulsion loss to separate the points from each sampler. Something like:

repulsion_loss1 = max(th1 - same_point_loss1, 0)
repulsion_loss2 = max(th2 - same_point_loss2, 0)

where th1 and th2 are thresholds that influence the amount of separation between points.
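
In PyTorch this could be something like the following (just a sketch; I am assuming same_point_loss1/2 are the scalar Chamfer terms from your snippet and loss1/loss2 are your per-sampler losses):

import torch

# th1 / th2: separation thresholds you choose (floats).
repulsion_loss1 = torch.clamp(th1 - same_point_loss1, min=0.0)
repulsion_loss2 = torch.clamp(th2 - same_point_loss2, min=0.0)
samplenet_loss = loss1 + loss2 + repulsion_loss1 + repulsion_loss2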

NCTU-VRDL commented 4 years ago

Hi @itailang! Thanks so much for your suggestion. My model now trains successfully and gets good results. However, a small issue is that my sampled points are too sparse for segmentation. I have lowered the group_size to 3, but the results are still sparse. There are 1024 points in one sample, so I set num_out_points to 1024. With complete_fps=False, I only get 163 points for the particular shape (the ground truth has 761 points). I know there should be some post-processing, such as label propagation, but are there any other suggestions for hyper-parameters in SampleNet?

Another issue is that my simplified points are very different from the original points at inference time. The simplification_loss is about 2 after 500 epochs of training. I have checked the simp_pc during training and it looks fine. I am not sure if I did something wrong at inference. Really appreciate your help :)

itailang commented 4 years ago

Glad I can help!

How about adding a repulsion loss between the sampled points themselves? You can use the knn mechanism from the projection module to find neighbors in the sampled point cloud. Then, apply the loss to the average neighbor distance, or to the distance to each neighbor (similar to the repulsion loss between points sampled from different objects).
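
A rough standalone sketch of this idea (using torch.cdist instead of the grouping utilities; k and the threshold are just placeholders):

import torch

def intra_sample_repulsion_loss(sampled_pc, k=4, threshold=0.05):
    # sampled_pc: (B, M, 3) points produced by one sampler.
    dists = torch.cdist(sampled_pc, sampled_pc) ** 2      # (B, M, M) squared distances
    # The k + 1 smallest distances include the point itself (distance 0); drop it.
    knn_dists, _ = dists.topk(k + 1, dim=2, largest=False)
    avg_nn_dist = knn_dists[:, :, 1:].mean(dim=2)         # (B, M) average neighbor distance
    # Penalize points that sit closer to their neighbors than the threshold.
    return torch.clamp(threshold - avg_nn_dist, min=0.0).mean()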

To verify your inference procedure, run it on an example from the training set and see if you get a result you are satisfied with, like the one you got during training.

NCTU-VRDL commented 4 years ago

Hi! Thanks again for your informative reply. I am not quite sure whether I implemented the repulsion loss correctly. I would be grateful if you could check it. Thanks for being so helpful with my questions.

    def repulsion_loss(self, point_cloud, query_cloud, threshold=0.5):
        grouped_points, grouped_features = self._group_points(
            point_cloud, query_cloud, None
        )
        dist = self._get_distances(grouped_points, query_cloud)
        # Hinge: penalize neighbor distances that fall below the threshold.
        repulsion_loss = torch.mean(torch.max(threshold - dist, torch.tensor(0.0).cuda()))

        return repulsion_loss
itailang commented 4 years ago

My pleasure :)

You should make a small modification to the code. Since in your case point_cloud and query_cloud are the same (they are both shape1_coord, for example), you should omit the first distance, which is 0 (the point's distance to itself). In addition, do not divide the distances by sigma. So, modify _get_distances as follows:

def _get_distances(self, grouped_points, query_cloud, use_sigma=False, dist_start_idx=1):
    deltas = grouped_points - query_cloud.unsqueeze(-1).expand_as(grouped_points)
    if use_sigma:
        # Original soft-projection behavior: temperature-scaled distances.
        dist = torch.sum(deltas ** 2, dim=_axis_to_dim(3), keepdim=True) / self.sigma()
    else:
        dist = torch.sum(deltas ** 2, dim=_axis_to_dim(3), keepdim=True)
    # Skip the first neighbor, which is the query point itself (distance 0).
    dist = dist[:, :, :, dist_start_idx:]
    return dist

Note that your implementation applies the loss per neighbor distance. A softer version is to apply it to the average neighbor distance. In that case:

repulsion_loss = torch.mean(torch.max(threshold - torch.mean(dist, dim=3), torch.tensor(0.0).cuda()))

Try both versions and see what works for you.

In addition, a threshold of 0.5 seems very high to me. I would check the average distance between points for the shape you want to segment, and set the threshold accordingly.
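
For example, you could estimate that statistic roughly like this (assuming shape_pc holds the ground-truth points of the target shape, with shape (B, N, 3)):

import torch

def average_nn_distance(shape_pc):
    dists = torch.cdist(shape_pc, shape_pc) ** 2        # (B, N, N) squared distances
    dists.diagonal(dim1=1, dim2=2).fill_(float("inf"))  # ignore each point's distance to itself
    return dists.min(dim=2)[0].mean()                    # average (squared) nearest-neighbor distance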

Good luck!

NCTU-VRDL commented 4 years ago

Hi @itailang!

Thanks for your kind reminder. I want to make sure which points I should pass to the repulsion loss we discussed above. Should I pass the points before the soft projection (the output of PointNet), or the points after the soft projection? I am using the latter right now.

simp_pc1, shape1_points = sampler1(points)
simp_pc2, shape2_points = sampler2(points)  # these points have already passed the soft projection
sampler1.get_repulsion_loss(shape1_points, shape1_points)  # we want the shape1 points to repel each other

Another question: I find that my simplified points do not form any shape and look more like noise (it should be a chair) after 500 epochs of training. Does that mean the sampler is not learning well? (Screenshots attached.)

itailang commented 4 years ago

The points after the soft projection, as you are doing.

The simplified points should resemble the shape rather than look like noise. I suggest doing an experiment where you turn off the soft projection (use the simplified points as if they were the softly projected points). Calibrate your system to get satisfying simplified points, and then turn the soft projection back on.
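
Concretely, while calibrating you could do something like this (a sketch following your FCN_sampler usage above):

simp_pc1, coord1 = sampler(coord)
points_for_task = simp_pc1  # bypass the soft projection while calibrating; switch back to coord1 afterwards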

NCTU-VRDL commented 4 years ago

Hi @itailang! Hope you are doing well! I finally got promising results by increasing the ALPHA for the simplification loss. I notice that the value of ALPHA is quite different between the classification and registration tasks. Is there any tip for tuning this hyperparameter? I really appreciate all your suggestions in this thread :)

itailang commented 4 years ago

Great! Glad I could help!

The value is different, since the tasks are different in nature. The first is a decision task and the second is a regression task.

My tip is what I wrote in my previous comment: first calibrate the hyper-parameters of the simplification loss (without the projection loss) to achieve satisfying task performance, then add the projection loss.
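
For instance (ALPHA and LMBDA are placeholders, not recommended values):

# Stage 1: tune with the simplification loss alone.
loss = ALPHA * simplification_loss
# Stage 2: once the simplified points look good, add the projection loss back.
loss = ALPHA * simplification_loss + LMBDA * projection_loss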

Good luck with your research!