HuguesTHOMAS / KPConv

Kernel Point Convolutions

Preparing own data for regression with multiple outputs #97

Open nikogamulin opened 4 years ago

nikogamulin commented 4 years ago

Thank you for the code, @HuguesTHOMAS. In the scope of superquadrics research, in which the goal is to determine shape parameters, I tried to use the code you provided, slightly modified the architecture, and prepared a custom dataset.

A superquadric is represented as a point cloud, and the goal is to predict its parameters (dimensions along the axes: 3 values, offset: 3 values, and shape parameters: 2 values). To do that, I adapted the network to output an 8-dimensional vector and used MSE as the loss function. Initially, I wanted to predict parameters for shapes whose axis dimensions are in the range [25, 75], offsets in the range [25, 230], and shape parameters in [0.1, 1]. Thus, the maximum possible distance from the origin of the coordinate system is 305 (230 + 75). I chose these values to have a dataset comparable with the dataset from another paper. Anyway, when I tried to train the network on point clouds parametrized by these ranges, there was an out-of-memory exception (for training, I use an NVIDIA RTX 2080 Ti with 12 GiB).
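For clarity, this is roughly the setup I mean. It is a minimal sketch, not the actual KPConv code: the module name, feature dimension, and tensors are placeholders standing in for the pooled backbone features and the 8 target parameters.

```python
import torch
import torch.nn as nn

# Minimal sketch (not the actual KPConv code): a regression head mapping pooled
# backbone features to the 8 superquadric parameters, trained with an MSE loss.
class SuperquadricHead(nn.Module):          # hypothetical module name
    def __init__(self, feature_dim=1024, n_params=8):
        super().__init__()
        self.mlp = nn.Sequential(
            nn.Linear(feature_dim, 256),
            nn.ReLU(),
            nn.Linear(256, n_params),       # 3 dimensions + 3 offsets + 2 shape parameters
        )

    def forward(self, global_features):
        return self.mlp(global_features)

head = SuperquadricHead()
criterion = nn.MSELoss()
features = torch.randn(4, 1024)             # placeholder pooled features (batch of 4)
targets = torch.rand(4, 8)                  # placeholder ground-truth parameters
loss = criterion(head(features), targets)
print(loss.item())
```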

Then I tried to normalize the points, but the errors were huge. [Attached figure: KPConv-losses, training and validation loss curves.]

If possible, I would appreciate any comments or suggestions on directions that might improve the results. First, I guess the network is suitable for such a task, but I haven't yet reached the point of getting meaningful results that I could then improve by modifying the network further. Second, I would be thankful if you could share any experience about how memory consumption depends on body size, so that I can adapt the data accordingly.

This is an example superquadric point cloud: [attached figure: point_cloud_sample]

HuguesTHOMAS commented 4 years ago

Hi @nikogamulin,

Thank you for your interest in my work. This problem of yours seems very interesting and hard to tackle.

First, I think that the success or failure of your training depends on the way you implemented your loss. The backbone architecture is the same for all tasks, but you need a good loss function to be able to learn anything. From what I see here, the loss values are huge, so it seems that there is a problem in its implementation. Are you sure the MSE function is doing what you want? I generally compare this kind of premade loss function to a custom implementation using torch.sum and torch.mean, just to be sure that the behavior is the one I want, along the lines of the sketch below.
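For example, a quick sanity check of this kind could look like the following; the tensor shapes are placeholders for your batch of predictions and targets.

```python
import torch
import torch.nn as nn

# Sanity check: compare the premade MSE loss with hand-written reductions built
# from torch.mean / torch.sum, to verify the reduction is the intended one.
pred = torch.randn(16, 8)                   # placeholder predictions (batch of 16, 8 parameters)
target = torch.randn(16, 8)                 # placeholder ground truth

premade = nn.MSELoss()(pred, target)                  # default: mean over all elements
manual = torch.mean((pred - target) ** 2)             # explicit equivalent of the default
per_sample = torch.sum((pred - target) ** 2, dim=1)   # sum over the 8 parameters instead
print(premade.item(), manual.item(), per_sample.mean().item())
```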

By the way, I assume your graphs are swapped, and the training ones are on the left (otherwise this would not make sense).

As for the memory consumption, I invite you to read some older issues that I already answered: 1, 2. In your case, first_subsampling_dl and batch_num are the two parameters that control the number of points (and therefore the memory) that your GPU consumes. As you have your own data, you also have to make sure that the point clouds you feed to the network are subsampled with the right grid (with cell size equal to first_subsampling_dl), as I do in the function load_subsampled_clouds in ModelNet40.py (see the sketch below).
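A rough sketch of that subsampling step, assuming the grid_subsampling wrapper exposed by the dataset utilities in this repo; the exact import path and signature may differ in your version of the code.

```python
import numpy as np
# Assumed import; in this repo the grid subsampling wrapper lives in the dataset utilities.
from datasets.common import grid_subsampling

first_subsampling_dl = 0.02                              # must match config.first_subsampling_dl
points = np.random.rand(100000, 3).astype(np.float32)   # placeholder raw point cloud

# Subsample the input cloud with the same grid cell size as the first network layer,
# in the spirit of load_subsampled_clouds in ModelNet40.py.
sub_points = grid_subsampling(points, sampleDl=first_subsampling_dl)
print(sub_points.shape)
```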

I hope that helped. Good luck with your problem.

Best, Hugues