loicland / superpoint_graph

Large-scale Point Cloud Semantic Segmentation with Superpoint Graphs
MIT License

Problem applying SPG to another dataset #48

Closed Jaiy closed 5 years ago

Jaiy commented 5 years ago

Hi, thanks a lot for your great work. I can successfully run the code on the Semantic3D dataset, and now I want to apply SPG to my own dataset. I have converted it to the Semantic3D format; my dataset has 5 labels. I can successfully run partition.py and obtain the .h5 files, but when I run main.py for training it fails (screenshot: error). How can I handle this problem? Could you please give me some suggestions? Thanks a lot for your kind help! The data and label files are shown below (screenshots: data, labels).
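
In text form, the converted files follow the Semantic3D layout (made-up sample values and a hypothetical file name; each .txt line is x y z intensity r g b, and the .labels file holds one integer per point, 0 meaning unlabeled):

```
# my_scan.txt  (x y z intensity r g b)
1.234 5.678 0.910 120 87 90 92
1.240 5.671 0.905 118 85 88 91

# my_scan.labels  (one label per point; 0 = unlabeled, 1..5 = my classes)
4
0
```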

loicland commented 5 years ago

Hi,

Are you trying to apply the network trained on sema3d (8 classes) directly to your data set (5 classes)? Or are you training from scratch?

Are your data annotated?

Jaiy commented 5 years ago

@loicland Hi, I'm not trying to apply the model trained on sema3d directly to my dataset; I'm training from scratch. But I'm not sure whether my approach is right: I simply replaced the original Semantic3D dataset with my own. I'm not using the custom_dataset route, because I made my dataset's format identical to sema3d's, and I only modified the label count and categories in the scripts. I used a command like:

    CUDA_VISIBLE_DEVICES=0 python learning/main.py --dataset sema3d --SEMA3D_PATH ~/jwu/pytorch/semantic_3D \
      --db_test_name testred --db_train_name trainval --epochs 500 --lr_steps '[350, 400, 450]' \
      --test_nth_epoch 100 --model_config 'gru_10,f_5' --ptn_nfeat_stn 8 --nworkers 0 \
      --odir results/sema_3d/trainval_best

Is that correct? I am quite confused. Or how should I modify the scripts or the data preparation process? Thanks for your kind help!

"Are your data annotated?" -- Yes, my data are annotated. I have 5 categories: 1 'pedestrian', 2 'cyclist', 3 'tricycle', 4 'vehicle', 5 'other', and 0 represents 'unlabeled'.

loicland commented 5 years ago

I see. You have to change the dbinfo at line 65 of sema3d_dataset.py to adjust the number of classes.
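
For your 5 classes the edit looks roughly like this: a minimal sketch, assuming the dbinfo keys match the released sema3d_dataset.py (verify against your copy):

```python
# Sketch of the dbinfo adjustment around line 65 of sema3d_dataset.py.
# Key names are assumed from the released file and may differ in your version.
dbinfo = {
    'classes': 5,              # was 8 for Semantic3D
    'inv_class_map': {         # entry i names label i+1; label 0 stays 'unlabeled'
        0: 'pedestrian',
        1: 'cyclist',
        2: 'tricycle',
        3: 'vehicle',
        4: 'other',
    },
}
```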

Jaiy commented 5 years ago

@loicland When I modified the label count and categories in the scripts I also modified line 65 of sema3d_dataset.py, but unfortunately it didn't work.

loicland commented 5 years ago

Hi,

Did you change line 44 of /partition/partition.py as well?

Can you check whether the labels field in the features folder has 5+1 or 8+1 columns? If it still has 8+1, the problem happens during the partition. Check the size of the labels variable at line 144 of /partition/partition.py, for example.
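
For example, something along these lines (adapt the path; this assumes the features file stores the label histogram under a 'labels' key):

```python
# Count the columns of the labels field written by the partition step.
import h5py

with h5py.File('features/train/some_cloud.h5', 'r') as f:  # hypothetical file name
    labels = f['labels'][:]

print(labels.shape)  # second dim should be 6 (5 classes + unlabeled);
                     # 9 means the 8-class Semantic3D setup is still baked in
```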

Jaiy commented 5 years ago

Thanks a lot for your kind help. After I modified the label numbers it ran successfully. But when I trained, the training accuracy rose rapidly to 100% and the loss dropped to 0; when I then tested the model, the predicted results had no label information. I checked "superpoint_graphs/train/xx.h5" and found that the components of sp_labels are all 0 (32-bit unsigned integer, 0). I suspect the label information isn't being written to the .h5? I would appreciate any advice. Also, my data are very sparse, and the labeled points are very sparse too. When applying SPG to sparse data, does that affect the results, and how should I modify the parameters in the code?

loicland commented 5 years ago

Hi,

You could try the visualize function to see whether your test data are correctly labelled. Launch main.py with --epochs -1 on your test dataset; there should be an h5 file in --odir with the superpoint inference results.
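
Something like this, reusing your command from above (adapt the paths; the checkpoint name is an assumption, check what was saved in your --odir):

```bash
# Inference only: --epochs -1 skips training and writes superpoint
# predictions as an h5 file into --odir.
CUDA_VISIBLE_DEVICES=0 python learning/main.py --dataset sema3d \
  --SEMA3D_PATH ~/jwu/pytorch/semantic_3D --db_test_name testred \
  --epochs -1 --resume results/sema_3d/trainval_best/model.pth.tar \
  --odir results/sema_3d/trainval_best
```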

What do you mean by sparse? Sparse acquisition or sparse annotation?

Since your training set is small enough that the training reaches 100%, you should either:

Jaiy commented 5 years ago

Hi, thanks for your kind reply! The other day I checked the data and found a small error in it. I fixed the data and retrained the model; this time it trained successfully and the accuracy rose over time: after one epoch it reached 98%, and after about 300 epochs it reached 100%. But when I tested the model the results were very poor (listed below). I wonder whether I made an error in some step or whether the problem is my dataset itself.

loicland commented 5 years ago

Hi,

In our code the label 0 is used for unlabeled points and treated in a special way (not taken into account in the loss, for example, and not used to determine hard superpoint labels). You should do that too, or most superpoints will get the 'unlabeled' class and produce the kind of prediction above.

How many 50k-point clouds do you have? If only a few, then you should definitely downsize the network, as some serious overfitting seems to be going on here: remove state concatenation, halve the pointnet widths, use vecvec edge filters, etc. See this issue, and let me know if it is still unclear how to proceed; a sketch of the corresponding flags is below.
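
Relative to the training command earlier in the thread, something like this ('gru_10_1_1_1_0' disables state concatenation; the --ptn_widths flag and its value format are from memory, so verify with python learning/main.py --help):

```bash
# Downsized model: no state concatenation, halved pointnet widths.
# '--ptn_widths' and its value format are an assumption -- check --help.
CUDA_VISIBLE_DEVICES=0 python learning/main.py --dataset sema3d \
  --SEMA3D_PATH ~/jwu/pytorch/semantic_3D \
  --model_config 'gru_10_1_1_1_0,f_5' \
  --ptn_widths '[32,32,64,64,128],[128,32]' \
  --odir results/sema_3d/small
```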

Are you satisfied with the partition? Our unsupervised partition step was not designed with a Velodyne 64 in mind, but maybe it does OK?

Can you check the labels of the superpoints in the superpoint_graphs folder? You should make sure that most of the superpoints contain labeled points. You can tell by looking at the labels matrix: the first column is for the unlabelled points, so most lines should have at least some points in the other columns. If it is hard to tell, you can always use the prediction2ply function with the argmax of the superpoints associated to each of their points (using the `in_comp` variable; let me know if you can't figure out how).
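
A quick check along these lines, assuming the 'sp_labels' key you mentioned above and one h5 file per cloud (hypothetical file name):

```python
# Fraction of superpoints that contain at least one labeled point.
import h5py
import numpy as np

with h5py.File('superpoint_graphs/train/some_cloud.h5', 'r') as f:
    sp_labels = np.array(f['sp_labels'])    # shape (n_superpoints, n_classes + 1)

labeled = sp_labels[:, 1:].sum(1) > 0       # any points outside the 'unlabeled' column
print('%d / %d superpoints have labeled points' % (labeled.sum(), len(sp_labels)))
hard_labels = np.argmax(sp_labels[:, 1:], 1) + 1  # hard label per superpoint, ignoring column 0
```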

BB88Lee commented 5 years ago

Hi, I'm meeting similar problems using SPG on a sparse dataset. Each frame of my dataset has about 50k points, but only thousands are labeled. The first picture below is the predicted result (after 100 epochs of training): every point gets a predicted label. The ground truth is shown below it (2nd picture); the grey points are unlabeled, but SPG assigns every unlabeled point to a class. I tried removing state concatenation ('gru_10_1_1_1_0'), vecvec filters, and matrix edge filters; nothing seemed to change.

Is the result expected? I saw you have some special processing for unlabeled data: `node_gt[node_gt_size[:,1:].sum(1)==0,:] = -100  # superpoints without labels are to be ignored in loss computation`
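
If I read that right, -100 matches the default ignore_index of PyTorch's cross-entropy, so those superpoints simply drop out of the loss. A minimal standalone illustration (not your actual loss code):

```python
# Targets equal to -100 are skipped by nn.CrossEntropyLoss (its default ignore_index).
import torch
import torch.nn as nn

criterion = nn.CrossEntropyLoss()          # ignore_index defaults to -100
logits = torch.randn(4, 5)                 # 4 superpoints, 5 classes
target = torch.tensor([2, -100, 0, -100])  # two unlabeled superpoints
print(criterion(logits, target))           # averaged over the 2 labeled entries only
```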

Is that the reason? Could you give me some advice for improving the results? Would it be possible to treat the unlabeled data (label "0") as an extra class (so my dataset would no longer contain unlabeled data) and segment it the same way as the other classes?

(screenshots: predicted result, ground truth)
loicland commented 5 years ago

Hi,

Yes, SPG will try to attach a label to all superpoints. You could put all the unlabelled points in an "other" class (just add 1 to your label array).
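
A minimal sketch of that remapping, assuming your labels sit in a plain integer array (file names hypothetical):

```python
# Shift all labels by 1 so the former 'unlabeled' 0 becomes an ordinary class.
import numpy as np

labels = np.loadtxt('some_cloud.labels', dtype=np.int64)   # hypothetical file name
np.savetxt('some_cloud_shifted.labels', labels + 1, fmt='%d')
```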

As of now, the partition step is not designed for sparse acquisition. It is possible that the network learns how to combine the weirdly shaped superpoints though. Do you have a screenshot of the partition?

BB88Lee commented 5 years ago

Hi,

OK, I will try that, thanks! To distinguish the small objects, I set reg_strength to 0.1 with a 5 cm voxel grid; the exact call is sketched below. This is the partition result for this frame (screenshot below); is it satisfactory?
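
The call was along these lines (flag names as in the repo README; the dataset path placeholder is hypothetical):

```bash
# 5 cm voxel grid, low regularization strength to keep small objects separate.
python partition/partition.py --dataset sema3d --ROOT_PATH $MY_DATASET_DIR \
  --voxel_width 0.05 --reg_strength 0.1
```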

(screenshot: partition)
loicland commented 5 years ago

It's not too bad. If it fails, maybe try a slightly larger reg_strength. Keep me posted; I am very curious to see how it does on Velodyne.

A new supervised partition step will be released in a couple of months, able to overcome the problems of sparse scanning.