Closed DrFridolin closed 5 years ago
Hi,
Looking at your first illustrations, a couple of thoughts:
learning/sema3d_dataset.py, which normalized the elevation.
I am now downloading your .ply to check whether I can get a better partition / semantic segmentation. I am sure the results can be improved somewhat, but do not expect great results before fine-tuning: the data sets AND the acquisition methods are different; it is a lot to ask of an overparameterized model!
Thank you for your time and all the tips. Looking forward to your comments when you get a chance to see the ply files.
Some immediate questions:
Answers to your questions, in order:
If you really don't have a lot of annotated data, I can recommend the following approach:
use a model trained with --model_config 'gru_10_1_1_1_0,f_8' (see learning/graphnet.py) and fine-tune it using your data. This will help the network adapt to your data while keeping the hard-earned feature extraction layers. It will not be perfect (especially since your data is so different from Semantic3D), but it could help as a first step.
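Concretely, the fine-tuning recipe above might look like the command below. This is only a sketch: the --resume, --lr, and --epochs flags and the checkpoint path are assumptions to be checked against the options defined in learning/main.py.

```shell
# Sketch only: flag names and paths are assumptions; check learning/main.py.
python learning/main.py --dataset sema3d \
  --model_config 'gru_10_1_1_1_0,f_8' \
  --resume path/to/pretrained/model.pth.tar \
  --lr 0.001 --epochs 100
```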
Hi Loic,
I hope you don't mind keeping this thread open so we can bounce ideas during this process.
I have decreased the reg_strength all the way down to 0.3 and added color features to the partitioning step. The results of the partitioning improved significantly. The predictions still need to be better, so we also started annotating our point cloud data and working towards an annotated training set of comparable size to Semantic3D.
I will post more once we are in the process of training your network from scratch. Our format and scenes are similar to Semantic3D scenes (although denser and more surface-like compared to LIDAR points, which scatter more). Our data was collected with a drone, so the z values are also different, but they can easily be offset. We are targeting the same 8 categories as Semantic3D as well. So I was planning to keep your training instructions for Semantic3D almost unchanged, aside from the lower reg_strength and adding color to the partitioning step (I think the PointNet embeddings ingest color by default, so it looks like no changes will be necessary there). Now that you've had a chance to see what our data looks like, do you have any other suggestions regarding what we should change during the training process? We are planning to train from scratch rather than fine-tuning.
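For reference, wiring colors into the partition features can be sketched as follows. This is a minimal sketch with made-up shapes; the names geof and rgb only loosely mirror partition/partition.py, and the real code may weight the channels differently.

```python
import numpy as np

# Hypothetical inputs: 4 geometric features per point and RGB in [0, 255].
n_points = 5
geof = np.random.rand(n_points, 4).astype('float32')
rgb = np.random.randint(0, 256, (n_points, 3))

# Rescale colors to [0, 1] and stack them next to the geometric features,
# so cut pursuit also penalizes color discontinuities.
features = np.hstack((geof, rgb / 255.0)).astype('float32')
print(features.shape)  # (5, 7): 4 geometric + 3 color channels
```

Four geometric features plus three color channels yield an observation of dimension 7 per point.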
Also, is there a way for your code to output the superpoint partition IDs for each point, in a similar way to how you output predicted labels for each point? Currently I am trying to run partition/visualize with upsample 1, then I parse the colors in the ply file, but it is a bit cumbersome to do that.
Thanks for all the help.
Hi,
Always happy to hear about people using the code, and helping if I can. If your data is similar to semantic3d but cleaner then it should work fine. Do not worry too much about the z value if you are training from scratch, but do make sure that it is consistent from one point cloud to another.
Something you should try is whether matrix-vector (--model_config gru_10) or vector-vector (--model_config gru_10_0) filters work better. If your data set is large enough, m-v is more expressive, but it tends to overfit more. Also, if you want to test the framework before your dataset reaches its full size, you can decrease the PointNet layer sizes without harming performance too much, while steeply decreasing the number of parameters.
I think PointNet embeddings ingest color by default
Correct
Also, is there a way for your code to output the superpoint partition IDs for each point, in a similar way to how you output predicted labels for each point? Currently I am trying to run partition/visualize with upsample 1, then I parse the colors in the ply file, but it is a bit cumbersome to do that.
So you want to output a ply with a scalar field corresponding to the partition id (and not a random color for each partition), upsampled to your original pre-pruning point clouds, do I understand correctly?
If you are satisfied with just associating each point with the partition of its closest neighbor in the pruned cloud, you could just use the interpolate_labels function with the ids as labels. Note however that currently, in the visualize function, --upsample only applies to the prediction; you would need to change that. It should be quite straightforward, however.
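For instance, associating each original point with the partition id of its nearest pruned point can be sketched like this (a toy sketch with hypothetical names: xyz_pruned and in_component stand in for the pruned cloud and its per-point component ids; a KD-tree would replace the brute-force search on real data):

```python
import numpy as np

# Pruned cloud and the partition id of each of its points.
xyz_pruned = np.array([[0.0, 0.0, 0.0],
                       [1.0, 0.0, 0.0],
                       [0.0, 1.0, 0.0]])
in_component = np.array([0, 1, 2])

# Full-resolution cloud, pre-pruning.
xyz_full = np.array([[0.1, 0.0, 0.0],
                     [0.9, 0.1, 0.0],
                     [0.0, 1.1, 0.0]])

# For each full-resolution point, find its nearest pruned point
# and copy that point's partition id (brute force for clarity).
d2 = ((xyz_full[:, None, :] - xyz_pruned[None, :, :]) ** 2).sum(axis=2)
full_ids = in_component[d2.argmin(axis=1)]
print(full_ids)  # [0 1 2]
```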
And if you want to output the id as a scalar field instead of a random color, you can paste this function into partition/provider (it will make it into an upcoming commit):
import numpy as np
from plyfile import PlyData, PlyElement

def scalar2ply(filename, xyz, scalar):
    """write a ply with an unsigned integer scalar field"""
    prop = [('x', 'f4'), ('y', 'f4'), ('z', 'f4'), ('scalar', 'u4')]
    vertex_all = np.empty(len(xyz), dtype=prop)
    for i in range(0, 3):
        vertex_all[prop[i][0]] = xyz[:, i]
    vertex_all[prop[3][0]] = scalar
    ply = PlyData([PlyElement.describe(vertex_all, 'vertex')], text=True)
    ply.write(filename)
Thanks for the tips. Quick question: Where can I reduce the size of PointNet layers?
Hi,
You can set all parameters with the options described in learning/main.py.
In particular, check --ptn_widths (the sizes of the MLPs in PointNet; check the default value for the format: before and after the max pool). You can also decrease --ptn_widths_stn (the input transformer of PointNet) and --fnet_widths (the edge filter network).
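Put together, a reduced-capacity run might look like the command below. The widths are illustrative only (not the defaults), and the exact expected format should be checked against the argument definitions in learning/main.py.

```shell
# Illustrative values, not the defaults: shrink the PointNet MLPs
# (before/after the max pool), the spatial transformer, and the
# filter-generating network.
python learning/main.py --dataset sema3d \
  --ptn_widths '[[32,64,64],[64,32]]' \
  --ptn_widths_stn '[[16,32],[32,16]]' \
  --fnet_widths '[32,64,64]'
```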
Hello Loic,
We are currently in the process of running superpoint partitioning on a small batch of our annotated training set, in preparation for re-training the model.
For most scenes, the partitioning step runs OK, but for one of the tiles, it seems to be returning a single component, and then crashing because there are no edges in the graph.
The specific error message is copy-pasted below. I have also attached screenshots of the scene, which has a weird geometry because it is far away from the data collecting drone. Did you run into this issue before, or do you have any guesses as to what might be going wrong?
8 / 8---> Tile_station008009_xyz_intensity_rgb
reading the existing feature file...
computing the superpoint graph...
minimal partition...
L0-CUT PURSUIT WITH L2 FIDELITY
PARAMETERIZATION = FAST
Graph 41666150 vertices and 999987552 edges and observation of dimension 7
Iteration 1 - 1 components - Saturation 100.0 % - Quadratic Energy 601.325 % - Timer 692.367
All components are saturated
computation of the SPG...
Traceback (most recent call last):
File "partition/partition.py", line 200, in
That is very weird; I have never encountered it before. The fact that 1 component gives a quadratic energy of 601.325 % tells me something is going wrong with cut pursuit.
Again, we designed the partition step with LiDAR in mind and not SLAM. We will soon (in a few weeks) release a new, faster, multithreaded version of cut pursuit, which sharply increases the speed at which you can run the partition. In a couple of months we will also release a fully learnable partition step, which would allow your model to learn SLAM-specific features and provide better partitions.
In the meantime, this seems like a bug. Could you output a view of the geometric features (visualize.py with --output_type f) and post an image here?
Hi Loic,
The updates you announced are exciting; currently, partitioning seems to be the bottleneck, so we are looking forward to the faster, multithreaded, learnable version :)
I'd be happy to help you debug this issue. I already tried to write out the color-coded features, but this crash happens before anything is written into the superpoint_graphs folder, so partition/visualize cannot write anything for this scene. Let me know if there is a workaround, because the feature file itself is actually written.
I will try cropping different areas from this point cloud and re-run the partition code. I can also send you the entire point cloud if you'd like to take a look at it yourself.
In case it's relevant, I've added color channels as features to the partitioning step, but I'm otherwise following the Semantic3D conventions.
If the features are computed, you can just call geof2ply('my_test_file.ply', xyz, geof) to write the feature file.
Your file seems quite heavy, let's hope it doesn't come to you sending it to me! But if need be we'll do that.
8 / 8---> Tile_station008009_xyz_intensity_rgb
reading the existing feature file...
computing the superpoint graph...
minimal partition...
L0-CUT PURSUIT WITH L2 FIDELITY
PARAMETERIZATION = FAST
Graph 41666150 vertices and 999987552 edges and observation of dimension 7
Iteration 1 - 1 components - Saturation 100.0 % - Quadratic Energy 601.325 % - Timer 692.367
All components are saturated
computation of the SPG...
Traceback (most recent call last):
  File "partition/partition.py", line 200, in <module>
    graph_sp = compute_sp_graph(xyz, args.d_se_max, in_component, components, labels, n_labels)
  File "../repositories/SuperPointGraphs/partition/graphs.py", line 81, in compute_sp_graph
    edges = np.unique(edges, axis=1)
  File "../anaconda3/lib/python3.6/site-packages/numpy/lib/arraysetops.py", line 230, in unique
    ar = ar.reshape(orig_shape[0], -1)
ValueError: cannot reshape array of size 0 into shape (0,newaxis)
I met the same issue. I'm using airborne laser scanning data over a very large area, so I cropped the dataset into blocks. The partition crashed when a block contains only ground points without any other objects. It is actually reasonable to have a single superpoint there, but the code fails to build the superpoint graph. Looking forward to your updated version :)
Here is the error message:
reading the existing feature file...
computing the superpoint graph...
minimal partition...
L0-CUT PURSUIT WITH L2 FIDELITY
PARAMETERIZATION = FAST
Graph 416617 vertices and 9998760 edges and observation of dimension 4
Iteration 1 - 1 components - Saturation 100.0 % - Quadratic Energy 89.437 % - Timer 37.5906
All components are saturated
computation of the SPG...
Traceback (most recent call last):
File "partition/partition.py", line 216, in
Hi,
it seems like SPG crashes when there is only one superpoint. This could be avoided by detecting when such an occurrence arises and completely bypassing the contextual segmentation. I can help you with that.
But first, I want to make sure that the fact that there is only one superpoint is on purpose. Could you post the following:
a view of the geometric features (--output_type f in visualize.py)
Hi Loic,
Thank you very much for your answer. The scene is in a rural area. Here is the feature screenshot.
Here is an excerpt of the corresponding features file:
[[0.22490238 0.7597524  0.01534523 0.01008638]
 [0.2733335  0.7114006  0.0152659  0.00843619]
 [0.27333382 0.7114003  0.01526591 0.00843619]
 [0.20107405 0.7833601  0.01556581 0.0104016 ]
 [0.14636141 0.8379407  0.01569791 0.01160373]
 [0.2707863  0.7139063  0.01530742 0.00557747]
 [0.31228936 0.67303705 0.01467359 0.0048687 ]
 [0.24014749 0.7442654  0.01558712 0.00814037]
 [0.3344242  0.6507055  0.0148703  0.00812188]
 [0.3840217  0.6007434  0.01523489 0.00656607]
 [0.2304871  0.75469035 0.01482255 0.01172773]
 [0.23048721 0.7546903  0.01482253 0.01172775]
 [0.11362143 0.87059796 0.01578066 0.01244285]
 [0.12132972 0.86253    0.01614034 0.01190102]
 [0.25391194 0.7315044  0.01458369 0.01286976]
 [0.16627994 0.81816655 0.0155535  0.01080751]
 [0.10519965 0.8789818  0.01581849 0.01142147]
 [0.133542   0.8504367  0.01602132 0.01051087]
 [0.07029666 0.9137081  0.01599527 0.01088359]
 [0.04670327 0.93577826 0.01751849 0.01086452]]
And there is no Inf or NaN in the feature values.
Right, so you do want only one superpoint. I did not anticipate this need, and it might indeed break SPG at several points. I don't have time to adapt the code right now, but if you are willing to, I can help you.
Basically, you need to add a special case everywhere a zero edge count creates a bug. You can look at line 214 of partition/graphs.py to see how I dealt with the special case of superpoints of size one or two.
Sorry for the not-so-satisfying answer! I will try to fix this in the next version, coming in a couple of months.
Alternatively, a dirty fix that would work if you are in a hurry: simply detect when there is only one component (line 177 of partition.py) and put a random point in its own component. If you go this route I can help you as well; it should only be a couple of lines.
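The dirty fix above can be sketched in a few lines. This is only a sketch: in_component and components mimic the variables around line 177 of partition.py, but the names and exact shapes are assumptions.

```python
import numpy as np

# Toy partition result: cut pursuit returned a single component.
in_component = np.zeros(6, dtype='uint32')   # component id of each point
components = [list(range(6))]                # points belonging to each component

# If everything landed in one component, split one point off into its
# own component so the superpoint graph has at least two nodes.
if len(components) == 1:
    lone = components[0].pop()   # take an arbitrary point out...
    components.append([lone])    # ...and make it its own component
    in_component[lone] = 1

print(len(components), in_component)  # 2 [0 0 0 0 0 1]
```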
Thank you very much for your suggestions! I will try them.
Hi Loic,
I've been working on segmenting our own colorized point cloud data using your superpoint graph approach. Since annotating some of our own data will be time consuming (we will eventually do it), I first wanted to see how far I could get with the model pretrained on Semantic3D dataset. As of now, I'm able to get reasonable segmentation results, but there is still room for improvement on the accuracy. I just wanted to get your suggestions on whether any further improvement could be possible with quick fixes.
I went through all the relevant previous issues raised here on github to avoid wasting your time with tips you've already given others. Below is a list of changes I made on our data to make best possible use of Semantic3D model:
Made sure the point clouds are sufficiently dense and that the density levels match Semantic3D data. Our scenes are much larger so I initially had to make them sparse enough for segmentation as a whole. Those results were not very good, so instead I've tiled the dense data and I'm processing it piece by piece. There are around 16 million points per input subset.
Manually cleaned up outlier points below the ground level. Our data is obtained by a structure-from-motion algorithm on image sequences collected by a drone, so the nature of the outlier points is different from LIDAR collects. The performance was noticeably improved on the clean point clouds.
Made sure that z axis points up in our dataset, that the units are in meters and building heights are not drastically different from those in Semantic3D.
Tried offsetting the z values so that the ground level is around 0 and all z values in each point cloud are positive. This improved performance in certain locations, and made it worse in others (I have some examples for both).
Made sure our data is in the same format as Semantic3D data; the changes I made to the code mostly consist of fixing paths and filenames. I'm using the same features, the same number of classes, the same voxel size and reg_strength, the same colorization scheme, etc.
Looked at the _geof files to verify, to the best of my ability, that the color-coded geometric features made sense compared to those in Semantic3D data. Also verified that superpoint partitioning looks reasonable.
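The z-offsetting mentioned in the list above can be made robust by using a low percentile rather than the minimum, so a few stray points below ground do not drag the offset down. A small sketch (the 5th percentile is an arbitrary choice):

```python
import numpy as np

# Toy tile: two points near ground level and one outlier below ground.
xyz = np.array([[0.0, 0.0, 103.2],
                [1.0, 0.0, 104.0],
                [2.0, 1.0, 98.0]])

# Estimate the ground level with a low percentile and shift z so the
# ground sits near 0; outliers end up slightly negative instead of
# shifting the whole tile.
ground = np.percentile(xyz[:, 2], 5)
xyz[:, 2] -= ground
```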
Most noticeable issues in the results we get: