msy22 opened this issue 6 years ago
The first two tiers of clouds can be created using Blender, which can export point clouds with vertex normals
So the idea is that we trust Blender's normals, which are also the result of a computational step? Are these mesh models in which there are actual surfaces, i.e., the normals can be accurately computed? I'm gonna assume so, since Blender is not "point cloud domain".
The third tier is harder. I'm looking into a Unity-based simulation that could potentially simulate LiDAR scans in a known environment, and export them with the ground-truth normals. But if that falls through I have a lot of data from a survey-grade total station that I could calculate normals for and then decimate to make it look like a LiDAR scan.
This for me is a "nice to have" but not mandatory. I would rather have the benchmark for different solutions to our normals problem with just the first two tiers, and after that, if you're still up for it, we can have a look at this third tier.
So the idea is that we trust Blender's normals, which are also the result of a computational step?
You make a good point, and that's one of the reasons why I want to start with primitives whose normals can be visually inspected easily and are pretty obvious (e.g. the normals of a flat plane are clearly wrong if they aren't perpendicular to the plane). As for more complex shapes, I think at some level we have to trust someone/something's normal calculations, since manually calculating and verifying each normal in a given point cloud isn't really practical. But your core point is still valid, and it would be worth looking up how Blender calculates its normals (I assume from the mesh).
Are these mesh models in which there are actual surfaces, i.e., the normals can be accurately computed? I'm gonna assume so, since Blender is not "point cloud domain".
Yes, Blender seems to work with meshes as the primary medium, and when exporting point clouds it simply generates the points from the mesh vertices.
This for me is a "nice to have" but not mandatory. I would rather have the benchmark for different solutions to our normals problem with just the first two tiers, and after that, if you're still up for it, we can have a look at this third tier.
Unfortunately, one of the conditions of me getting to spend time solving this issue is that it contributes to a paper in some way (I'm a PhD student). And my primary motivation for doing all this is to improve the accuracy of scan registration for the LiDAR scans I get from my robot. If I can quantify the effect this issue has on real-world data, that will hopefully also help other PCL users with similar applications to mine.
In my opinion, the choice of test data is secondary. First we need to understand what we want to test.
Normal estimation is a pretty straightforward algorithm. For each point find its neighborhood, compute the covariance, extract the normal vector, done. Therefore, I only see three points that can be tested: the neighborhood search, the normal computation for a single point given its neighborhood, and the estimation class as a whole.
Number one is clearly outside of the scope, but the other two are legit goals. Now let's think about what specifically we want to test in each case.
Input: a bunch of points; output: an eigenvector. Here we create a surface with known curvature, select a point, and sample its neighborhood. Things to test: robustness to perturbations in the point coordinates, robustness to sparseness, and robustness to the magnitude of the coordinates (our original issue). Note that there is no search involved and no concept of a scene/object! And we do not need Blender here; test clouds can be created at test runtime.
Input: a point cloud; output: a cloud with normals. Here we are testing that the class as a whole works: that it does not segfault if the input point cloud is empty, that it allows computing normals only for a subset of points (by providing indices), that it computes some normals, etc. Note that we do not need to be concerned with the quality of the normals here, because that was tested separately.
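To make the second case concrete, the unit under test would be something along these lines (a minimal sketch using plain Eigen, no PCL search involved; the function name is just illustrative): accumulate the covariance of a fixed neighborhood and take the eigenvector of the smallest eigenvalue as the normal.

#include <Eigen/Dense>
#include <vector>

// Illustrative only: estimate a surface normal from a pre-selected neighborhood
// by taking the eigenvector of the covariance matrix with the smallest eigenvalue.
Eigen::Vector3d estimateNormal(const std::vector<Eigen::Vector3d>& neighborhood)
{
  Eigen::Vector3d mean = Eigen::Vector3d::Zero();
  for (const auto& p : neighborhood)
    mean += p;
  mean /= static_cast<double>(neighborhood.size());

  Eigen::Matrix3d covariance = Eigen::Matrix3d::Zero();
  for (const auto& p : neighborhood)
  {
    const Eigen::Vector3d d = p - mean;  // de-meaned point
    covariance += d * d.transpose();
  }

  // SelfAdjointEigenSolver sorts eigenvalues in increasing order,
  // so column 0 is the direction of least variance, i.e. the normal.
  Eigen::SelfAdjointEigenSolver<Eigen::Matrix3d> solver(covariance);
  return solver.eigenvectors().col(0);
}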
Now, after clearing this up, we can make a more informed choice of data.
Now, after clearing this up, we can make a more informed choice of data.
So you're essentially suggesting that we start by getting everything working with some totally artificial clouds, like the hard-coded ones in pcl/test/features/test_normal_estimation.cpp? And then once we've got that working, re-work the tests to function on proper point clouds?
As for the tests you're suggesting:
Would it be best just to focus on number 3 initially, since that answers the problem that started this all? It could be done pretty quickly and simply once I understand how the testing process works, with the computed values checked against expected ones using an EXPECT_NEAR assertion. As for understanding how all this works, I understand that the code which creates the point clouds and runs the tests is written in ~/tests/test_mean_and_covariance.cpp, and that the specific functions we want to test are written in the ~/src/mean_and_covariance.hpp header used by the tests. But what generates the output I see when running make tests? I.e.:
test 1
Start 1: none
1: Test command: /home/matt/pcl-mean-and-covariance/build/test/test_mean_and_covariance_none
1: Test timeout computed to be: 10000000
1: [==========] Running 0 tests from 0 test cases.
1: [==========] 0 tests from 0 test cases ran. (0 ms total)
1: [ PASSED ] 0 tests.
1/4 Test #1: none ............................. Passed 0.00 sec
Is this a boilerplate template created by RUN_ALL_TESTS()? Or is that written somewhere in a file I haven't found yet?
Unfortunately, one of the conditions of me getting to spend time solving this issue is that it contributes to a paper in some way (I'm a PhD student). And my primary motivation for doing all this is to improve the accuracy of scan registration for the LiDAR scans I get from my robot. If I can quantify the effect this issue has on real-world data, that will hopefully also help other PCL users with similar applications to mine.
That is ok. I just want to make sure that the simple cases are covered first and that those are working properly.
Here we create a surface with known curvature, select a point, and sample its neighborhood.
Quick reminder that the definition of curvature is very context-specific; see the first paragraph of the Wikipedia article. For the record, the notion of curvature in this case will be related to the sum of the squared distances between all points and the hypothetical plane that best fits them. This is usually the lowest eigenvalue of the covariance matrix.
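To spell out the relation being referred to (a standard derivation, nothing specific to this thread): for a neighborhood p_1, ..., p_N with centroid \bar{p}, the sum of squared distances to the best-fitting plane through the centroid equals the smallest eigenvalue of the scatter/covariance matrix,

\[
  \min_{\|n\|=1} \sum_{i=1}^{N} \bigl(n^{\top}(p_i - \bar{p})\bigr)^{2}
  = \min_{\|n\|=1} n^{\top} C\, n = \lambda_{0},
  \qquad C = \sum_{i=1}^{N} (p_i - \bar{p})(p_i - \bar{p})^{\top},
\]

and the minimizing n is exactly the eigenvector that normal estimation returns.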
But what generates the output I see when running make tests?
It's CTest launching all google tests. The tests get registered onto it here. The target tests is defined here.
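For what it's worth, the [==========] and [ PASSED ] lines in that output are printed by googletest itself; CTest just runs each registered test binary and prefixes its output with the test number. A minimal sketch of such a binary (not the actual PCL/mean_and_covariance harness; the test name is made up):

#include <gtest/gtest.h>

// A throwaway test case; a binary with no TEST() at all would print
// "Running 0 tests from 0 test cases", exactly like the log above.
TEST(MeanAndCovariance, Sanity)
{
  EXPECT_NEAR(1.0, 1.0 + 1e-9, 1e-6);
}

int main(int argc, char** argv)
{
  testing::InitGoogleTest(&argc, argv);
  return RUN_ALL_TESTS();  // prints the [==========] / [ PASSED ] summary
}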
That is ok. I just want to make sure that the simple cases are covered first and that those are working properly.
I absolutely agree, so I'll start with basic primitives, as I still think that's a better place to start than manually defining points.
It's CTest launching all google tests. The tests get registered onto it here. The target tests is defined here.
Ah ok, so all that stuff about none/native/sse2/no-sse2 is all automated? And each test I specify in test_mean_and_covariance.cpp will run through those four test blocks?
I'm still mucking about with the code, trying out a few test examples. Once I've got that working I'll start writing some proper basic tests, maybe just for basic normal calculation and then issue a PR for that so you guys can make sure I'm on the right track.
Would it be best just to focus on number 3 initially, since that answers the problem that started this all?
I would say let's focus exclusively on testing mean/covariance computation, without the normals part. Input: a set of points, output: mean and covariance matrix. In the test the sets of points are randomly generated, and there is a gold-standard algorithm implemented for covariance computation (I think two-pass double precision will suit). Then the results produced by other approaches that we will implement can be compared to this "ground truth".
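As a rough sketch of what that test could look like (referenceMeanAndCovariance, the tolerance and the random cloud are placeholders, not existing PCL code): the two-pass double-precision computation acts as the golden standard and each candidate implementation is compared element-wise with EXPECT_NEAR.

#include <gtest/gtest.h>
#include <Eigen/Dense>
#include <random>
#include <vector>

// Two-pass, double-precision reference ("golden standard"): mean first, then covariance.
static void referenceMeanAndCovariance(const std::vector<Eigen::Vector3d>& pts,
                                       Eigen::Vector3d& mean, Eigen::Matrix3d& cov)
{
  mean.setZero();
  for (const auto& p : pts)
    mean += p;
  mean /= static_cast<double>(pts.size());

  cov.setZero();
  for (const auto& p : pts)
  {
    const Eigen::Vector3d d = p - mean;
    cov += d * d.transpose();
  }
  cov /= static_cast<double>(pts.size());
}

TEST(MeanAndCovariance, CandidateMatchesTwoPassReference)
{
  // Random cloud with a large offset, to exercise the coordinate-magnitude problem.
  std::mt19937 rng(42);
  std::uniform_real_distribution<double> coord(100000.0, 100001.0);
  std::vector<Eigen::Vector3d> pts;
  for (int i = 0; i < 1000; ++i)
    pts.emplace_back(coord(rng), coord(rng), coord(rng));

  Eigen::Vector3d ref_mean, mean;
  Eigen::Matrix3d ref_cov, cov;
  referenceMeanAndCovariance(pts, ref_mean, ref_cov);

  // Placeholder: call the implementation under test here instead of the reference.
  referenceMeanAndCovariance(pts, mean, cov);

  for (int r = 0; r < 3; ++r)
    for (int c = 0; c < 3; ++c)
      EXPECT_NEAR(ref_cov(r, c), cov(r, c), 1e-9);
}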
I would say let's focus exclusively on testing mean/covariance computation, without the normals part. Input: a set of points, output: mean and covariance matrix.
Ok, fair enough. That's something that's achievable in the next week, which we can build on, so I'll re-focus on that. The only variations we have on computing the mean and covariance are the single-pass and two-pass algorithms, right?
Also with and without "de-meaning" (in quotes because it's not exact). Plus, the algorithms are templated and thus can be instantiated in single or double precision.
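For contrast with the two-pass reference, here is a sketch of the single-pass variant without de-meaning (again purely illustrative): it accumulates the raw sums in one sweep and forms cov = E[p p^T] - mean * mean^T at the end, and that final subtraction of two large, nearly equal quantities is where precision is lost when coordinates are far from the origin, especially when the template is instantiated with single precision.

#include <Eigen/Dense>
#include <vector>

// Single-pass covariance without de-meaning: one sweep over the data, then
// cov = E[p p^T] - mean * mean^T. The subtraction is prone to catastrophic
// cancellation for point clouds far from the origin.
template <typename Scalar>
Eigen::Matrix<Scalar, 3, 3>
singlePassCovariance(const std::vector<Eigen::Matrix<Scalar, 3, 1>>& pts)
{
  Eigen::Matrix<Scalar, 3, 1> sum = Eigen::Matrix<Scalar, 3, 1>::Zero();
  Eigen::Matrix<Scalar, 3, 3> sum_outer = Eigen::Matrix<Scalar, 3, 3>::Zero();
  for (const auto& p : pts)
  {
    sum += p;
    sum_outer += p * p.transpose();
  }
  const Scalar n = static_cast<Scalar>(pts.size());
  const Eigen::Matrix<Scalar, 3, 1> mean = sum / n;
  return sum_outer / n - mean * mean.transpose();
}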
I'm reading through the PCL code in the pcl/test folder, and I see that there are a number of example point clouds in there. I also see that there is an existing test for PCL point normals in the file test_normal_estimation.cpp, but this file either specifies the exact points or loads one of the pre-existing point clouds: bun0.pcd, which appears to be one of the original scans of the Stanford Bunny. Specifying exact points doesn't scale well, and the Stanford Bunny seems like a limited example by itself and doesn't contain any ground-truth normals. So I'd like to propose creating a folder of purpose-built point clouds that can be loaded and run through the test framework. I'm initially thinking of 15 or more clouds spread across three tiers, ranging from simple geometric primitives up to realistic simulated LiDAR scans.
In each case, the cloud will contain XYZ points and ground-truth normals. So rather than comparing the calculated normals to hard-coded values, you can load the cloud, copy it without normals, re-calculate them using the test functions, and then compare the calculated normals in the second (copied) cloud to the ground-truth normals in the first (original) cloud.
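A rough sketch of that workflow (assuming the standard pcl::NormalEstimation API; the file name and search parameters are placeholders):

#include <pcl/point_types.h>
#include <pcl/io/pcd_io.h>
#include <pcl/common/io.h>
#include <pcl/features/normal_3d.h>
#include <pcl/search/kdtree.h>

int main()
{
  // Load a purpose-built cloud that already stores ground-truth normals.
  pcl::PointCloud<pcl::PointNormal>::Ptr truth(new pcl::PointCloud<pcl::PointNormal>);
  pcl::io::loadPCDFile("plane_with_normals.pcd", *truth);  // hypothetical test cloud

  // Copy only the XYZ data, discarding the stored normals.
  pcl::PointCloud<pcl::PointXYZ>::Ptr xyz(new pcl::PointCloud<pcl::PointXYZ>);
  pcl::copyPointCloud(*truth, *xyz);

  // Re-estimate the normals with the code under test.
  pcl::NormalEstimation<pcl::PointXYZ, pcl::Normal> ne;
  pcl::search::KdTree<pcl::PointXYZ>::Ptr tree(new pcl::search::KdTree<pcl::PointXYZ>);
  ne.setInputCloud(xyz);
  ne.setSearchMethod(tree);
  ne.setKSearch(10);
  pcl::PointCloud<pcl::Normal> estimated;
  ne.compute(estimated);

  // Compare estimated[i] against the ground truth in truth->points[i]
  // (e.g. with EXPECT_NEAR in a googletest), allowing for a sign flip of the normal.
  return 0;
}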
The first two tiers of clouds can be created using Blender, which can export point clouds with vertex normals.
The third tier is harder. I'm looking into a Unity-based simulation that could potentially simulate LiDAR scans in a known environment, and export them with the ground-truth normals. But if that falls through I have a lot of data from a survey-grade total station that I could calculate normals for and then decimate to make it look like a LiDAR scan.
What are everyone's thoughts on this? I'll also open another issue to discuss the types of tests I could implement once I've finished reading the googletest documentation.