PointCloudLibrary / pcl

Point Cloud Library (PCL)
https://pointclouds.org/

PCL point normals not always calculated correctly in sparse clouds #2403

Open msy22 opened 6 years ago

msy22 commented 6 years ago

Context

I'm currently trying to find a point cloud registration method that can accurately match relatively sparse 3D point clouds from one Velodyne HDL-32E. At the moment I am using PCL's implementation of Generalized ICP to match several sequential scans similar to the one shown below.

cloud1

(This is one scan of a carpark outside an office building with floor-to-ceiling glass windows, hence the few LiDAR returns from the wall).

I.e. I take one scan like this and try to register it to a scan taken 1-2 cm further forward. Now, matching sequential scans does work in some areas but not in others. Since GICP depends on good normal calculations I've been inspecting some clouds in the PCL visualizer (shown in this tutorial) and I noticed that in some places the normals are not being calculated correctly at all.

Expected Behavior

The normals in most parts of the clouds should be calculated correctly. Or if they aren't, it's obviously because the data is too sparse in that region of the point cloud.

Current Behavior

The normal directions are mostly correct, but in some places where there is plenty of data they are not. The most obvious example is shown below, where the normals all point more or less upwards in the flower-bed part of the wall. This is incorrect: the normals should be orthogonal to the wall and point out horizontally:

normal_wall_example2

This is the case in this specific area pretty much regardless of whether I use a KNearest or Radius point search, and regardless of the values for K or for the radius.

Code to Reproduce

Try the cloud in question: carpark_cloud.ply

in the PCL visualizer (shown in this tutorial)

Possible Solution

I have two theories on what the cause of the problem is.

  1. The data itself is the problem. Because the Velodyne has an error of +/- 2 cm, any given scan line is not a perfect line but has a thickness, whose maximum width will be about 4 cm (assuming the max +/- 2 cm error). This means that each scan line actually generates its own plane of points, and at close ranges there may be enough points packed tightly together in one line that this overrules the plane fitting and fits a plane to the scan line rather than the underlying topography. Hopefully this diagram accurately illustrates the problem:

lidar_issue

  2. There is a problem or bug with PCL's plane fitting code. This was suggested by a colleague, who pointed out that because the normals are all pointing upwards, that might be the default output when the plane fitting fails. I've tracked the normal calculation code down to pcl::eigen33, where the smallest eigenvalue is found and its associated eigenvector is written to the nx, ny and nz normal coordinates, but I haven't gotten any further than that yet. This theory is the reason I'm raising the issue here and not as a general question on the mailing list.
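
For reference, here is a rough, self-contained sketch of the chain I'm describing in point 2, as far as I understand it (single precision, and only an approximation of what NormalEstimation actually does internally, so don't treat it as the real implementation):

```C++
#include <vector>
#include <pcl/point_types.h>
#include <pcl/point_cloud.h>
#include <pcl/common/centroid.h>   // pcl::computeMeanAndCovarianceMatrix
#include <pcl/common/eigen.h>      // pcl::eigen33

// Rough sketch of what happens per query point inside NormalEstimation:
// 1. accumulate the mean and covariance of the neighbourhood (single precision),
// 2. take the eigenvector of the smallest eigenvalue of that covariance as the normal.
void
estimateNormalSketch (const pcl::PointCloud<pcl::PointXYZ> &cloud,
                      const std::vector<int> &neighbor_indices,
                      Eigen::Vector3f &normal)
{
  Eigen::Matrix3f covariance_matrix;
  Eigen::Vector4f xyz_centroid;
  pcl::computeMeanAndCovarianceMatrix (cloud, neighbor_indices,
                                       covariance_matrix, xyz_centroid);

  float smallest_eigenvalue = 0.0f;
  pcl::eigen33 (covariance_matrix, smallest_eigenvalue, normal);
}
```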

So my question is: Which of these two theories is likely to be correct, and are there any suggestions on what I can do to resolve it?

SergioRAgostinho commented 6 years ago

Hey. Here's a couple of comments/ideas.

This is incorrect: the normals should be orthogonal to the wall and point out horizontally

The points inside the highlighted region appear to be distributed as lines. There might be a chance that during normal estimation, your neighborhood search parameters only allow it to establish a neighborhood which is approximately a line, yielding unreliable normals. Can you randomly select a point in the region you selected and highlight the points that are being used to estimate the normal?

  2. There is a problem or bug with PCL's plane fitting code. This was suggested by a colleague, who pointed out that because the normals are all pointing upwards, that might be the default output when the plane fitting fails. I've tracked the normal calculation code down to pcl::eigen33, where the smallest eigenvalue is found and its associated eigenvector is written to the nx, ny and nz normal coordinates, but I haven't gotten any further than that yet. This theory is the reason I'm raising the issue here and not as a general question on the mailing list.

There's a long-standing known problem caused by precision issues. You can read more about it in #560 and #1407. I believe we also pushed a couple of PRs to master which affect covariance and eigenvalue/vector computation. You should definitely try the master branch if you can, just to see whether you observe different behavior.

Give it a try and report results. :shipit:

msy22 commented 6 years ago

Thanks very much for the help and the suggestions Sergio! I really appreciate it.

First off, the point selection: this is something I've been worried about. So I spent some time modifying some of my code to return the selected indices and colour them, which for this specific cloud yields:

point_selection_comparison

And the result is more or less the same, regardless of the input parameters. In the picture with the normals shown (the one where the cloud is green instead of white) I used a search radius of 0.3 m, and as you can see from the image just above, there should be more than enough points across multiple scan lines to produce an accurate normal.

msy22 commented 6 years ago

As for the second suggestion...

Issue #560 talks about the issue and at the bottom, gcasey says that PR #1407 should fix it. But it looks like that PR hasn't been merged with the main branch? And appears to be failing a build check?

It also looks like both you and Logan Byers proposed solutions to the problem in the comments of PR #1407, and that in Logan's case he found the two-pass solution wasn't robust/accurate enough for him. So I'll try his solution and see if I can get it to work...

For the record I did (re-) install PCL 1.8.1 from source (the main trunk) less than two months ago, so it is relatively new.

SergioRAgostinho commented 6 years ago

I used a search radius of 0.3 m, and as you can see from the image just above, there should be more than enough points across multiple scan lines to produce an accurate normal.

That's our baseline then and I agree that that should work to produce a decent normal.

Issue #560 talks about the issue and at the bottom, gcasey says that PR #1407 should fix it. But it looks like that PR hasn't been merged with the main branch? And appears to be failing a build check?

It also looks like both you and Logan Byers proposed solutions to the problem in the comments of PR #1407, and that in Logan's case he found the two-pass solution wasn't robust/accurate enough for him. So I'll try his solution and see if I can get it to work...

You should try to compute things manually now for a single point. The normal is the eigenvector associated with the lowest eigenvalue of the covariance matrix. The most robust version of this algorithm I can think of resorts to the following steps, performed separately in double precision:

Try to reproduce these steps with the highest precision available and validate the results with other libraries whenever possible, e.g., once you have the (3x3) covariance matrix you can check what numpy outputs as eigenvalues/eigenvectors vs what the library is doing. Try to validate as many steps as possible.
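
Something along these lines for a single neighborhood, as a rough and untested sketch (Eigen's SelfAdjointEigenSolver is just standing in as the external reference here; numpy would serve the same purpose):

```C++
#include <iostream>
#include <vector>
#include <Eigen/Eigenvalues>
#include <pcl/point_types.h>
#include <pcl/point_cloud.h>

// Untested sketch: two-pass mean/covariance for one neighbourhood, entirely in
// double precision, plus a reference eigendecomposition to compare against eigen33.
void
checkOneNeighborhood (const pcl::PointCloud<pcl::PointXYZ> &cloud,
                      const std::vector<int> &neighbor_indices)
{
  // Pass 1: centroid in double precision
  Eigen::Vector3d centroid = Eigen::Vector3d::Zero ();
  for (const int idx : neighbor_indices)
    centroid += cloud[idx].getVector3fMap ().cast<double> ();
  centroid /= static_cast<double> (neighbor_indices.size ());

  // Pass 2: covariance of the de-meaned points
  Eigen::Matrix3d covariance = Eigen::Matrix3d::Zero ();
  for (const int idx : neighbor_indices)
  {
    const Eigen::Vector3d d = cloud[idx].getVector3fMap ().cast<double> () - centroid;
    covariance += d * d.transpose ();
  }
  covariance /= static_cast<double> (neighbor_indices.size ());

  // Reference decomposition: eigenvalues come out sorted in increasing order,
  // so column 0 of the eigenvectors is the normal direction.
  Eigen::SelfAdjointEigenSolver<Eigen::Matrix3d> solver (covariance);
  std::cout << "eigenvalues:  " << solver.eigenvalues ().transpose () << "\n"
            << "normal (ref): " << solver.eigenvectors ().col (0).transpose () << std::endl;
}
```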

If you reach a stable setup for the normals then report back and we'll start digging into where things are failing in the normal estimation algorithm.

msy22 commented 6 years ago

OK, so I kinda raced ahead and went for a full-blown solution. I worked backwards down the chain of functions and figured out which ones are used during the normal calculation process, copied them into a modified version of normal_3d_omp.h/hpp, and then just changed all the relevant instances of "float" to "double". The result is REALLY hacky but it works:

corrected_normals

This looks a lot better than what I was getting before, and produces normals that look exactly as I would expect. This is using a search radius of 0.3m

msy22 commented 6 years ago

I can go through and try to validate each stage individually like you suggested, but I think the root cause is definitely the use of floats rather than doubles. So is there value in trying to verify each stage in the process or would it be better to skip straight to a solution?

taketwo commented 6 years ago

We have functions in PCL that support both single and double precision through a template parameter, e.g. planeWithPlaneIntersection. Maybe the same can be done with the functions involved in your pipeline?

msy22 commented 6 years ago

I guess that'd be the most straight-forward way to do it, then it's mostly automatic. That solution would essentially involve adding a new declaration of the important functions, at the very least the following:

Which I assume would also require creating new definitions of the class variables covariance_matrix and xyz_centroid. They're defined in normal_3d.h, and I had to create double-precision duplicates of those variables to get my hack to work.

In addition, what options would the user have to force the use of the float or double versions? The highest level code the user is going to want to use is the NormalEstimation class. I.e. if this is all the user sees:

pcl::NormalEstimationOMP<PointT, PointT> ne; 
ne.setRadiusSearch(...);
ne.setViewPoint(...);
ne.setInputCloud(...);
ne.compute(...);

How does the underlying code know to use floats or doubles? The effect this bug/issue can have is pretty dramatic (as I'll show in a post below) so I feel like it'd be important to make the user aware of the difference and give them the option to force one case or the other.

msy22 commented 6 years ago

So I've been testing matching sequences of 35 clouds, and just incrementally matching each of the 35 back to the first cloud. This bug/issue has a pretty major impact on how well the clouds are matched in some environments. So in all the pictures below, the left hand image is the match BEFORE I fixed this issue, and the right hand image is AFTER fixing it with my hacked code.

In some areas it doesn't make much difference, either because the match was already pretty good or because there isn't much that can improve it:

(NOTE: The pictures are actually pretty big, so click on them for a larger version, rather than squinting at what you see below)

bugfix_carpark_rear

bugfix_carpark_front

But in other areas it is the difference between a good match and a useless one:

bugfix_carpark_diligent

bugfix_office

bugfix_stockpiles

bugfix_marylands

bugfix_zone1

So for my application, using doubles rather than floats makes a world of difference.

taketwo commented 6 years ago

Tentatively, I'd propose the following:

SergioRAgostinho commented 6 years ago

I can go through and try to validate each stage individually like you suggested, but I think the root cause is definitely the use of floats rather than doubles. So is there value in trying to verify each stage in the process or would it be better to skip straight to a solution?

We should go straight for the solution. Your time is precious and we should use it efficiently.

How does the underlying code know to use floats or doubles? The effect this bug/issue can have is pretty dramatic (as I'll show in a post below) so I feel like it'd be important to make the user aware of the difference and give them the option to force one case or the other.

The user will need to explicitly request double precision through template parameters if they want it. In the planeWithPlaneIntersection example Sergey linked before, that is achieved with

planeWithPlaneIntersection<double>(...)

@taketwo we should actually consider defaulting to double, since we often get reports of bad normals. Users who want speed can then explicitly take the risk of using single floating-point precision.

taketwo commented 6 years ago

The user will need to explicitly request double precision through template parameters if they want it.

I'd like to clarify that I propose template parameter-based selection of precision class only for free functions. For the NormalEstimation class I think it's better to have a setter/getter such that precision can be adjusted at runtime. Reasons: we don't want to double the number of instantiations of NormalEstimation due to an added class template parameter. Further, I believe it's nice to be able to control the precision class through a variable that can be changed at runtime (from command line input or a config file) versus altering code and recompiling.
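
From the user's side it would look roughly like this. This is a purely hypothetical sketch: setComputationPrecision() does not exist, it only illustrates the kind of runtime switch I have in mind:

```C++
#include <pcl/point_types.h>
#include <pcl/features/normal_3d.h>

// Hypothetical usage sketch only: setComputationPrecision() does NOT exist in PCL,
// it merely illustrates the kind of runtime switch proposed above.
int
main ()
{
  pcl::PointCloud<pcl::PointXYZ>::Ptr cloud (new pcl::PointCloud<pcl::PointXYZ>);
  pcl::PointCloud<pcl::Normal> normals;

  pcl::NormalEstimation<pcl::PointXYZ, pcl::Normal> ne;
  ne.setInputCloud (cloud);
  ne.setRadiusSearch (0.3);
  // ne.setComputationPrecision (ComputationPrecision::Double);  // hypothetical setter
  ne.compute (normals);
  return 0;
}
```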

we should actually consider defaulting to double

Do we understand in which cases single is okay, and in which double is needed? Does it have to do with point cloud extents? Maybe we can have a heuristic that selects precision automatically if it was not set explicitly.

SergioRAgostinho commented 6 years ago

Do we understand in which cases single is okay, and in which double is needed? Does it have to do with point cloud extents?

That's the current intuition. Whenever points stray away from 0 too much, we start having problems.

Maybe we can have a heuristic that selects precision automatically if it was not set explicitly.

Hmm. I think I'd rather keep it simple for now. But to provide an informed opinion I need to have an intuition on the time penalty we're incurring for jumping to double precision. Something which reports a time metric per normal computed and per local neighborhood size.

taketwo commented 6 years ago

That's the current intuition. Whenever points stray away from 0 too much, we start having problems.

Well, if this is the only reason, then we should just de-mean point neighborhoods before computing covariance matrix. Otherwise, what if points stray even further away from 0? Provide quadruple precision implementation?

taketwo commented 6 years ago

@msy22 can you do us a favor and run the PCD from the linked issue through your modified code? If you get good results then we can close that issue in favor of this one.

SergioRAgostinho commented 6 years ago

Well, if this is the only reason, then we should just de-mean point neighborhoods before computing covariance matrix. Otherwise, what if points stray even further away from 0? Provide quadruple precision implementation?

This brings in the second parallel discussion that happened in #1407, regarding a single vs double pass to compute the mean and the covariance matrix, specifically the precision errors incurred by the former.

@msy22 you might want to first try things with the double-pass mean and covariance estimation. Please take into account this proposed approach.

msy22 commented 6 years ago

Ok, first things first - I've downloaded that .pcd file from issue #2372 and put it through the unmodified and modified versions of my normal estimation code (as per @taketwo's request).

I've included some quick results below. For this test I used a point search radius of 2.0 meters since the point distribution is much sparser than my clouds, and has an odd banded structure (I'm guessing this was taken using aerial LiDAR). For an indication of scale, here's a point with its 2 m search radius highlighted in red:

xyz_2m_search_radius

Now as for the accuracy of point normals, here's a comparison. On the left is unmodified (i.e. floats) and on the right is modified (i.e. doubles). As before, click on the pictures for a bigger version.

xyz_test

However, it can still be a little hard to tell how much of an effect using doubles has had, so here's the same comparison as above, but zoomed in on the roof of the building so you can see the difference in normals a bit more clearly.

xyz_roof

There are a couple of things that stand out to me about this:

  1. Yes, the unmodified (floats) normal estimation is worse than when doubles are used. However, the original images of the cloud in issue #2372 look a little worse, and I suspect that this is because K=10 points were used, when a radius search or a larger K value might have been better.
  2. The modified (doubles) result is better, but still not as good as I would have expected given the results I'm now getting with my own point clouds.
  3. My point clouds are offset from the coordinate system center (XYZ = 0,0,0) by only a few hundred meters in either direction, whereas that point cloud is offset by XYZ ~= -971000, -19300, 0. So I think @taketwo is correct about this:

should just de-mean point neighborhoods before computing covariance matrix. Otherwise, what if points stray even further away from 0? Provide quadruple precision implementation?

msy22 commented 6 years ago

So in short, we have three possible causes of poor normal calculations (assuming no other problems come to light):

  1. Use of floats rather than doubles
  2. Long distances from coordinate frame center
  3. Use of a single-pass calculation rather than a double-pass calculation

My results suggest that 1 is a significant contributing factor, but the results from issue #2372 suggest that resolving problem 1 may not be enough, and that problem 2 also needs to be resolved. And as @SergioRAgostinho has already pointed out, the method of doing a double-pass may also improve things.

Without knowing exactly which of the above problems (in any combination) we need to solve in order to get the best normal calculation, we can't confidently provide a proper solution. So I'll implement a de-meaning function and double-pass calculation (following @SergioRAgostinho's suggestion) then put my clouds and the xyz.pcd cloud though everything to see which combination of fixes gives the best result.

Now at this stage I don't have any method of obtaining a "ground truth" for the normals... so aside from eye-balling the images I don't have any way of precisely (in degrees or some other metric) comparing results. If either of you have any suggestions in this area, I'd be glad to hear them. Otherwise, depending on how some meetings go over the next two weeks, I may be able to convince my supervisors to let me spend more time on this problem, and I may be able to do something about obtaining a ground truth...

taketwo commented 6 years ago

@msy22 thanks for giving it a try and also for the detailed analysis.

My results suggest that 1 is a significant contributing factor, but the results from issue #2372 suggest that resolving problem 1 may not be enough, and that problem 2 also needs to be resolved

What would be interesting to see is whether resolving 2 alone is enough. In other words, is single precision sufficient given that the point cloud is at the origin?

If that is the case, I have an idea how we can get away even with a single-pass single precision algorithm. We don't need to de-mean the point cloud exactly, we only need to bring it close enough to the origin. In the context of normal estimation, we are typically dealing with compact groups of neighboring points, as discovered by a spatial search backend. Thus we can approximate the centroid of such a neighborhood with the query point used for search, or, equivalently, the point at which the normal is being estimated. (I would even speculate that any point in the neighborhood will be good enough). This is trivial to implement and amounts to one additional SSE instruction per point (to subtract the "central" point), and then one more to adjust the computed centroid.

What do you think? Any reasons why this would not work?
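
To spell out why subtracting an arbitrary nearby point is legitimate: in exact arithmetic the covariance is unchanged by a constant shift, so the benefit is purely numerical (the accumulated sums stay small):

```latex
% A constant shift p of all neighborhood points x_i shifts the mean by the same p,
% so the centered differences, and hence the covariance, are unchanged:
\operatorname{Cov}\bigl(\{x_i - p\}\bigr)
  = \frac{1}{N}\sum_{i=1}^{N}\bigl((x_i - p) - (\bar{x} - p)\bigr)\bigl((x_i - p) - (\bar{x} - p)\bigr)^{\top}
  = \frac{1}{N}\sum_{i=1}^{N}(x_i - \bar{x})(x_i - \bar{x})^{\top}
  = \operatorname{Cov}\bigl(\{x_i\}\bigr)
```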

If either of you have any suggestions in this area, I'd be glad to hear them.

I'd simply go with synthetic planar point clouds translated away from the origin. If time permits, we can add a unit test based on the real data. For example, crop the ground in one of your LIDAR point clouds, fit a plane, and test whether resulting normals are close enough to plane. By the way, #1407 includes additional unit test which can be re-used.
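
As a starting point, the synthetic test could be as small as this sketch (the offset and sampling density are arbitrary, and a real unit test would also add a little noise and assert on the normal components):

```C++
#include <pcl/point_types.h>
#include <pcl/point_cloud.h>
#include <pcl/features/normal_3d.h>

// Sketch of a synthetic test: a 1 m x 1 m planar patch translated far from the origin.
// All estimated normals should stay close to the plane normal (0, 0, 1).
int
main ()
{
  pcl::PointCloud<pcl::PointXYZ>::Ptr plane (new pcl::PointCloud<pcl::PointXYZ>);
  const float offset = 100000.0f;                      // arbitrary large translation
  for (float x = 0.0f; x < 1.0f; x += 0.01f)
    for (float y = 0.0f; y < 1.0f; y += 0.01f)
      plane->push_back (pcl::PointXYZ (x + offset, y + offset, offset));

  pcl::NormalEstimation<pcl::PointXYZ, pcl::Normal> ne;
  ne.setInputCloud (plane);
  ne.setKSearch (10);

  pcl::PointCloud<pcl::Normal> normals;
  ne.compute (normals);

  // A unit test would now assert std::abs (normals[i].normal_z) ~= 1 for every i.
  return 0;
}
```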

msy22 commented 6 years ago

Thanks @taketwo!, I'll address that shortly, although I don't have much experience with the PCL code base and virtually none with SSE, so I may be less help than you think haha.

But before I get too behind, here is the results of my testing.

Method

To de-mean my point clouds, I calculated the centroid and applied a transformation like so:

Eigen::Vector4d centroid;
Eigen::Affine3d transform = Eigen::Affine3d::Identity();
compute3DCentroid(*point_cloud_ptr, centroid);
transform.translation() << -centroid(0), -centroid(1), -centroid(2);
pcl::transformPointCloud(*point_cloud_ptr, *point_cloud_ptr, transform);

Although I've since realized there's a demeanPointCloud function in centroid.h that I could have used but never mind...
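
For completeness, the rough equivalent with that helper would be something like the snippet below (untested, and pcl::PointXYZ just stands in for whatever point type point_cloud_ptr actually holds):

```C++
#include <pcl/common/centroid.h>   // pcl::compute3DCentroid, pcl::demeanPointCloud

// Roughly equivalent to the transform above; output goes into a fresh cloud.
Eigen::Vector4d centroid;
pcl::compute3DCentroid (*point_cloud_ptr, centroid);

pcl::PointCloud<pcl::PointXYZ>::Ptr demeaned_cloud (new pcl::PointCloud<pcl::PointXYZ>);
pcl::demeanPointCloud (*point_cloud_ptr, centroid, *demeaned_cloud);
```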

To implement the double-pass covariance calculation method, I used a slightly modified version of the code snippet @SergioRAgostinho posted:

  // PointInT comes from the enclosing (modified) NormalEstimationOMP class template.
  // The neighbour indices are forwarded so only the local neighbourhood is used.
  inline int
  computeMeanAndCovarianceMatrixDoublePass (const pcl::PointCloud<PointInT> &cloud,
                                            const std::vector<int> &indices,
                                            Eigen::Matrix3f &covariance_matrix,
                                            Eigen::Vector4f &centroid,
                                            bool double_pass)
  {
    if (double_pass)
    {
      // Two passes: centroid first, then the covariance of the de-meaned neighbourhood.
      pcl::compute3DCentroid (cloud, indices, centroid);
      return (pcl::computeCovarianceMatrixNormalized (cloud, indices, centroid, covariance_matrix));
    }
    else
      // Single-pass computation (the existing PCL behaviour).
      return (pcl::computeMeanAndCovarianceMatrix (cloud, indices, covariance_matrix, centroid));
  }

To implement the double precision calculations I used the method I've already described (just a local copy-paste of a few functions redeclared with doubles instead of floats).

Results

Here's the visual result for the whole cloud:

xyz_normal_accuracy_comparison

And again zoomed in on the roof, which seems to be the clearest place in the cloud for inspecting normals.

xyz_roof_normal_accuracy_comparison

Conclusions

  1. To me the clear winner here is de-meaning the whole point cloud.
  2. This conclusively proves your point @taketwo that double precision is not enough by itself if the cloud is still far away from the origin.
  3. Interestingly, the double-pass calculation doesn't seem to have done much (provided I implemented it correctly) and it actually took 3-4 times the normal amount of time to calculate the normals. So I think this solution can be ruled out.

msy22 commented 6 years ago

Just to add to the above, de-meaning my point cloud solves the problem as well: the image below shows the unmodified cloud on the left, and on the right the de-meaned cloud with normals still calculated using single (float) precision.

cloud122_compared

For reference, the centroid I subtract from this cloud is XYZ = 583.75, 433.07, 16.7171. Also note that I've been de-meaning my point clouds before putting them into the visualizer for a while now, since the viewer is bloody impossible to control well if the cloud is far from the origin. But until now I'd never de-meaned the cloud before calculating the normals.

So to answer your question @taketwo:

is single precision sufficient given that the point cloud is at the origin?

Yes, it looks like it might be. I've run my code with both de-meaning and double precision, and visually I can't tell the difference between the result and just de-meaning. So there's probably no way to tell the difference between the two (if there is any) without a mathematical proof or a metric for measuring normal accuracy.

As for whether it'd be enough to de-mean the whole cloud or each cluster individually, that's another matter. And synthetic planar clouds are a good suggestion. It'd be easy enough to create a basic environment and the associated points using the method shown in the PCL interactive ICP tutorial. It'd certainly be a nice way to more thoroughly validate everything we've discussed, and a good way to test other parameters that might affect normal calculation.

SergioRAgostinho commented 6 years ago

If either of you have any suggestions in this area, I'd be glad to hear them.

I'd simply go with synthetic planar point clouds translated away from the origin. If time permits, we can add a unit test based on the real data. For example, crop the ground in one of your LIDAR point clouds, fit a plane, and test whether resulting normals are close enough to plane. By the way, #1407 includes additional unit test which can be re-used.

If you feel like undergoing a more challenging synthetic example, computing normals from points densely sampled from a spherical surface should also do the trick.

Thanks @taketwo!, I'll address that shortly, although I don't have much experience with the PCL code base and virtually none with SSE, so I may be less help than you think haha.

@taketwo might have abstracted you from SSE details with #2247. Nevertheless, if you hit a problem, ask for opinions/options and we'll always chime in.

As for whether it'd be enough to de-mean the whole cloud or each cluster individually, that's another matter.

I would demean on a per cluster basis. (Assuming the demeaning operation to be practically inexpensive).

On a side note: you have my praise @msy22 . Great analysis and presentation. We don't get that often so... take my kudos.

taketwo commented 6 years ago

  1. Interestingly, the double-pass calculation doesn't seem to have done much (provided I implemented it correctly) and it actually took 3-4 times the normal amount of time to calculate the normals. So I think this solution can be ruled out.

What you have done looks correct. But the result is surprising. For the record, #1407 includes a different take on two-pass algorithm. If you have time and interest, you can give it a try.

As for whether it'd be enough to de-mean the whole cloud or each cluster individually, that's another matter.

Exact de-meaning requires two passes and we don't want that. Approximate de-meaning (as per my suggestion) yields better and better approximations the smaller the cloud is. So it's best to perform it at the neighborhood cluster level.

I don't have much experience with the PCL code base and virtually none with SSE, so I may be less help than you think haha.

I did not mean that we need to code SSE explicitly, hopefully the compiler will do the job. I just wanted to stress it'll hopefully map to a single operation, thus won't be expensive.

Specifically, in this code:

https://github.com/PointCloudLibrary/pcl/blob/90750665cd2df4cc39def35f3e1bc00cc9639f3e/common/include/pcl/common/impl/centroid.hpp#L564-L577

Just compute "approximately" de-meaned point:

const Eigen::Vector4f point = cloud[*iIt].getVector4fMap() - cloud[indices[0]].getVector4fMap(); 

And use it further to update accu. In the end, adjust centroid to account for this shift.
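
In sketch form, as a standalone function rather than the actual centroid.hpp code (single precision throughout, with the shift added back to the centroid at the end; the covariance itself is unaffected by the shift):

```C++
#include <vector>
#include <pcl/point_types.h>
#include <pcl/point_cloud.h>

// Sketch of the idea: shift every neighbour by the first point of the neighbourhood,
// accumulate mean/covariance of the small shifted coordinates (single precision is
// fine for those), then add the shift back to the centroid at the end.
unsigned int
computeMeanAndCovarianceShifted (const pcl::PointCloud<pcl::PointXYZ> &cloud,
                                 const std::vector<int> &indices,
                                 Eigen::Matrix3f &covariance_matrix,
                                 Eigen::Vector4f &centroid)
{
  const Eigen::Vector4f shift = cloud[indices[0]].getVector4fMap ();

  Eigen::Matrix3f accu_cov = Eigen::Matrix3f::Zero ();
  Eigen::Vector3f accu_mean = Eigen::Vector3f::Zero ();
  for (const int idx : indices)
  {
    const Eigen::Vector3f pt = (cloud[idx].getVector4fMap () - shift).head<3> ();
    accu_mean += pt;
    accu_cov += pt * pt.transpose ();           // sum of outer products
  }

  const float n = static_cast<float> (indices.size ());
  accu_mean /= n;
  covariance_matrix = accu_cov / n - accu_mean * accu_mean.transpose ();

  centroid.head<3> () = accu_mean + shift.head<3> ();  // undo the shift
  centroid[3] = 1.0f;
  return static_cast<unsigned int> (indices.size ());
}
```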

msy22 commented 6 years ago

If you feel like undergoing a more challenging synthetic example, computing normals from points densely sampled from a spherical surface should also do the trick.

That's a good suggestion. If I'm going to do this properly I should start with simple primitives like a cube, plane or sphere and work my way up through more complex objects to real data. But that will all have to wait until after this issue is resolved.

On a side note: you have my praise @msy22 . Great analysis and presentation. We don't get that often so... take my kudos.

Thanks! I appreciate that!

@taketwo might have abstracted you from SSE details with #2247. Nevertheless, if you hit a problem, ask for opinions/options and we'll always chime in.

I did not mean that we need to code SSE explicitly, hopefully the compiler will do the job. I just wanted to stress it'll hopefully map to a single operation, thus won't be expensive.

Great, thanks for the link! And I'll do my best to get it working.

What you have done looks correct. But the result is surprising. For the record, #1407 includes a different take on two-pass algorithm. If you have time and interest, you can give it a try.

That's why I was concerned that maybe I hadn't implemented it properly. I'll have a quick crack at the other method before moving on. Because the two-pass method was central to that PR, it would be good to be able to definitively rule out/in that method, otherwise it's a loose thread.

I would demean on a per cluster basis. (Assuming the demeaning operation to be practically inexpensive).

As for the subject of demeaning the whole cloud vs individual clusters, would demeaning just the cloud be faster and still provide the same benefit? I may end up editing this later today after I learn more, but my naive assumption is that if we de-mean each cluster that will result in more operations, depending on how it's implemented.

Say for the sake of the argument that we de-mean each cluster around the point we're calculating the normal for, and that we're doing a K-nearest point search where K=10. We de-mean that point and its 10 nearest neighbors. Then we calculate the normal for the next point in the cloud, which is likely just the nearest neighbor of the first point. We repeat the same process and end up de-meaning basically the same cluster of points all over again. So we essentially end up de-meaning the same point multiple times in slightly different ways. My worry is that an implementation like this could result in orders of magnitude more calculations than simply de-meaning the whole cloud (i.e. each point once).

In the mean time, I'll look at the alternative double-pass calculation. Then I'll fork the PCL repo, start familiarizing myself with the necessary code, and write up a plan/to-do list which I'll post here so you can both sanity check it.

SergioRAgostinho commented 6 years ago

As for the subject of demeaning the whole cloud vs individual clusters, would demeaning just the cloud be faster and still provide the same benefit?

Potentially not. If your entire point cloud is very uniform and the centroid is at the origin, but there are still points at far-away positions, you'll still end up with the same original problem. This is why "demeaning" needs to be done on a per-cluster basis.

So we essentially end up de-meaning the same point multiple times in slightly different ways. My worry is that an implementation like this could result in orders of magnitude more calculations than simply de-meaning the whole cloud (i.e. each point once).

You're correct. But the underlying intuition is that this operation is inexpensive. Of course we now need to benchmark the processing time to confirm this hypothesis.

taketwo commented 6 years ago

There are many factors involved and only benchmarking can show what is faster. But here is a speculative argument why "simply de-meaning the whole cloud" approach may be slower. For the sake of the argument, let's assume that we even know the centroid in advance. Now, note that we can not de-mean the point cloud in-place, which means that a new cloud of the same size has to be allocated. Then we need to make a pass through all points (read access), perform subtraction, and write back the results (write access). And then move on and compute covariance matrices for each cluster. So the overhead of "simple de-meaning" is an allocation and a read/write access to all points. Compared to that, per-cluster de-meaning needs a single pass and no additional memory. And we all know that memory operations are waaaay slower than arithmetics, even if we are lucky with cache hits.

msy22 commented 6 years ago

So the overhead of "simple de-meaning" is an allocation and a read/write access to all points. Compared to that, per-cluster de-meaning needs a single pass and no additional memory. And we all know that memory operations are waaaay slower than arithmetics, even if we are lucky with cache hits.

Aaaah, that's really valuable to know. Ok, it is less straightforward then. (For reference, only a third of my background is in software, and operation costs/code optimization is a gap in my knowledge I haven't had time to fill yet).

You're correct. But the underlying intuition is that this operation is inexpensive. Of course we now need to benchmark the processing time to confirm this hypothesis.

OK, judging by PR #2247 it looks like you already have processes in place for testing the computation time of different bits of code. So then what I need to do is implement both solutions quickly and roughly so you two can then test it and prove with data which of the two options is faster.

It looks like the best place to de-mean the whole point cloud would be somewhere in NormalEstimation::computeFeature, before it starts iterating through the cloud. This means editing normal_3d.hpp and/or normal_3d_omp.hpp. As per your suggestion @taketwo, de-meaning each cluster looks like it should happen in centroid.hpp, and for now maybe just in the specific version of computeMeanAndCovarianceMatrix I know will be used. I'm not sure what you'd need for testing, but adding a boolean input to toggle de-meaning on/off would probably do for now.

So what would be best for you two? Would it be better for me to modify a fork of the PCL code base so you can clone, compile and test that? Or would it be easier and faster if I do something like what I've done to get this far - copying the relevant functions into a custom version of normal_3d_omp.h/.hpp and changing them appropriately? So all you'd need is one extra .h and one .hpp file.

taketwo commented 6 years ago

OK, judging by PR #2247 it looks like you already have processes in place for testing the computation time of different bits of code.

Well, we have that one instance when we did extensive benchmarking. However, this is not an established process.

I have created a new project for benchmarking and testing mean/covariance computation. It's largely based on the repository from #2247. It's just a skeleton, but it should get you started. In the future there are a bunch of things that would need to be verified, e.g. whether the number of samples/iterations is adequate, whether we are indeed benchmarking what we want to benchmark (check the assembly), making the size of neighborhoods parametrizable, etc.

msy22 commented 6 years ago

Finally got some more time to work on this - I've decided that it may be easier to just roll the double-pass calculation into the standard tests we want to run.

@taketwo, where is the best place to discuss the testing repo? I've got some questions about how it works, and some suggestions as to how to run the tests. Since the whole testing procedure is kind of its own topic now, I'm not sure it makes sense to discuss it in the issue tracker.

taketwo commented 6 years ago

Yeah, I think it may be more convenient to carry on benchmarking/testing discussions in that repository and then come back here with the results.

msy22 commented 6 years ago

Agreed. Github doesn't really have a general discussion board for repositories, so would it make sense to use the issue tracker in that repo to discuss issues by topic? E.g. one issue for test data sets, another for the types of test to run etc etc...

taketwo commented 6 years ago

There are no rules or golden standards ;) Do as you deem appropriate!

msy22 commented 5 years ago

As an FYI, I have not forgotten about this issue; I've just been sidetracked by other work. I think I've finally nailed down exactly what is going on, and can explain it best with this diagram, which shows how PCL computes normals and how the problem occurs:

point normal calculation flow diagram with marked errors

So although the root cause of the problem is catastrophic cancellation in computeMeanAndCovariance (caused by large coordinates), how the problem manifests itself has a lot to do with how the eigen33 and computeRoots functions operate. I have confirmed this by printing out the eigenvalues: while a test with a cloud far from the origin:

bad_normals_xyz_cloud

Produces an output like this:

Eigenvalue: 0
Eigenvalue: 0
Eigenvalue: 0
Eigenvalue: 0
Eigenvalue: 0
Eigenvalue: 0
Eigenvalue: 0
Eigenvalue: 0.0342681
Eigenvalue: 0
Eigenvalue: 0
Eigenvalue: 0
Eigenvalue: 0
Eigenvalue: 0
Eigenvalue: 0
Eigenvalue: 0.0892916
Eigenvalue: 0
Eigenvalue: 0

I.e. evidence that the covariance matrix input to eigen33 is not positive semi-definite. However, if the same cloud is de-meaned to center it on the origin of its coordinate system, it produces this output:

Eigenvalue: 0.000220603
Eigenvalue: 0.000293271
Eigenvalue: 0.000260776
Eigenvalue: 0.000235714
Eigenvalue: 9.64799e-05
Eigenvalue: 0.000239345
Eigenvalue: 0.000338112
Eigenvalue: 0.000759444
Eigenvalue: 0.000444821
Eigenvalue: 0.000256216
Eigenvalue: 0.000238943
Eigenvalue: 0.000115213
Eigenvalue: 0.000148883
Eigenvalue: 0.00014674
Eigenvalue: 6.64685e-05
Eigenvalue: 0.000103244
Eigenvalue: 0.000109216

Although the fact that the second output produces eigenvalues which are close to zero leads me to guess that catastrophic cancellation may be related less to the distance of the cloud from the origin, and more to the size of that distance relative to the spacing between points.

Unfortunately recent events have forced me to expedite my PhD, so although I still intend to write a short paper on this issue, I am simply not going to have the time to make any fixes/improvements to the PCL trunk. And since the problem is partially related to code more fundamental and lower-level than just computeMeanAndCovariance, really dealing with this issue may require modifying the behavior of eigen33 and/or computeRoots. So in the meantime, I suggest that the best short-term solution is just to warn people that any point cloud that is a significant distance from the origin of its coordinate system will not have reliable point normals.

SergioRAgostinho commented 5 years ago

Great presentation (as usual). From reading the diagram it seems like we need to dig into the output of computeMeanAndCovariance. But before that, I need some help understanding all these eigenvalues you're printing.

The number of eigenvalues of the covariance matrix of a group which spans a full 3D space should be 3. In this case it seems you're showing only one per point (?). Since you want to estimate normals based on a local neighborhood of points, you're usually interested in the eigenvector associated with the lowest eigenvalue of the covariance matrix of this local neighborhood. The closer this eigenvalue is to zero, the more it's saying the point's local neighborhood resembles a (hyper)plane. So this is not a problem for me.

Now, we definitely need to ensure that the covariance matrix is symmetric and positive semi-definite, because eigen33 exploits these properties to accelerate computation. Of the two properties, the most important is by far symmetry, which ensures we can do an EVD. Ensuring that is relatively cheap if we enforce 0.5 * (cov + cov.T) (using numpy notation).
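
In Eigen terms that enforcement is tiny (sketch; covariance_matrix is whatever 3x3 float matrix is about to go into eigen33, and the named temporary avoids aliasing with the transpose):

```C++
// Symmetrize the covariance matrix before handing it to eigen33.
const Eigen::Matrix3f covariance_sym =
    0.5f * (covariance_matrix + covariance_matrix.transpose ());
covariance_matrix = covariance_sym;
```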

msy22 commented 5 years ago

Thanks, and sorry I should have been clearer about that. Yes I am only printing out one eigenvalue - the smallest. Specifically I am printing out the eigenvalue eigen33 returns.

The closer this eigenvalue is to zero, the more it's saying the point's local neighborhood resembles a (hyper)plane.

Ah, that makes sense, that bit of information hadn't clicked in my head yet. That's an important fact since what we want then is the lowest eigenvalue to be as close as possible to zero, without actually going too far and becoming negative.

As for symmetry, computeMeanAndCovariance essentially enforces symmetry with lines like this that explicitly set the off-diagonal elements equal to each other. So I think that property is safe.

To be clear, I'm not 100% sure about all this. The flow diagram I posted is currently my best guess as to what is happening. I think the problem occurs when computeMeanAndCovariance produces a matrix with a negative eigenvalue: this line in computeRoots sets it to zero, but (and this is the important part) the original matrix remains unchanged. So you now have an eigenvalue that doesn't actually have a corresponding eigenvector, and when eigen33 calculates the eigenvector it's using an eigenvalue that the covariance matrix doesn't actually have, which is obviously going to produce an incorrect result.

Again, this is my current best hypothesis. I'm happy to be proven wrong on any and all of this. I think catastrophic cancellation is the culprit here based on two key observations: The normal orientation gets worse as

  1. The point cloud moves further from the origin
  2. The more closely packed the points are

I'm currently working on a way to rigorously prove this, and I'll hopefully have more results by the end of today or tomorrow.

msy22 commented 5 years ago

Ok, so I ended up changing how I calculated everything, which is why this took so long. Here's the short version...

I took the Stanford Bunny:

And pushed it away from the origin along all three axes so that this started happening:

At the same time, I scaled up the size of the point cloud to increase the mean point spacing, which I defined as the mean Euclidean distance between a point and its k-nearest neighbors (the same k neighbors used to calculate the normal). At each scale and distance I then calculated the normal deviation from the ground truth and plotted everything on a scatter graph, where each marker is the stats for a single point in the cloud. This is the result:

Which proves three points:

  1. The normal deviation error is a function of distance from origin and mean point spacing
  2. As the ratio of distance over spacing increases, the likelihood of the calculated normal deviating from the true normal increases.
  3. Past a certain threshold for this ratio (basically everything in the bottom-right triangle of the plot) the angle of deviation is essentially random.

I won't post anything more until I've got a rough paper draft, at which point I will make it available to @SergioRAgostinho and @taketwo as a courtesy, and to check the tone. I want anything I publish to be seen as documenting a problem, not an attack or criticism of PCL.

msy22 commented 5 years ago

I know I said I wouldn't post anything more until I had a draft, but the results I'm getting when testing cloud registration are raising some unpleasant questions. To get the graph below, I pushed the Stanford Bunny further and further from the origin along all three axes, and at each step, copied the bunny, shifted the copy 0.4 units along the x axis and then matched it back to the original. The registration error is the mean point-to-point distance. After a distance of 350, the Point-to-plane ICP results go off the top of the graph, and the same thing happens to Point-to-plane ICP when the normals are calculated correctly after about 1000. Incidentally, now that I understand GICP properly I realize that it uses the covariance matrices directly and isn't affected by incorrect normal orientations at all (confirmed with separate tests).

The ordinary ICP algorithm seems to handle everything just fine. But the point-to-plane ICP and GICP algorithms perform very poorly very quickly. I've tested this with a few other point clouds at a variety of initial errors, but the results always look like this. Also for reference I enlarged the bunny to dimensions of 2x1.5x2 so at an initial error of 0.4 there was plenty of overlap:

Short of chalking this up to numerical imprecision and loss of significance, @SergioRAgostinho and @taketwo do either of you know of any issues or bugs in the ICP code base that could cause this sort of behavior? Unfortunately I simply do not have the time to repeat this whole investigation for the ICP variants.

taketwo commented 5 years ago

Sorry I've been struggling to find time to have a deeper look into this. I hope to spare some time this weekend and write a meaningful reply.

msy22 commented 5 years ago

No worries, we're all busy people so I appreciate whatever time you can spare.

I'm finding that performance is very dependent on the type of cloud (e.g. closed surface like the Stanford bunny or a more open surface like the Velodyne HDL-32 scans I use). The point-to-plane algorithms seem to struggle with the Bunny especially for some reason, so I have a better set of results with this pair of HDL-32 scans: ods_carpark_ab_set

Where again, I start with an A (red) cloud and a B (blue) cloud that are already perfectly positioned relative to each other (using GPS). I copy B, add error and then match B_error to A. The registration error is then the mean point-to-point distance between B and B_error.

registration_error_hdl_32_various_initial_errors

I'm pretty happy with this set of results since it confirms (at least for point-to-plane ICP anyway) that if the normals are calculated correctly then the registration error doesn't get worse with distance. What I can't explain is why ICP and GICP both contain discrete jumps in the registration error. Right now I don't have a better explanation than loss of significance, assuming my use of GICP is correct (all parameters use the default values):

void inline
GeneralizedIcp (PointCloudT::Ptr target_cloud,
                PointCloudT::Ptr source_cloud)
{
  // Create the necessary ICP objects
  pcl::GeneralizedIterativeClosestPoint<PointT, PointT> gicp;
  pcl::search::KdTree<PointT>::Ptr tree (new pcl::search::KdTree<PointT>);
  gicp.setSearchMethodTarget(tree);

  // Set various parameters
  gicp.setMaxCorrespondenceDistance(MAX_CORR_DIST);
  gicp.setRotationEpsilon(ROTATION_EPSILON);
  gicp.setCorrespondenceRandomness(CORRESPONDENCE_RAN);
  gicp.setRANSACIterations(RANSAC_ITERATIONS);

  // Set termination criteria
  gicp.setMaximumIterations (MAX_ITERATIONS);
  gicp.setTransformationEpsilon(TRANFRM_EPSILON);
  gicp.setEuclideanFitnessEpsilon(FITNESS_EPSILON);

  // Input clouds
  gicp.setInputSource (source_cloud);
  gicp.setInputTarget (target_cloud);
  gicp.align (*source_cloud);
}

taketwo commented 5 years ago

Short of chalking this up to numerical imprecision and loss of significance, @SergioRAgostinho and @taketwo do either of you know of any issues or bugs in the ICP code base that could cause this sort of behavior?

I am not aware of any issues or bugs of this sort, so I'd probably also blame it on numerical precision. Nevertheless, it would be nice to understand which specific part of the ICP procedure is susceptible to this problem. My first suspect would be the SVD-based transform estimation, which internally uses the Umeyama method. It does seem to perform demeaning, but maybe the sum "overflows"? Anyway, a quick try would be to up the precision used in ICP. There is a third (optional) template parameter that controls it.
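
I.e. something along these lines, reusing the PointT typedef and the source/target cloud pointers from your snippet above (sketch only):

```C++
#include <pcl/registration/icp.h>

// Same usage as your GICP snippet, but plain ICP with the optional third template
// parameter (the internal scalar type, default float) set to double.
pcl::IterativeClosestPoint<PointT, PointT, double> icp;
icp.setInputSource (source_cloud);
icp.setInputTarget (target_cloud);
icp.align (*source_cloud);
```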

I am simply not going to have the time to make any fixes/improvements to the PCL trunk.

That's unfortunate, but as you said, we're all busy people. Thanks for the effort you put into this anyway.

For future reference, could you summarize your experience (if any) with trying to implement per-cluster demeaning? Back in September we seemed to have a consensus that this will solve the problems.

msy22 commented 5 years ago

I am not aware of any issues or bugs of this sort, so I'd probably also blame it on numerical precision.

I think this is what I will have to do unless I find something obvious, simply because debugging every inch of the PCL ICP code base is infeasible to say the least. I appreciate all those suggestions, I will have a quick look into them and let you know if I find anything.

Anyway, a quick try would be to up the precision used in ICP. There is a third (optional) template parameter that controls it.

Ah, I hadn't noticed that detail, thanks for that! I do notice that regular ICP includes that option, as does the PICP variant I am using (link). But the GICP implementation does not include this parameter (link). I will put it on my to-do list as that could be a nice way to confirm numerical precision is to blame without debugging the entire PCL ICP code base.

For future reference, could you summarize your experience (if any) with trying to implement per-cluster demeaning? Back in September we seemed to have a consensus that this will solve the problems.

I have only tested this with normal calculation. In the graphs in my post prior to this one, I have two PICP (point-to-plane ICP) datasets. For the first data set (PICP, light blue) I calculated the normals using the normal process with this function:

GENERIC NORMAL CALCULATION

```C++
// PointT is the same point-type typedef used in my other snippets.
template <typename CloudPtr>
void
GenericKNearestNormalCalculation (CloudPtr cloud_in, int k)
{
  // Create normal calculation objects
  pcl::NormalEstimationOMP<PointT, PointT> norm_est; // May not work
  pcl::search::KdTree<PointT>::Ptr tree (new pcl::search::KdTree<PointT>);

  // Set search parameters
  norm_est.setSearchMethod (tree);
  norm_est.setKSearch (k);
  norm_est.setViewPoint (0.0, 0.0, 0.0);
  norm_est.setInputCloud (cloud_in);

  // Compute normals
  norm_est.compute (*cloud_in);
}
```

For the second data set (PICP (correct normals), dark blue) I de-meaned each cluster (each point i plus its k-nearest neighbours) before calculating the normal, using this function here:

DE-MEANED NORMAL CALCULATION

```C++
// PointT / PointCloudT are the same point-type typedefs used in my other snippets.
template <typename CloudPtr>
inline void
CalculateNormalGroundTruth (CloudPtr cloud_in, int k)
{
  // Create necessary variables
  PointCloudT::Ptr tmp_cloud (new PointCloudT);  // tmp cloud for demeaning
  tmp_cloud->width = k + 1;
  tmp_cloud->height = 1;
  tmp_cloud->points.resize (k + 1);
  std::vector<int> nn_indices (k);
  std::vector<float> nn_distances (k);
  Eigen::Vector4f n;  // normal
  float curvature;
  Eigen::Affine3d transform = Eigen::Affine3d::Identity ();
  Eigen::Vector4d centroid;
  compute3DCentroid (*cloud_in, centroid);

  // Set up the k-d tree
  pcl::search::KdTree<PointT>::Ptr kdtree (new pcl::search::KdTree<PointT>);
  kdtree->setInputCloud (cloud_in);

  /* Iterate through each point. Note we're only calculating the ground-truth
     normals for points we'll actually be looking at later */
  for (int i = 0; i < cloud_in->points.size (); i++)
  {
    // Find the k nearest neighbours
    kdtree->nearestKSearch (cloud_in->points[i], k, nn_indices, nn_distances);

    // Copy the point and its nearest neighbours into a temporary cloud
    for (int j = 0; j < nn_indices.size (); j++)
    {
      tmp_cloud->points[j].x = cloud_in->points[nn_indices[j]].x;
      tmp_cloud->points[j].y = cloud_in->points[nn_indices[j]].y;
      tmp_cloud->points[j].z = cloud_in->points[nn_indices[j]].z;
    }

    // Also copy in the point we want the normal for at the end
    tmp_cloud->points[k].x = cloud_in->points[i].x;
    tmp_cloud->points[k].y = cloud_in->points[i].y;
    tmp_cloud->points[k].z = cloud_in->points[i].z;

    // De-mean the whole tmp cloud by point i's coordinates to center it on 0,0,0
    transform.translation () << -cloud_in->points[i].x, -cloud_in->points[i].y, -cloud_in->points[i].z;
    pcl::transformPointCloudWithNormals (*tmp_cloud, *tmp_cloud, transform);

    // Compute the point normal
    pcl::computePointNormal (*tmp_cloud, n, curvature);

    // Set the normal to the result
    cloud_in->points[i].normal_x = n[0];
    cloud_in->points[i].normal_y = n[1];
    cloud_in->points[i].normal_z = n[2];

    // Flip normals so they consistently point in one direction (towards centroid)
    flipNormalTowardsViewpoint (cloud_in->points[i], centroid[0], centroid[1], centroid[2],
                                cloud_in->points[i].normal_x, cloud_in->points[i].normal_y,
                                cloud_in->points[i].normal_z);
  }
}
```

This second function correctly calculates the normals at any distance, which I have visually confirmed with the PCL visualizer. The PICP results with correct normals are very consistent at every distance, so I think this has effectively solved the problem for PICP (provided it isn't actually a bug causing this).

msy22 commented 5 years ago

On a side note, the point-to-plane ICP registration keeps failing when I try matching two of the original scans of the Stanford Bunny (specifically scans 315 and 45):

I keep getting the error

[pcl::IterativeClosestPoint::computeTransformation] Not enough correspondences found. Relax your threshold parameters.

And when that happens, registration fails:

Note the initial error between the two clouds was 0.1, and the point-to-plane ICP (PICP) algorithms just fling the two clouds apart.

The code here in icp.hpp suggests that not enough correspondences are being found, but the block of code above that (here), which actually computes the correspondences, suggests that the only variable you can change is the correspondence distance. I've tried increasing the correspondence distance to an arbitrarily large number, but this still happens.

At this point I'm not sure what is being caused by bugs, numerical precision/loss of significance or incorrect implementation on my part. If this looks like it could be a bug, I'll create a separate issue for it.

taketwo commented 5 years ago

According to my understanding, one has to find a balance between setting the max correspondence distance large enough that at the beginning of the registration process at least some correspondences can be established, and setting it small enough that not everything can be a correspondence to everything.

What usually works for me is setting a conservative max correspondence distance and providing a good initial pose guess to the align() function.
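
For reference, the guess is just the optional second argument of align(). A sketch with placeholder names (icp and aligned_cloud are whatever objects you already have; the translation is only an example of a rough prior):

```C++
// Pass the initial pose guess as the (optional) second argument of align().
Eigen::Matrix4f guess = Eigen::Matrix4f::Identity ();
guess (0, 3) = 0.1f;   // e.g. "roughly 0.1 units along x"
icp.align (*aligned_cloud, guess);
```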

msy22 commented 5 years ago

Thanks for the advice, I'll play around with the correspondence setting a bit more then. As for the initial guess, I've tried a wide variety of values but it didn't prevent the error message from popping up.

SergioRAgostinho commented 5 years ago

Wow I'm very late to this thread. I'll try to find some time next week to provide some meaningful feedback.

msy22 commented 5 years ago

I'll try to find some time next week to provide some meaningful feedback.

No worries, I managed to get a fair bit done over the Xmas break and as always I appreciate whatever time you can spare.

I now have a first draft of my paper concerning all this, and as a courtesy to you both @taketwo and @SergioRAgostinho , I'd like to make the draft available so you have an opportunity to read it and check the tone. As I've said before, I want anything I publish to be seen as documenting a problem, not attacking or criticizing PCL.

With that said, would either or both of you like a copy of the draft? And if so, where should I email it? (I see an email address on your blog @SergioRAgostinho, but you may prefer a different address)

SergioRAgostinho commented 5 years ago

Sure, send it in to whatever address. I'll be glad to read it and provide feedback.

As I've said before, I want anything I publish to be seen as documenting a problem, not attacking or criticizing PCL.

From my side it's totally fine. Even if you wrote something bashing it completely. Everything is open to criticism. This library is anything but perfect, and we get improvement ideas from feedback, regardless of the type.

taketwo commented 5 years ago

You may include me (alexandrov88, google mail) as well.

msy22 commented 5 years ago

Even if you wrote something bashing it completely. Everything is open to criticism.

That's very open-minded of you, which I appreciate. Regardless of its flaws, PCL is still a great library to have, and I don't want to put anyone off it with unnecessarily negative criticism.

I've sent an email to you both, let me know if you didn't receive it.

msy22 commented 5 years ago

Following up on a suggestion from @SergioRAgostinho, I calculated the co-variance matrices for a few points from scan 45 of the Stanford Bunny data set, once at the original coordinates, and then again after translating point i and its k-nearest neighbours (k=15) 1000 meters in all three axes.

I also calculated the eigenvalues (using eigen33 and symbolab's online calculator) with the goal of proving @SergioRAgostinho's point that bad normals occur because the co-variance matrix is wrong, not because eigen33 clamps the smallest eigenvalue to zero if less than zero. This is the result using four test points:

cv_matrices

The point here is that in each test case, the covariance matrix (and therefore everything else) is very different at the origin than at distance. As for what is happening with the eigenvalues, my linear algebra is quite rusty, so I'm still working to interpret what's going on...

... in fact testing a few more points shows that eigen33 often doesn't return the same numbers as traditional methods of calculating the eigenvalues (mostly a variety of online calculators because calculating it by hand is tedious)...