opencv / opencv_contrib

Repository for OpenCV's extra modules
Apache License 2.0

surface_matching icp fails due to under-determined linear system #464

Open ahundt opened 8 years ago

ahundt commented 8 years ago

I'm testing surface_matching using ppf_load_match and the following model + scene:

cereal model cereal_box

cereal scene cereal_scene

The relevant ply files are in cereal_scene.zip.

When I run this with the latest master of opencv and opencv_contrib as of about 2015-12-06-1200, minimizePointToPlaneMetric() fails because cv::solve() is passed an under-determined linear system at the linked line. In this case, A is 4×6 when it needs to be at least 6×6 to find a solution, which I believe means only 4 point-pair correspondences are being found.
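For context, point-to-plane ICP linearizes the pose update into a system A x = b with six unknowns (small-angle rotation plus translation), where each correspondence contributes one row. With only 4 correspondences the system is under-determined, which is exactly what cv::solve rejects. A minimal numpy sketch of that structure (synthetic data, not the actual OpenCV internals):

```python
import numpy as np

def point_to_plane_rows(src_pts, dst_pts, dst_normals):
    """Build the linearized point-to-plane system, one row per correspondence.
    Unknowns x = [rx, ry, rz, tx, ty, tz] (small-angle rotation + translation)."""
    A = np.hstack([np.cross(src_pts, dst_normals), dst_normals])   # N x 6
    b = np.einsum('ij,ij->i', dst_normals, dst_pts - src_pts)      # N
    return A, b

rng = np.random.default_rng(0)
src = rng.standard_normal((4, 3))        # only 4 correspondences, as in the crash
nrm = rng.standard_normal((4, 3))
nrm /= np.linalg.norm(nrm, axis=1, keepdims=True)
dst = src + 0.01 * rng.standard_normal((4, 3))

A, b = point_to_plane_rows(src, dst, nrm)
print(A.shape)                           # (4, 6): fewer rows than unknowns
print(np.linalg.matrix_rank(A) < 6)      # True -> under-determined
```

With fewer than 6 rows the rank can never reach 6, so no unique 6-DoF update exists and the solver has nothing well-posed to do.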

Here is the relevant debug output when run on cereal_box_accurate_scale_lowpoly.ply and cerealscene2.ply:

****************************************************
* Surface Matching demonstration : demonstrates the use of surface matching using point pair features.
* The sample loads a model and a scene, where the model lies in a different pose than the training.
* It then trains the model and searches for it in the input scene. The detected poses are further refined by ICP
* and printed to the  standard output.
****************************************************
Running on 64 bits
Running without OpenMP and without TBB
Training...

Training complete in 0.00605122 sec
Loading model...

Starting matching...

PPF Elapsed Time 0.0220278 sec
Number of matching poses: 10
Performing ICP on 2 poses...
OpenCV Error: Bad argument (The function can not solve under-determined linear systems) in solve, file /Users/athundt/source/git/Itseez/opencv/modules/core/src/lapack.cpp, line 1210
libc++abi.dylib: terminating with uncaught exception of type cv::Exception: /Users/athundt/source/git/Itseez/opencv/modules/core/src/lapack.cpp:1210: error: (-5) The function can not solve under-determined linear systems in function solve

I'm interested in help/feedback to fix this bug, but I'm planning to look into it myself as well. Thanks!

ahundt commented 8 years ago

@bmagyar @tolgabirdal you will probably be interested in this problem, thanks for taking a look.

tolgabirdal commented 8 years ago

This can surely happen, especially when you have ambiguous surface structures (which your box definitely has), bad initial poses (I will see when I look through), or an insufficient number of points. Even if you feed enough input points to ICP, going very coarse in the pyramid can leave you with very few, so keep that in mind and try fewer levels. Also consider that if sampling loses the surface variation and ends up dominantly sampling the largest plane, you theoretically have little chance of registering the clouds.

By the way, in such cases this method (surface matching) would not be the best option. It is generally designed for applications where the surface geometry is descriptive, like CAD models. Because this object is symmetric, even if everything goes right, there is a good chance of finding it in upright poses only some of the time.

You might as well simply skip ICP if the pose output from the detector is sufficient. Another option is point-to-point ICP, since your surface normals are not very descriptive (they are similar all over the place).

Finally, make sure that your surface normals are there and correct. Not specific to your scenario, but generally this is a common mistake I see.
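A quick sanity check along these lines, as a sketch: assuming a point cloud stored as an N×6 array of xyz plus normals (the layout the surface_matching samples use), you can verify that normals are present and unit length before training:

```python
import numpy as np

def check_normals(cloud, tol=1e-3):
    """cloud: N x 6 array of x, y, z, nx, ny, nz.
    Returns (has_normals, fraction_of_unit_length_normals)."""
    normals = cloud[:, 3:6]
    norms = np.linalg.norm(normals, axis=1)
    has_normals = bool(np.any(norms > tol))          # all-zero normals -> missing
    frac_unit = float(np.mean(np.abs(norms - 1.0) < tol))
    return has_normals, frac_unit

# toy cloud: 3 points with unit normals, 1 with a zero (missing) normal
cloud = np.array([
    [0, 0, 0, 0, 0, 1],
    [1, 0, 0, 0, 1, 0],
    [0, 1, 0, 1, 0, 0],
    [1, 1, 0, 0, 0, 0],
], dtype=float)
print(check_normals(cloud))   # (True, 0.75)
```

A fraction well below 1.0 is a red flag that the exporter (e.g. meshlab) dropped or zeroed normals for part of the cloud.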

Still, I will run it and see.

ahundt commented 8 years ago

The primary bug, I believe, is that even when the data isn't good enough, the library should not crash; it should return an error code, throw a catchable exception, or fail in some other reasonable way so that calling software can handle the problem even when there are no useful results. There isn't really a way for the caller to detect this ahead of time.

Thanks for taking a look. From reading the papers I'm aware boxes aren't exactly ideal for this algorithm, but I was trying to use something of a good size that I had on hand and that a Kinect v2 could capture easily. I'm not particularly tied to that model, and it is easy to change its triangle count with meshlab, which also generated the surface normals. I included both the high and low polygon count versions in the zip file.

Questions:

I've found complex models time-consuming to make, and the simple boxes I made by hand sometimes have only 6 vertices and are poor fits for the algorithm. I've been creating models with Autodesk 123D Catch from images of real-world objects, which is also definitely not perfect.

I'll check the following:

bmagyar commented 8 years ago

+1 for failing gracefully.

I'm just going to copy my comment over from the other pull request for completeness since it applies to this one as well. The error case could fail by using the OPENCV_ERROR macro, like here: https://github.com/Itseez/opencv/blob/master/modules/videoio/src/cap_qt.cpp#L314

But in order to fail nicely, we need to identify the problematic cases and should lay down some tests to document, showcase and monitor these and the expected behaviour.

ahundt commented 8 years ago

A good spot for the graceful-failure change is at the `if (selInd)` check, because if selInd is < 6 the solve will definitely fail.
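As a sketch of what such a guard could look like (conceptual Python, not the actual C++ patch; `solve_pose_update` and `MIN_CORRESPONDENCES` are hypothetical names): check both the row count and the rank before solving, and raise a catchable error instead of crashing deep inside the linear solver.

```python
import numpy as np

MIN_CORRESPONDENCES = 6  # 6 unknowns: 3 rotation + 3 translation

def solve_pose_update(A, b):
    """Guarded least-squares solve for the 6-DoF ICP update.
    Raises ValueError (catchable) instead of aborting the process."""
    if A.shape[0] < MIN_CORRESPONDENCES or np.linalg.matrix_rank(A) < 6:
        raise ValueError(
            f"ICP: {A.shape[0]} correspondences, rank "
            f"{np.linalg.matrix_rank(A)}; need a full-rank system with >= 6 rows")
    x, *_ = np.linalg.lstsq(A, b, rcond=None)
    return x

# reproduce the reported failure shape: 4 rows, 6 unknowns
A = np.random.default_rng(1).standard_normal((4, 6))
try:
    solve_pose_update(A, np.zeros(4))
except ValueError as e:
    print("handled gracefully:", e)
```

The caller can then skip that pose hypothesis (or fall back to the detector's pose) rather than terminating, which matches the "fail gracefully" behavior discussed above.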

If it is possible it would also be nice if the algorithms can be divided into separate functions, one for each of the different major steps in the algorithm, rather than being a single 200 line function. Extra bonus points for references to the equation numbers in the original paper!

tolgabirdal commented 8 years ago

I most certainly agree, and it should be there in future releases. Please keep in mind that this is a preliminary implementation done as part of GSoC, and OpenCV decided not to invest much more in it in the following GSoC. There are more issues than meet the eye, so don't expect too much from it. Still, for general cases it should work, as the algorithm is correctly implemented. That said, I already have a greatly improved implementation, and I will complete the pull request as my tight schedule allows.

I have also not integrated hypothesis verification, so sometimes the correct pose may not be the first element but one of the subsequent ones.

Regarding your questions:

A. The crash is not only due to an insufficient number of points; sometimes the system genuinely becomes ill-conditioned. Either way, this case should be handled gracefully.

B. CAD model preparation is indeed crucial for many object detection algorithms; it is not specific to this one. For Kinect-like scenarios, you could use KinectFusion to create CAD models.

In general, CAD models can often be found on the internet, and for industrial objects they are usually available from the manufacturer. If not, you could always 3D print objects. But keep the following in mind:

1) To make an arbitrary CAD model suitable for this algorithm, you could use remeshing or triangle sampling. In both cases, distribute the vertices as uniformly as possible (keep in mind that distance quantization requires a certain minimum distance between sample points; this is tied to your relative sampling parameter, which I believe is well explained in the documentation). Also note that sharp edges are common in CAD models; your sampling algorithm should handle them and generate correct normals.

In this particular case, choosing the corners of the box isn't really helpful, but having points on all sides is. You could start with Poisson disk sampling, for example; it is the most basic option, as mentioned here: http://www.tbirdal.me/downloads/birdal_3dv_2015.pdf
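The "distribute vertices uniformly with a minimum spacing" idea can be sketched with a simple voxel-grid subsample (a toy stand-in for Poisson disk sampling or the module's quantized sampling, not OpenCV's actual code):

```python
import numpy as np

def voxel_downsample(points, voxel_size):
    """Keep one point per voxel, spreading samples roughly uniformly and
    guaranteeing spacing on the order of voxel_size between survivors."""
    keys = np.floor(points / voxel_size).astype(np.int64)
    _, first_idx = np.unique(keys, axis=0, return_index=True)
    return points[np.sort(first_idx)]

rng = np.random.default_rng(2)
pts = rng.random((1000, 3))           # dense cloud in the unit cube
sampled = voxel_downsample(pts, 0.2)  # at most 5^3 = 125 survivors
print(len(sampled) <= 125)            # True
```

The voxel size here plays the role of the relative sampling parameter: larger voxels mean coarser, more uniform sampling, at the cost of losing fine surface variation.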

2) The accuracy and "reality resemblance" of the surface normals are important. If they are too detailed, reality is missed and the shape representation is jeopardized; if they are too smooth, they stop carrying enough information about surface variability. The algorithm has a certain tolerance in both directions, but there are limits. For instance, if your sensor output is very smooth (like the Kinect's) and your CAD model is very detailed, one trivial approach is to smooth the CAD model before training.

C. What counts as a bad initial pose changes from object to object. For box-like cases, if the ICP algorithm is outlier-aware and you start from a pure translational shift, you will probably get stuck there, because the algorithm already matches a large portion of the surface and discards the rest. For a Stanford bunny, by contrast, you could recover rotations of roughly 30 degrees.

Also note that the current implementation has a certain inefficiency in the training stage, which makes it slow, so please be patient with it. This is also to be addressed in future releases.

ahundt commented 8 years ago

Thanks for the feedback on how to improve the results. We've tried a few objects and aren't getting any accurate matches, and we run into this under-determined linear system a lot. I don't quite understand the criteria for points being included in the solver; could you explain that a bit?

From reading the SLAM++ paper, which uses this technique, it seems the object needs to dominate the scene for a match to be found, and that is definitely not the case for what we are trying: given the Kinect v2's ~0.5-1 m minimum accurate distance and wide viewing angle, the interestingly shaped objects I happen to have around me aren't large enough to fill half the frame.

ahundt commented 8 years ago

Oh, it is also worth noting that we aren't getting good results with the sample data included in the repository.

tolgabirdal commented 8 years ago

Andrew,

Thank you for the feedback. These will be considered.

As I mentioned, this is a preliminary implementation, and these issues have already been reported.

I am not familiar with your dataset, but yes, the object should dominate the scene. When that is not the case, you should choose your sampling strategy accordingly; this has a huge influence.

I informed OpenCV about all of these issues in a proposal, and I will try to address them if I find some time. Moreover, all of these issues are already improved in my newer work; for an idea of how the upcoming versions perform, see this video: https://www.youtube.com/watch?v=HxV9Ouy-fLM

The sample data should be alright though. We could double-check.

Cheers,


/tolga

ahundt commented 8 years ago

Oh, I see now; when I first saw that video I thought it showed the implementation here, but I now see it is for a new paper and an improved methodology. My apologies for being dense on that count, and thanks for your patience.

On my first reading the paper looks to be quite good! Thanks!

tolgabirdal commented 8 years ago

Ah, you're welcome. The results here were generated by the current implementation: https://www.youtube.com/watch?v=uFnqLFznuZU

Cheers,


/tolga

ahundt commented 8 years ago

I have some new data with a much nicer tiki model in a relatively trivial scene for matching that can be found at tiki.zip, but it still encounters the same problem.

Here is my setup:

tiki_scene_close_setup

Here is a screenshot of my object mesh:

tiki_mesh00

Here is a front view of the scene from the kinect:

tiki_scene_front_view

side view:

tiki_scene_close_side_view

As you can see, this object has a very nice surface, even a hole in the middle, and stands out very clearly against an empty background. I've also verified that the mesh/point cloud looks quite good on all visible sides (I didn't capture the bottom). Nonetheless, this runs into the same issue: the algorithm fails in the same way. I was hopeful this would be a simple enough test case that the algorithm would work as-is, without your refined version, but unfortunately that doesn't seem to be the case.

tolgabirdal commented 8 years ago

Andrew,

Unfortunately, this scene is not good at all. The scene's normals are oriented in the opposite direction from your model's. This was my first remark: check your normals. They matter; this is the most important thing.

Moreover, compared to the entire scene, this is a quite planar object with low geometric variation. It could still work if the normals are corrected (I haven't checked), but it is certainly not ideal; try the Stanford bunny or something similar. Besides this, the scene has significant missing data and, at the same time, spurious data (this wouldn't be a problem if the surface contained geometric information, but it doesn't).
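The opposite-normals problem above can be fixed by orienting every normal toward the sensor before matching. A minimal sketch (the function name and array layout are illustrative, not an OpenCV API):

```python
import numpy as np

def orient_normals_toward_viewpoint(points, normals, viewpoint):
    """Flip any normal pointing away from the sensor, so model and scene
    clouds use one consistent orientation convention."""
    to_sensor = viewpoint - points
    flip = np.einsum('ij,ij->i', normals, to_sensor) < 0
    normals = normals.copy()
    normals[flip] *= -1
    return normals

pts = np.array([[0.0, 0.0, 1.0]])
nrm = np.array([[0.0, 0.0, 1.0]])   # points away from a camera at the origin
fixed = orient_normals_toward_viewpoint(pts, nrm, np.zeros(3))
print(fixed)                         # [[ 0.  0. -1.]]
```

For a depth camera the viewpoint is simply the sensor origin in the cloud's coordinate frame, so this check can be applied to every scene capture before feeding it to the matcher.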

I would recommend reading the algorithm to get an idea of the objects it is good for. If you have further questions you could also e-mail me personally (tbirdal@gmail.com or tolga.birdal@tum.de), since such back-and-forth might make this thread less informative and more misleading. We could then post a more informative summary here so that everyone saves time.

All the best,


/tolga

ahundt commented 8 years ago

Sounds reasonable, thanks!

AliaAlaaElDinAdly commented 7 years ago

Hello @tolgabirdal,

I came across this thread while trying to use this module with the HoloLens to find a furniture model in a scene scanned by the HoloLens. However, as I understood from your comments above, that would not be the best approach to take?

I already tested it and got very unexpected results. As I am doing my Masters, running out of time, and came across this thread very late, I just want to know whether I should proceed with investigating this or try to find another approach.

Help and advice are needed and appreciated! Thank you.