CGAL / cgal

The public CGAL repository, see the README below
https://github.com/CGAL/cgal#readme
Other
4.87k stars 1.38k forks source link

Mesh_3: bug in perturber #1944

Closed lrineau closed 6 years ago

lrineau commented 7 years ago

While debugging something else, a ctest run showed strange failures in another test...

In master (currently 2249dfde063b2d359a93f4e46c9e5ca06789c5f3), if I add the following patch:

diff --git a/Mesh_3/include/CGAL/Mesh_3/C3T3_helpers.h b/Mesh_3/include/CGAL/Mesh_3/C3T3_helpers.h
index 6b3f73f..2ef57f8 100644
--- a/Mesh_3/include/CGAL/Mesh_3/C3T3_helpers.h
+++ b/Mesh_3/include/CGAL/Mesh_3/C3T3_helpers.h
@@ -3374,6 +3374,8 @@ get_least_square_surface_plane(const Vertex_handle& v,

   reference_point = surface_point_vector.front();

+  CGAL_assertion(is_valid(plane.d()));
+
   return plane;
 }

diff --git a/Mesh_3/test/Mesh_3/debug.h b/Mesh_3/test/Mesh_3/debug.h
index 886fa14..30172eb 100644
--- a/Mesh_3/test/Mesh_3/debug.h
+++ b/Mesh_3/test/Mesh_3/debug.h
@@ -17,7 +17,7 @@
 // #define CGAL_MESHES_DEBUG_REFINEMENT_POINTS
 // #define CGAL_DEBUG_OCTREE_CONSTRUCTION

-//#define CGAL_MESH_3_VERBOSE
+#define CGAL_MESH_3_VERBOSE
 //#define CGAL_MESH_3_IO_VERBOSE

 // #define OPTIMIZE_INTERSECTION

then the test test/Mesh_3/test_mesh_implicit_domains.cpp fails. That shows that, for some reason, there are NaN numbers computed in the get_least_square_surface_plane function. I found out that there are "surface centers" with NaN numbers too. Probably some duals are degenerate in floating points, and should be computed robustly.

I failed to reproduce the bug with CGAL-4.9. I do not know why.

lrineau commented 7 years ago

Update: the failing test file is test_mesh_implicit_domains.cpp.

janetournois commented 7 years ago

I cannot reproduce the bug, even though I disabled the time limits on perturb_mesh_3 and exude_mesh_3...

lrineau commented 7 years ago

I reassign this issue to 4.11.

Once we have time, we should try and dig further in the code of the perturber, to find out why NaNs occur the computation.

lrineau commented 7 years ago

Update: Here is the new patch:

diff --git a/Mesh_3/include/CGAL/Mesh_3/C3T3_helpers.h b/Mesh_3/include/CGAL/Mesh_3/C3T3_helpers.h
index 6b3f73f..41e4c06 100644
--- a/Mesh_3/include/CGAL/Mesh_3/C3T3_helpers.h
+++ b/Mesh_3/include/CGAL/Mesh_3/C3T3_helpers.h
@@ -3354,6 +3354,8 @@ get_least_square_surface_plane(const Vertex_handle& v,
       const int& i = fit->second;

       surface_point_vector.push_back(cell->get_facet_surface_center(i));
+      CGAL_assertion_code(const Point_3& p = surface_point_vector.back();)
+      CGAL_assertion(is_valid(p.x()) && is_valid(p.y()) && is_valid(p.z()));
     }
   }

and TBB and C++11 must be involved, for the bug to show up. I can reproduce it more than 90% of time with Mesh_3/test/Mesh_3/test_meshing_3D_image.cpp.

For some reason, I have never seen that bug with a polyhedral domain, only with an implicit one, or with a 3D image, segmented or not. Maybe the fact that the domain oracle is faster increase the change of a race-condition.

lrineau commented 7 years ago

Cc @cjamin

lrineau commented 7 years ago

Actually... I can see the bug even without TBB and the Parallel_tag:

renoir ~/Git/cgal/Mesh_3/test/Mesh_3 $ ../../../build/test/Mesh_3/test_meshing_3D_image
Mesh generation from a 3D image:
        Seed is 1496935255
        Volume is 1736359.78726467
        Number of cells: 3085
        Number of facets: 1291
        Number of vertices: 806
Refining again...
        Number of cells: 3085
        Number of facets: 1291
        Number of vertices: 806
Exude...
        Number of cells: 2938
        Number of facets: 1291
        Number of vertices: 806
        Quality before optimization: 0.268955 - Quality after optimization: 4.8985
        Volume is 1736359.78726467
Perturb...
terminate called after throwing an instance of 'CGAL::Assertion_exception'
  what():  CGAL ERROR: assertion violation!
Expr: is_valid(p.x()) && is_valid(p.y()) && is_valid(p.z())
File: /home/lrineau/Git/cgal/Mesh_3/test/Mesh_3/../../include/CGAL/Mesh_3/C3T3_helpers.h
Line: 3358

The assertion about coordinates being invalid (NaNs) is triggered before the parallel mode is tested.

lrineau commented 7 years ago

@janetournois in the previous message you can see the seed:

Seed is 1496935255

But it seems the issue is barely dependent on the seed, on my Linux machine. My last run was using the seed:

Seed is 1497517282 and it failed the same way.

lrineau commented 7 years ago

After the patch lrineau/cgal@4ddd3f6, I can no longer reproduce the bug without the compiler optimizers.

Actually, before that patch, I do not know if the optimizers were required.

janetournois commented 7 years ago

that at least explains why it is not reproducible on msvc...

lrineau commented 6 years ago

Jane, as I said in my first message in this issue

I failed to reproduce the bug with CGAL-4.9. I do not know why.

So maybe the bug was introduced between 4.9 and 4.10... Maybe related to weighted points?