CGAL / cgal

The public CGAL repository, see the README below
https://github.com/CGAL/cgal#readme
Other
4.95k stars 1.38k forks source link

Exact IO for polyhedra #6281

Closed dpapavas closed 2 years ago

dpapavas commented 2 years ago

I need to cache polyhedron geometry (Nef_polyhedron_3, Polyhedron_3 and Surface_mesh) to disk, so that I can load and reuse it later, without loss of information. As far as I can see from the output, writing a Nef_polyhedron_3 to a file in ASCII form, outputs the polyhedron using an exact representation of its coordinates etc. Reading it back, should then result in the same geometry, without loss of information. (Is this assumption correct and if so, is it something that can be relied on, or is it an internal implementation detail, subject to change?) Polyhedron_3 and Surface_mesh output, in the form of .OFF files on the other hand seems to be hardcoded to use a floating point representation, which is to be expected of course, given that this is what the OFF file format requires. Is it somehow possible though, using existing CGAL functionality, to output these objects to file in an exact form and to read them back?

GilesBathgate commented 2 years ago

@dpapavas When you say exact representation, do you mean values are stored as arbitrary-precision rationals? Why not just use a custom output format, I use the following:

polyhedron([[0,0,0],[0,0,15],[0,7853981633974483/2500000000000000,15],[0,7853981633974483/2500000000000000,0],[1/3,0,0],[1/3,0,15],[1/3,7853981633974483/2500000000000000,15],[1/3,7853981633974483/2500000000000000,0]],[[0,1,2,3],[0,4,5,1],[1,5,6,2],[0,3,7,4],[3,2,6,7],[4,7,6,5]]);
dpapavas commented 2 years ago

Yes, I meant arbitrary-precision rationals. (Well, whatever exact number representation is used by the kernel, but arbitrary-precision rationals are the only representation I've seen in the context of polyhedrons. That's not the case for polygons, for instance, but that's another matter.)

For Polyhedron_3 and Surface_mesh, rolling my own format was the plan, since it's relatively simple; I just wanted to make sure that there wasn't standard functionality available for the purpose. It woud be harder to do for Nef_polyhedron_3, but luckily, this seems to be written in exact form, i.e. without coercion to floating point in the proprietary format. I just wanted to make sure that's the case. That is, I wanted to make sure that writing a Nef_polyhedron_3 and reading it again, is guaranteed to result in the same geometry, without loss of information.

GilesBathgate commented 2 years ago

The code is quite simple, it just converts the Surface_mesh, or Polyhedron_3 to nef: https://github.com/CGAL/cgal/blob/master/Nef_3/include/CGAL/Nef_3/polygon_mesh_to_nef_3.h no loss of precision, since the FT is from the same Kernel.

I use this code to go the other way: https://github.com/GilesBathgate/RapCAD/blob/master/src/cgalexport.cpp#L297 which produces the above output (with rational output preference set)

GilesBathgate commented 2 years ago

I should add that my implementation isn't quite right as it doesn't yet triangulate facets with holes, so it probably wouldn't be suitable for your needs.

dpapavas commented 2 years ago

Sorry, I wasn't clear enough. Consider this code snippet:

Polyhedron_3 P;

// Build the polyehdron somehow.
...

// Write it to disk.

{
    std::ofstream f("cached_polyhedron");
    f << P;
}

...

Polyhedron_3 Q;

// Read it back from disk.

{
    std::ifstream f("cached_polyhedron");
    f >> Q;
}

assert (P == Q);

The assertion will, in general fail, because the original polyhedron P, will be written as an OFF file, using floating point coordinates, so when it's read back, any points which had coordinates that didn't happen to be exactly representable as floating point number, will be slightly different in Q.

As I said, I'd like to avoid that, which I can do, by using a custom output format, with all coordinates represented as rationals. What I wanted to make sure, is that the above assertion is guaranteed to succeed for Nef_polyhedron_3.

Thanks for pointing me to your code! I will definitely study it for guidance, even if it isn't entirely applicable to my needs.

GilesBathgate commented 2 years ago

Yeah, so ".nef" output format (as used by my project and openscad) uses https://github.com/CGAL/cgal/blob/master/Nef_3/include/CGAL/IO/Nef_polyhedron_iostream_3.h, internally this is using the following to read points: https://github.com/CGAL/cgal/blob/c55273565107d366e3cd325c39f4bf5e27ec10bb/Nef_3/include/CGAL/Nef_3/SNC_io_parser.h#L673-L689

So I think you are good with that format.

GilesBathgate commented 2 years ago

I should add that it uses a homogeneous representation in disk. https://en.wikipedia.org/wiki/Homogeneous_coordinates

GilesBathgate commented 2 years ago

@dpapavas Oh your issue is that you don't want to convert to Nef_3 before writing out to disk, and want to go direct from Polyhedron_3 or Surface_mesh_3 to an exact representation on disk.

GilesBathgate commented 2 years ago

@dpapavas untested code. Does it help:

using VertexIterator = CGAL::Polyhedron3::Vertex_const_iterator;
using FacetIterator = CGAL::Polyhedron3::Facet_const_iterator;
using HalffacetCirculator = CGAL::Polyhedron3::Halfedge_around_facet_const_circulator;
mpq_srcptr to_mpq(const CGAL::FT& d);
static std::list<CGAL::Triangle3> generateTriangles(CGAL::Polyhedron3* poly);

void exportCache(CGAL::Polyhedron3* poly) {

    int vertexCount=0;
    std::map<CGAL::Point3,int> vertex_map;
    for(VertexIterator vi=poly->vertices_begin(); vi!=poly->vertices_end(); ++vi) {
        CGAL::Point3 pt=vi->point();

        char* x,y,z;
        gmp_asprintf(&x,"%Qd",to_mpq(pt.x()));
        gmp_asprintf(&y,"%Qd",to_mpq(pt.y()));
        gmp_asprintf(&z,"%Qd",to_mpq(pt.z()));
        // write to file

        vertex_map.insert(pt,vertexCount++);
    }

    for(const auto& t: generateTriangles(poly)) {
        int v1=vertex_map[t[0]];
        int v2=vertex_map[t[1]];
        int v3=vertex_map[t[2]];
        /// write to file
    }
}

static std::list<CGAL::Triangle3> generateTriangles(CGAL::Polyhedron3* poly)
{
    std::list<CGAL::Triangle3> triangles;
    for(FacetIterator fi=poly->facets_begin(); fi!=poly->facets_end(); ++fi) {
        HalffacetCirculator hc=fi->facet_begin();
        CGAL_assertion(circulator_size(hc)==3);
        CGAL::Point3 p1=(hc++)->vertex()->point();
        CGAL::Point3 p2=(hc++)->vertex()->point();
        CGAL::Point3 p3=(hc++)->vertex()->point();
        CGAL::Triangle3 t(p1,p2,p3);
        triangles.append(t);
    }
    return triangles;
}

mpq_srcptr to_mpq(const CGAL::FT& d)
{
#ifdef CGAL_USE_BOOST_MP
    return d.exact().backend().data();
#else
#ifdef CGAL_USE_GMPXX
    return d.exact().get_mpq_t();
#else
    return d.exact().mpq();
#endif
#endif
}
dpapavas commented 2 years ago

I need to be able to write all 3 kinds of polyhedron objects to disk and be able to reload them and end up with the original object, at least in terms of geometry. My application is similar to yours in fact. Since design tends to be an iterative process, I want to implement a caching mechanism, where all (or some) intermediate results are written to disk and can be simply reloaded, avoiding recomputation from scratch.

I currently have this implemented using OFF-based I/O for serialization of Polyhedron_3 and Surface_mesh objects. It works most of the time, but sometimes I make a change and when recomputing based on cached intermediate results, I get a CGAL exception, such as:

CGAL assertion 'oxz_pqr != COLLINEAR'

I haven't investigated it in detail, but I bet it's due to the conversion to floating point and back of the cached geometry, resulting in some formerly collinear points, being no longer so. Running the same computation from scratch, i.e. without reloading the saved intermediate results, works fine. (Computations based on cached Nef polyhedra, seem to work fine so far.)

Thanks for all your help. Your code seems to be similar to what I had in mind. I'm going to implement something like this and give it a go. I mostly opened the issue to make sure that I haven't overlooked existing functionality for this purpose, since it seemed like something that would be of general utility and to make sure I can use the proprietary CGAL Nef I/O format for my purpose.

GilesBathgate commented 2 years ago

@dpapavas I'm not aware of anything, but I am not a CGAL dev. IO is mostly for exporting to other applications in CGAL as far as I can tell, and most other applications use float. The Nef IO does seem to be the exception. You could just convert to (and from) Nef, its constructor takes either a Polyhedron_3 or Surface_mesh, and there are functions to convert back.

GilesBathgate commented 2 years ago

@dpapavas By the way, both Polyhedron_3 and Surface_mesh_3 implement the MutableFaceGraph concept, so you can write generic code based around that to save having to write it twice. https://doc.cgal.org/latest/BGL/classMutableFaceGraph.html

dpapavas commented 2 years ago

Conversion to and from Nef is very slow (relatively speaking), but I had a common importer/exporter for both Polyhedron_3 and Surface_mesh in mind, as you suggest, based on Generic_facegraph_builder and Generic_facegraph_printer, but wihtout all the coercions to doubles, of course.

GilesBathgate commented 2 years ago

@dpapavas Boolean operations on Nef are slow. But conversion isn't "very" slow. Also, the speed is much improved with: https://github.com/CGAL/cgal/pull/5412 and https://github.com/CGAL/cgal/pull/5455 . Either way, as you didn't mention speed as an issue I thought you wanted something easy to implement.

If you want to convert from Surface_mesh, and Polyhedron_3 direct to disk, you can probably do it faster as those data structures don't require the calculation of facet normals. Also, mpq_set_str gmp_asprintf can help you read/write the rationals from/to text, or there is probably an operator<< for CGAL::FT as in the SNC_io_parser code you can use.

sloriot commented 2 years ago

So I get it right, you are using a kernel with an exact rational number type. In such case, the simplest solution for Polyhedron/Surface_mesh is to convert the mesh to a soup polygon_mesh_to_polygon_soup(), with the containers and read them back and use polygon_soup_to_polygon_mesh(). I don't think there is any existing standard format storing exact rational.

sloriot commented 2 years ago

If you are using EPECK, you have to put CGAL::exact(pt) and not pt in your stream otherwise a double approximation will be written.

GilesBathgate commented 2 years ago

@sloriot Presumably polygon_mesh_to_polygon_soup(), and polygon_soup_to_polygon_mesh() will repair errors that @dpapavas is seeing due to loss of precision? is that process deterministic?

dpapavas commented 2 years ago

Thanks for confirming that I haven't missed something @sloriot and also for the pointers. Regarding I/O on Nef polyhedra, using the proprietary format via Nef_polyhedron_iostream_3.h, am I correct in assuming that this stores the Nef's geometry using an exact representation so that writing a Nef to a file and re-reading it, will result in the original Nef (or one that is functionally indistinguishable) without loss of information?

sloriot commented 2 years ago

Thanks for confirming that I haven't missed something @sloriot and also for the pointers. Regarding I/O on Nef polyhedra, using the proprietary format via Nef_polyhedron_iostream_3.h, am I correct in assuming that this stores the Nef's geometry using an exact representation so that writing a Nef to a file and re-reading it, will result in the original Nef (or one that is functionally indistinguishable) without loss of information?

Yes, the only thing I don't remember is what if EPECK is used (that is, is the call to exact() done internally).

dpapavas commented 2 years ago

@GilesBathgate I haven't made any careful measurements yet, but my impression so far is that conversion to Nef, while not as slow as some other operations (hence the "relatively speaking") is, nevertheless slow. Consider for instance the simple operation of calculating the difference of a sphere of radius 2 minus a sphere of radius 1 (i.e. a hollow sphere). This is output of my application using Nef polyhedra:

$0 = sphere(1,1/1000) (stored: 43474ded6e9b1923.o, facets: 5120, halfedges: 15360, vertices: 2562, in: 0.059s)
$1 = sphere(2,1/1000) (stored: 2fafd0c2b114361e.o, facets: 20480, halfedges: 61440, vertices: 10242, in: 0.24s)
$2 = nef($0) (stored: 2e3792d94ca72a57.o, facets: 5120, volumes: 2, halffacets: 10240, edges: 7680, halfedges: 15360, vertices: 2562, in: 1.8s)
$3 = nef($1) (stored: 67fd1946108ef6bb.o, facets: 20480, volumes: 2, halffacets: 40960, edges: 30720, halfedges: 61440, vertices: 10242, in: 7.1s)
$4 = difference($3,$2) (stored: a91ed13a6d9c0d2f.o, facets: 25600, volumes: 3, halffacets: 51200, edges: 38400, halfedges: 76800, vertices: 12804, in: 45s)
$5 = mesh($4) (stored: 7fe366ec9e9c7c16.o, facets: 25600, edges: 38400, halfedges: 76800, vertices: 12804, in: 1s)

It creates the spheres using CGAL's subdivision functionality aiming for some specified maximum error, so that the large sphere has a lot more facets than the smaller one. While the difference operation takes ~80% of the total calculation time, conversion to Nef of the large sphere does take a non-negligible amount of time (~7s). The final step, which is conversion of the difference result from Nef to surface mesh, (which happens for a reason that is unimportant for the purposes of the present discussion) also takes a relatively significant amount of time (relative to the subdivision operations for instance).

Compare with the same operation carried out using corefinement:

$0 = sphere(1,1/1000) (stored: 43474ded6e9b1923.o, facets: 5120, halfedges: 15360, vertices: 2562, in: 0.059s)
$1 = sphere(2,1/1000) (stored: 2fafd0c2b114361e.o, facets: 20480, halfedges: 61440, vertices: 10242, in: 0.24s)
$2 = difference($1,$0) (stored: 3c3c31efdd8df4b3.o, facets: 25600, halfedges: 76800, vertices: 12804, in: 3s)
$3 = mesh($2) (stored: c3d7f7fbcd79d5ae.o, facets: 25600, edges: 38400, halfedges: 76800, vertices: 12804, in: 0.17s)

Again, most of the speedup comes from the relative speeds of the boolean operations themselves, but still, the conversion to Nef of the large sphere alone takes about twice as long as the whole corefinement-based construction.

These measurements are based on unoptimized (-O0 -g etc.) code and I haven't verified them, so take them with a pinch of salt.

GilesBathgate commented 2 years ago

@dpapavas Like I said, Nef needs to calculate facet normals, if your surface mesh is all triangles its at least not using newells method now. I think during construction Nef also does a simplify() step. Actually, another optimisation was #5470 where previously it seemed to be doing the simplification process twice (accidentally we think)

So yeah if your ultimate goal is to avoid Nef, and do PMP corefine, you should go down the direct surface_mesh to disk cache route as already discussed.

Alternatively, I am looking at further optimising Nef_3, because it creates cleaner geometry and handles incomplete meshes such as shells, surfaces, and ariels. (in addition to bounded volumes)

dpapavas commented 2 years ago

Yes, the only thing I don't remember is what if EPECK is used (that is, is the call to exact() done internally).

The call to exact() is not done, as far as I can see, but the whole point gets converted (from EPECK to a Simple_cartesian kernel in my case) via get_point and Type_converter below.

https://github.com/CGAL/cgal/blob/c55273565107d366e3cd325c39f4bf5e27ec10bb/Nef_3/include/CGAL/Nef_3/SNC_io_parser.h#L95-L110

The point is then output in homegeneous form as @GilesBathgate already mentioned. So I'll asume that I can use this format for my purposes and implement a custom aribrary-precision format for Polyhedron_3 and Surface_mesh objects. Many thanks to everyone for their help!

GilesBathgate commented 2 years ago

@dpapavas Yes I think homogeneous should be more compact in ascii because there is only one common denominator for the x,y,z values of the point ie hx,hy,hz,hw., of course a binary format would be better if disk space is an issue.

afabri commented 2 years ago

Note that you geometrically get the same thing but with an Exact_predicates_exact_constructions_kernel it will cost some time to compute exact rationals because internally it stores objects as approximation with interval arithmetic plus the construction graph. The latter will be executed when calling exact().

For the Surface_mesh I am wondering if we should offer a serialize à la boost. I don't feel comfortable to store quotients in a .off file as a file extension somehow is a promise for what you can expect in the file. And as a Surface_mesh is essentially a bunch of arrays storing int for indices this should basically some lines of code.

GilesBathgate commented 2 years ago

Note that you geometrically get the same thing but with an Exact_predicates_exact_constructions_kernel it will cost some time to compute exact rationals because internally it stores objects as approximation with interval arithmetic plus the construction graph.

I wonder if you could instead write Interval_nt to disk?

afabri commented 2 years ago

That is not enough as you loose too much information. Imagine the Nef polyhedron was just a point obtained as intersection of two segments and the interval non-trivial. Then you would have lost where the point really is in space.

GilesBathgate commented 2 years ago

Then you would have lost where the point really is in space.

Oh you would have to also store the entire stack of operations on the Lazy_exact_nt 😖

dpapavas commented 2 years ago

Note that you geometrically get the same thing but with an Exact_predicates_exact_constructions_kernel it will cost some time to compute exact rationals because internally it stores objects as approximation with interval arithmetic plus the construction graph. The latter will be executed when calling exact().

My understanding here, is that it will cost some of the extra time that was saved in not computing the exact results to begin with, which is the general idea of using interval arithmetic in this case. That is, using, using a filtered kernel, such as EPECK, is still a good idea performance-wise, only when serializing to disk you inevitably lose some of the benefit, because you'll need to compute the exact values in order to store them. Am I correct?

For the Surface_mesh I am wondering if we should offer a serialize à la boost. I don't feel comfortable to store quotients in a .off file as a file extension somehow is a promise for what you can expect in the file. And as a Surface_mesh is essentially a bunch of arrays storing int for indices this should basically some lines of code.

When you say Surface_mesh, am I correct to assume you mean Polyhedron_3 as well? They seem to share the same graph-base implementation. I agree it would be useful (as you might imagine) and I wouldn't mind attempting an implementation, if you can't spare the time on your side, but one complication might be how/whether to support per-vertex attributes, such as colors and the like. (I don't need these, currently at least.)