Performance of iterating with `delaunayedges`

JuliaGeometry / VoronoiDelaunay.jl

Fast and robust Voronoi & Delaunay tessellation creation with Julia

Other

123 stars 26 forks source link

Performance of iterating with `delaunayedges` #47

Closed fgasdia closed 2 years ago

fgasdia commented 5 years ago

Iterating over delaunayedges is a few hundred times slower and makes significant allocations compared to directly iterating over a DelaunayTessellation2D for triangles.

Example:

tess = DelaunayTessellation(100)
width = max_coord - min_coord
a = Point2D[Point(min_coord + rand() * width, min_coord + rand() * width) for i in 1:100]
push!(tess, a)

function triangleiter(tess)
    i = 0
    for t in tess
        i += 1
    end
end

function edgeiter(tess)
    i = 0
    for e in delaunayedges(tess)
        i += 1
    end
end

@btime edgeiter($tess)
  681.877 μs (1142 allocations: 42.59 KiB)

@btime triangleiter($tess)
  1.225 μs (1 allocation: 16 bytes)

Any idea on how to make this faster? When I have thousands of points the delaunayedges iteration time is significantly large for my application.

skariel commented 5 years ago

I'll take a look into it...

skariel commented 5 years ago

ok. Some results. The function delaunayedges creates another function (a task in fact) and you are timing its compilation time. So doing the following brings down the timing 20x:

function edgeiter(it)
    i = 0
    for e in it
        i += 1
    end
end

it = delaunayedges(tess)
@btime edgeiter($it)
  27.554 μs (0 allocations: 0 bytes)

That being said, this solution is not good for everybody and the performance is not good enough. The correct solution is to write a proper iterator. I'll do this, it will take a few days though...

EDIT: a typo (was tess instead of it in the @btime)

skariel commented 5 years ago

I fixed the example above...

fgasdia commented 5 years ago

Thanks for looking into this. I'll appreciate the proper iterator when you have time for it.

CarpeNecopinum commented 5 years ago

As a temporary fix, we could also use an eagerly allocating variant like:

function delaunayedges_fast(t::DelaunayTessellation2D)
    result = DelaunayEdge[]
    for ix in 2:t._last_trig_index
        tr = t._trigs[ix]
        isexternal(tr) && continue

        ix_na = tr._neighbour_a
        if (ix_na > ix) || isexternal(t._trigs[ix_na])
            push!(result, DelaunayEdge(getb(tr), getc(tr)))
        end
        ix_nb = tr._neighbour_b
        if (ix_nb > ix) || isexternal(t._trigs[ix_nb])
            push!(result, DelaunayEdge(geta(tr), getc(tr)))
        end
        ix_nc = tr._neighbour_c
        if (ix_nc > ix) || isexternal(t._trigs[ix_nc])
            push!(result, DelaunayEdge(geta(tr), getb(tr)))
        end
    end
    result
end

Tested on my dataset (2.7M tris) it caused a ~10x speed-up over the Channel-based variant.

I also eliminated the need for the visited array (reduces the runtime by a few percent still), which should make it easier to create a proper iterator for this.