Hey, I didn't get the notification for this issue, sorry about that.
If you don't need it to be a Rips complex, you can try Alpha, which should work OK in the 5d case. For higher dimensions, it might take forever to build the filtration, but things should be pretty quick once that's done.
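Something like this should do it (a sketch; the `rand` points are just a stand-in for your data):

```julia
using Ripserer, StaticArrays

# Stand-in data: 10k random points in 5 dimensions.
points = [SVector{5}(rand(5)) for _ in 1:10_000]

# Alpha builds the filtration from a Delaunay triangulation, which is much
# smaller than the full Rips complex in low dimensions.
result = ripserer(Alpha(points))
```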
Another option is EdgeCollapsedRips, which may or may not work well. How long it takes to build depends a lot on the shape of the data. The situation there is similar: once the complex is built, things should be pretty fast, but building it may take forever.
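Usage should look much the same as with Rips (again a sketch with stand-in data):

```julia
using Ripserer, StaticArrays

points = [SVector{5}(rand(5)) for _ in 1:10_000]

# Collapsing edges shrinks the complex before persistence is computed.
result = ripserer(EdgeCollapsedRips(points))
```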
There is also a very cool algorithm for handling large data sets, but I haven't implemented it because I don't understand it. It can be found here. Maybe you can find a way to use it, though it's a pretty old and unmaintained library.
On the downsampling: I have done something similar before. Below is a function that downsamples data so that each point's nearest neighbour is at least r away. This works well if your sample is uneven, or if you don't care about very small structures in your data.
```julia
using NearestNeighbors
using Distances
using StaticArrays
using Random

function downsample(points::Vector{<:Union{NTuple{D,T},SVector{D,T}}}, r) where {D,T}
    # Shuffle so the result doesn't depend on the input order.
    points = shuffle(points)
    # KDTree wants the points in a D×N matrix.
    points_matrix = reshape(reinterpret(T, points), (D, length(points)))
    keep = fill(true, length(points))
    # For other metrics, you may want to use a BallTree instead.
    tree = KDTree(points_matrix, Euclidean())
    for i in 1:length(points)
        if keep[i]
            p = points[i]
            # Remove all points within r of the i-th point. This includes the
            # i-th point itself, so we have to add it back.
            keep[inrange(tree, SVector(p), r, false)] .= false
            keep[i] = true
        end
    end
    return points[keep]
end
```
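For example (the radius 0.1 here is arbitrary; tune it to your data's scale):

```julia
using StaticArrays

points = [SVector{5}(rand(5)) for _ in 1:10_000]

# Keep a subsample in which all pairwise distances are greater than r = 0.1.
sparse = downsample(points, 0.1)

# The smaller sample can then be fed to e.g. ripserer(Rips(sparse)).
```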
Hope this helps and let me know if you have any other questions!
Hi,
Thanks for this awesome package.
I'm trying to use it on a fairly large set of data points (~10k data points of 5-10 dimensions). This is already after substantial dimensionality reduction and downsampling, but it is still computationally very demanding to run the algorithm on.
Would you have any advice on things to try to speed it up? Would downsampling (e.g. dropping X% of the data points) be a valid approach?
Thank you, Federico