janba / GEL

A library of geometry processing tools for computer graphics
94 stars 23 forks source link

Vectorized gel.MeshDistance.signed_distance to query many points #45

Closed martinResearch closed 3 years ago

martinResearch commented 3 years ago

Hi, Thanks you for the great work. I would like to query the distance to a simple mesh from many points but is too slow for the the use I have.

I did some timing measurement using this code:

import time

from PyGEL3D import gel
import numpy as np
import scipy.spatial

for name in ["bunny", "tetra"]:
    filename = f"D:\\{name}.obj"
    # filename = r"D:\bunny.obj"
    mesh = gel.obj_load(filename)
    mdist = gel.MeshDistance(mesh)
    print(f"\nresults in using {name} with {len(mesh.vertices())} vertices")
    nb_points_list = [1000, 10000, 100000, 500000]
    durations_list = []

    positions = mesh.positions()
    kdtree = scipy.spatial.cKDTree(positions)

    # nearest vertex using kdtree
    for nb_points in nb_points_list:
        query_points = np.random.rand(nb_points, 3)

    # nearest surface point using PyGEL3D
    for nb_points in nb_points_list:
        query_points = np.random.rand(nb_points, 3)
        print(f"{nb_points:06d} points:")
        # nearest vertex kdtree force
        start = time.time()
        distances_kdtree = kdtree.query(query_points)[0]
        duration = time.time() - start
        print(f"  kdtree.query           took {duration:0.5g} seconds")

        # nearest point on surface using PyGEL3D
        start = time.time()
        distances_pygl3d = np.array(
            [mdist.signed_distance(query_point) for query_point in query_points]
        )
        duration = time.time() - start
        print(f"  mdist.signed_distance  took {duration:0.5g} seconds")

        # nearest vertex brute force
        if query_points.shape[0]*positions.shape[0] < 1000000000:
            start = time.time()
            distances_brute = np.sqrt(
                np.min(np.sum((query_points[None, :, :] - positions[:, None, :])**2, axis=2), axis=0)
            )
            duration = time.time() - start
            durations_list.append(duration)
            print(f"  brute force            took {duration:0.5g} seconds")

I get

results in using bunny with 34834 vertices
001000 points:
  kdtree.query           took 0.03048 seconds
  mdist.signed_distance  took 0.070649 seconds
  brute force            took 1.2068 seconds
010000 points:
  kdtree.query           took 0.29087 seconds
  mdist.signed_distance  took 0.61576 seconds
  brute force            took 18.007 seconds
100000 points:
  kdtree.query           took 2.9475 seconds
  mdist.signed_distance  took 6.0059 seconds
500000 points:
  kdtree.query           took 14.715 seconds
  mdist.signed_distance  took 30.229 seconds

results in using tetre with 4 vertices
001000 points:
  kdtree.query           took 0.001929 seconds
  mdist.signed_distance  took 0.031801 seconds
  brute force            took 0.0010002 seconds
010000 points:
  kdtree.query           took 0.0030487 seconds
  mdist.signed_distance  took 0.15002 seconds
  brute force            took 0.0060039 seconds
100000 points:
  kdtree.query           took 0.034344 seconds
  mdist.signed_distance  took 1.3283 seconds
  brute force            took 0.032322 seconds
500000 points:
  kdtree.query           took 0.18276 seconds
  mdist.signed_distance  took 6.9019 seconds
  brute force            took 0.067097 seconds

we see that for the bunny mesh (with 34834 vertices) gel.MeshDistance.signed_distance is about twice slower than querying the nearest vertex using scipy's scipy.spatial.cKDTree.query, which seems reasonable. However using a mesh with much fewer vertices tetra.obj that has only 4 vertices, using signed_distance becomes about 40 times slower than using the kdtree and about 100 slower than a brute force nearest vertex search. I suspect that having to do the loop over the points from python induces significant overhead. Maybe having a vectorized signed_distance method that can take a list of point as a numpy array and returns a numpy array would yield a large acceleration. Does that seem easy to implement ?

janba commented 3 years ago

Hi Martin,

Thanks for the interest and the timings!

You are quite right. It would likely be more efficient to do as you suggest, and the change is also not difficult, so I applied it but to the branch I am currently working on:

https://github.com/janba/GEL/tree/gel_with_graphs

I do not expect the main branch to change, so it should be safe to switch if you want to try it out.

Very quick test:

from PyGEL3D import gel

m2 = gel.obj_load("/Users/andreas/GEL/data/cornell_box/small_sphere.obj") gel.bbox(m2) (array([-0.01, -0.01, -0.01]), array([0.01, 0.01, 0.01])) dst2 = gel.MeshDistance(m2) dst2.ray_inside_test([0,0,0,2,2,2,.0,.0,.009,100,0,0]) array([1, 0, 1, 0], dtype=int32) dst2.signed_distance([0,0,0,2,2,2,.0,.0,.009,100,0,0]) array([-8.628489e-03, 3.454246e+00, -8.628491e-04, 9.999000e+01], dtype=float32)

Best Andreas

On 9 Jan 2021, at 17.32, Martin de La Gorce notifications@github.com<mailto:notifications@github.com> wrote:

Hi, Thanks you for the great work. I would like to query the distance to a simple mesh from many points but is too slow for the the use I have.

I did some timing measurement using this code:

import time

from PyGEL3D import gel import numpy as np import scipy.spatial

for name in ["bunny", "tetra"]: filename = f"D:\{name}.obj"

filename = r"D:\bunny.obj"

mesh = gel.obj_load(filename)
mdist = gel.MeshDistance(mesh)
print(f"\nresults in using {name} with {len(mesh.vertices())} vertices")
nb_points_list = [1000, 10000, 100000, 500000]
durations_list = []

positions = mesh.positions()
kdtree = scipy.spatial.cKDTree(positions)

# nearest vertex using kdtree
for nb_points in nb_points_list:
    query_points = np.random.rand(nb_points, 3)

# nearest surface point using PyGEL3D
for nb_points in nb_points_list:
    query_points = np.random.rand(nb_points, 3)
    print(f"{nb_points:06d} points:")
    # nearest vertex kdtree force
    start = time.time()
    distances_kdtree = kdtree.query(query_points)[0]
    duration = time.time() - start
    print(f"  kdtree.query           took {duration:0.5g} seconds")

    # nearest point on surface using PyGEL3D
    start = time.time()
    distances_pygl3d = np.array(
        [mdist.signed_distance(query_point) for query_point in query_points]
    )
    duration = time.time() - start
    print(f"  mdist.signed_distance  took {duration:0.5g} seconds")

    # nearest vertex brute force
    if query_points.shape[0]*positions.shape[0] < 1000000000:
        start = time.time()
        distances_brute = np.sqrt(
            np.min(np.sum((query_points[None, :, :] - positions[:, None, :])**2, axis=2), axis=0)
        )
        duration = time.time() - start
        durations_list.append(duration)
        print(f"  brute force            took {duration:0.5g} seconds")

I get

results in using bunny with 34834 vertices 001000 points: kdtree.query took 0.03048 seconds mdist.signed_distance took 0.070649 seconds brute force took 1.2068 seconds 010000 points: kdtree.query took 0.29087 seconds mdist.signed_distance took 0.61576 seconds brute force took 18.007 seconds 100000 points: kdtree.query took 2.9475 seconds mdist.signed_distance took 6.0059 seconds 500000 points: kdtree.query took 14.715 seconds mdist.signed_distance took 30.229 seconds

results in using tetre with 4 vertices 001000 points: kdtree.query took 0.001929 seconds mdist.signed_distance took 0.031801 seconds brute force took 0.0010002 seconds 010000 points: kdtree.query took 0.0030487 seconds mdist.signed_distance took 0.15002 seconds brute force took 0.0060039 seconds 100000 points: kdtree.query took 0.034344 seconds mdist.signed_distance took 1.3283 seconds brute force took 0.032322 seconds 500000 points: kdtree.query took 0.18276 seconds mdist.signed_distance took 6.9019 seconds brute force took 0.067097 seconds

we see that for the bunny meshhttps://github.com/janba/GEL/blob/master/data/bunny.obj (with 34834 vertices) gel.MeshDistance.signed_distance is about twice slower than querying the nearest vertex using scipy's scipy.spatial.cKDTree.query, which seems reasonable. However using a mesh with much fewer vertices tetra.objhttps://github.com/janba/GEL/blob/master/data/tetra.obj that has only 4 vertices, using signed_distance becomes about 40 times slower than using the kdtree and about 100 slower than a brute force nearest vertex search. I suspect that having to do the loop over the points from python induces significant overhead. Maybe having a vectorized signed_distance method that can take a list of point as a numpy array and returns a numpy array would yield a large acceleration. Does that seem easy to implement ?

— You are receiving this because you are subscribed to this thread. Reply to this email directly, view it on GitHubhttps://github.com/janba/GEL/issues/45, or unsubscribehttps://github.com/notifications/unsubscribe-auth/AAXE6QZOMJPEM2M4EMJLHXLSZCAJ7ANCNFSM4V3WODNA.

martinResearch commented 3 years ago

great , thank you!