ivan-pi / kdtree2

A kd-tree implementation in Fortran by Matthew B. Kennel
Other
10 stars 3 forks source link

Benchmarks #7

Open ivan-pi opened 3 years ago

ivan-pi commented 3 years ago

While the results of comparing vs Scipy are encouraging, before I go too far with the refactoring it would be smart to benchmark against other Fortran kd-trees:

and potentially also againt some well known C/C++ libraries:

ivan-pi commented 3 years ago

The base results for the kd-tree (compiled with -O3) by Matthew Kennel are:

$ ./time
 Dimension =            3
 Tree build (s):    3.41600017E-03
 Tree query (s):    1.05800014E-03
 Dimension =            8
 Tree build (s):    4.14900016E-03
 Tree query (s):    1.39079997E-02
 Dimension =           16
 Tree build (s):    5.63300028E-03
 Tree query (s):   0.218334988    

Results for fortran-kdtree:

 Dimension =            3
 Tree build (s):    1.72240008E-02
 Tree query (s):    2.98899971E-03
 Dimension =            8
 Tree build (s):    1.78660005E-02
 Tree query (s):    6.87720031E-02
 Dimension =           16
 Tree build (s):    2.24789977E-02
 Tree query (s):    1.19430590    
Driver code ```Fortran program kdtree_benchmark use kdtree_mod implicit none real(8), allocatable :: data(:,:) ! m by n real(8), allocatable :: queries(:,:) ! m by r type(kdtree_type) :: tree real :: t0, t1, t2 integer, parameter :: m(3) = [3, 8, 16] integer, parameter :: n = 10000, r = 1000 integer, allocatable :: idxs(:,:) ! nn by r integer :: i, idx allocate(idxs(1,r)) do i = 1, 3 write(*,*) "Dimension = ", m(i) allocate(data(m(i),n)) call random_number(data) allocate(queries(m(i),r)) call random_number(queries) ! Populate tree call cpu_time(t0) call tree%build(data) call cpu_time(t1) ! Query random vectors do idx = 1, r call tree%search(queries(:,idx),idxs(:,idx)) end do call cpu_time(t2) write(*,*) "Tree build (s): ", t1 - t0 write(*,*) "Tree query (s): ", t2 - t1 deallocate(data) deallocate(queries) end do end program ```
ivan-pi commented 3 years ago

Results for coretran:

$ ./kdtree_benchmark_coretran 
 Dimension =            3
 Tree build (s):    9.62999929E-03
 Tree query (s):    5.50929978E-02
 Dimension =            8
 Tree build (s):    5.86700439E-03
 Tree query (s):   0.254123986    
 Dimension =           16
 Tree build (s):    1.20519996E-02
 Tree query (s):   0.819300950    
Driver code and build instructions ```text git clone https://github.com/leonfoks/coretran.git cd coretran mkdir build && cd build cmake -DCMAKE_BUILD_TYPE=RELEASE -DBUILD_SHARED_LIBS=OFF ../src/ make all # I had to fix a few errors first cd .. gfortran -O3 -I ./include kdtree_benchmark_coretran.f90 -L ./lib/ -lcoretran -o kdtree_benchmark_coretran ``` ```Fortran program kdtree_benchmark use variableKind, only: r64 use m_KdTree, only: KdTree, KdTreeSearch implicit none real(r64), allocatable :: data(:,:) ! n by m real(r64), allocatable :: queries(:,:) ! r by m type(KdTree) :: tree type(KdTreeSearch) :: search real :: t0, t1, t2 integer, parameter :: m(3) = [3, 8, 16] integer, parameter :: n = 10000, r = 1000 integer, allocatable :: idxs(:,:) ! nn by r integer :: i, idx allocate(idxs(r,1)) do i = 1, 3 write(*,*) "Dimension = ", m(i) allocate(data(n,m(i))) call random_number(data) allocate(queries(r,m(i))) call random_number(queries) ! Populate tree call cpu_time(t0) tree = KdTree(data) call cpu_time(t1) ! Query random vectors do idx = 1, r idxs(idx,1) = search%nearest(tree,data,queries(idx,:)) end do call cpu_time(t2) write(*,*) "Tree build (s): ", t1 - t0 write(*,*) "Tree query (s): ", t2 - t1 deallocate(data) deallocate(queries) end do end program ```
ivan-pi commented 3 years ago

I've added the following codeblock (in my local copies) to make sure all trees are querying the same data set:

  call random_seed(size=s)
  allocate(seed(s))
  seed = 1234
  call random_seed(put=seed)

Results:

(Edit: I realized later, that kdtree2 was built in single precision, while the other two were using double. However, rerunning the base result in double precision did not appear to change the runtime).

(Edit2: I noticed that coretran results are slower for the trees in 3 and 8 dimensions. This might be related to the fact that coretran uses the opposite memory arrangement than the other two codes).

ivan-pi commented 3 years ago

I was able to get my FLANN bindings working again (but only for float...). To install FLANN with apt: sudo apt-get install libflann-dev Compile the Fortran executable: gfortran -O3 flann.f90 kdtree_benchmark_flann.f90 -lflann flann.f90.txt

Driver code ```fortran program kdtree_benchmark_flann use iso_c_binding use flann implicit none real(c_float), allocatable :: data(:,:) ! n by m real(c_float), allocatable :: queries(:,:) ! r by m type(FLANNParameters) :: params type(c_ptr) :: index_id = c_null_ptr real(c_float) :: speedup integer(c_int) :: ires real :: t0, t1, t2 integer(c_int), parameter :: nn = 1 integer, parameter :: m(3) = [3, 8, 16] integer, parameter :: n = 10000, r = 1000 integer(c_int), allocatable :: idxs(:,:) real(c_float), allocatable :: dists(:,:) integer :: i, s integer, allocatable :: seed(:) call random_seed(size=s) allocate(seed(s)) seed = 1234 call random_seed(put=seed) ! params comes initialized by default params%algorithm = FLANN_INDEX_KDTREE_SINGLE params%trees = 1 params%log_level = FLANN_LOG_INFO allocate(idxs(nn,r)) ! idxs = 0 allocate(dists(nn,r)) ! dists = 0 do i = 1, 3 write(*,*) "Dimension = ", m(i) ! FLANN expects a pointer to a rows x cols matrix stored in ! row-major order (one feature/data-point on each row). ! Since Fortran follows column major ordering, we must ! swap the dimensions. allocate(data(m(i),n)) call random_number(data) allocate(queries(m(i),r)) call random_number(queries) ! Populate tree call cpu_time(t0) ! call c_f_pointer(index_id,index_int) index_id = flann_build_index_float(data,n,m(i),speedup,params) call cpu_time(t1) ! Query random vectors ires = flann_find_nearest_neighbors_index_float(index_id,queries,r,idxs,dists,nn,params) call cpu_time(t2) write(*,*) "Tree build (s): ", t1 - t0 write(*,*) "Tree query (s): ", t2 - t1 ires = flann_free_index_float(index_id,params) deallocate(data) deallocate(queries) end do end program ```

Results:

$ ./a.out
 Dimension =            3
 Tree build (s):    2.71600019E-03
 Tree query (s):    1.50110004E-02
 Dimension =            8
 Tree build (s):    1.25530008E-02
 Tree query (s):    1.56069994E-02
 Dimension =           16
 Tree build (s):    9.34499875E-03
 Tree query (s):   0.197361007       

Timings are close to Fortran kdtree2 :smirk:.

dongli commented 3 years ago

Very interesting comparison! For fortran-kdtree, I haven't optimized it intensively. It looks I have more work to do. : P