Closed rmflight closed 2 years ago
Never mind. Did better timing with timeit, and it turns out icikt is only marginally slower.
import timeit
setup_code = '''
import icikt
import scipy.stats
import numpy as np
largeArray = np.random.randn(50000, 2)'''
icikt_code = '''
corr, pval, tmax = icikt.icikt(largeArray[:, 0], largeArray[:, 1])'''
print (timeit.timeit(setup = setup_code, stmt = icikt_code, number = 500) / 500)
# 0.012456782705965452
scipy_code = '''
corr, pval = scipy.stats.kendalltau(largeArray[:, 0], largeArray[:, 1])'''
print(timeit.timeit(setup = setup_code, stmt = scipy_code, number = 500) / 500)
# 0.01194520887604449
So they are actually pretty much the same.
May need to profile and cythonize other parts of icikt.
I disagree. Look at those numbers. They diverge at the 3rd decimal barely, and that's for a 50000 entry array, too.
However, @hunter-moseley opened, I'll let them close it.
Just to compare, I ran the R package base ici_kt function (the Rcpp basic comparison), for 50000 entries, running 5 times, I get a time of 0.02 seconds. So I don't think we can do a whole lot better than what it already is.
Just to update, I switched the tests in the actions workflow to use timeit, and you can see in all three actions for different versions of python, the execution times are almost identical.
Here is one of them: https://github.com/MoseleyBioinformaticsLab/icikt/runs/4433921210?check_suite_focus=true#step:7:50
I'm closing this because I don't see any advantage to profiling it now.
Currently
icikt
gives a run time of ~ 0.06 s on two slices of an array.When I run those same two slices through
scipy.stats.kendalltau
, I get a run time of 0.006 s, or about 10X faster.Either something needs to be twigged in the compile options for the _kendall_dis.pyx, or something weird is going on.