privong opened this issue 4 years ago
Line-by-line profiling of the Kendall's tau code indicates that approximately 80% of the run time (for `Nboot=1000`) is spent in the `for` loops beginning here: https://github.com/privong/pymccorrelation/blob/bb0e31360f8479f9d110b297db227cae90581f76/pymccorrelation/pymccorrelation.py#L129 and continuing through line 149.
It might be possible to create a rank matrix using vectorized numpy operations (with broadcasting). For example, a 2D rank matrix can be set up this way:
```python
import numpy as np

def m_comp(a, b):
    # Pairwise comparison matrix: res[i, j] is 1.0 where b[j] > a[i]
    res = np.zeros((len(a), len(b)))
    for i in range(len(a)):
        for j in range(len(b)):
            res[i, j] = b[j] > a[i]
    return res
```
Or it can be vectorized with broadcasting:

```python
b[:] > a[:, np.newaxis]
```
The latter is about 10x faster for `a` and `b` arrays that are 5 elements long, and when N~1000 the vectorized version is about 600x faster.
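As a rough illustration (not from the original discussion), a minimal benchmark along these lines can check both the equivalence and the speedup; the array size, seed, and use of `timeit` are assumptions for demonstration:

```python
import timeit

import numpy as np

def m_comp(a, b):  # loop version from above
    res = np.zeros((len(a), len(b)))
    for i in range(len(a)):
        for j in range(len(b)):
            res[i, j] = b[j] > a[i]
    return res

rng = np.random.default_rng(0)
N = 1000  # assumed size; the issue quotes ~600x at N~1000
a, b = rng.random(N), rng.random(N)

# The broadcast expression produces the same matrix as the loop version
assert np.array_equal(m_comp(a, b), b[:] > a[:, np.newaxis])

t_loop = timeit.timeit(lambda: m_comp(a, b), number=3)
t_vec = timeit.timeit(lambda: b[:] > a[:, np.newaxis], number=3)
print(f"loop: {t_loop:.2f}s  vectorized: {t_vec:.4f}s  ratio: {t_loop / t_vec:.0f}x")
```

As a side benefit, the broadcast version returns a boolean array (1 byte per element) rather than the float64 matrix allocated by `np.zeros` (8 bytes per element), which matters at larger N.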
Perhaps, with some indexing, something like the latter could replace much of the `for`-loop logic in the code?
Failing a successful implementation of that (or in addition to it), the bootstrap iterations are independent and could easily be parallelized, e.g. as sketched below.
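A minimal sketch of per-bootstrap parallelism, assuming a hypothetical `single_bootstrap` helper (not a function in pymccorrelation) and using `scipy.stats.kendalltau` as a stand-in for the package's censored-data tau:

```python
from multiprocessing import Pool

import numpy as np
from scipy import stats

def single_bootstrap(args):
    """One bootstrap iteration: resample pairs, then compute Kendall's tau.

    Hypothetical helper for illustration; the real per-iteration work in
    pymccorrelation would also handle censoring.
    """
    x, y, seed = args
    rng = np.random.default_rng(seed)
    idx = rng.integers(0, len(x), size=len(x))  # resample with replacement
    tau, _ = stats.kendalltau(x[idx], y[idx])
    return tau

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    x = rng.random(500)
    y = x + 0.5 * rng.random(500)
    Nboot = 1000
    with Pool() as pool:
        taus = pool.map(single_bootstrap, [(x, y, s) for s in range(Nboot)])
    print(np.percentile(taus, [16, 50, 84]))  # tau confidence interval
```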
For datasets of around 500 points and `Nboot~100`, numpy tries to allocate around 30 GB of memory for a vectorized method that is currently in testing, so vectorizing alone may not be a sufficient approach. The vectorized job could be split up and executed via `multiprocessing.Pool.imap()` (with `chunksize` set) to take advantage of both vectorization and multiple cores; a sketch follows.
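A rough sketch of that hybrid, under stated assumptions: the `tau_for_batch` helper and its placeholder reduction are hypothetical (the real work would compute the censored-data tau), but the batching pattern is the point, since it bounds each worker's broadcast arrays by the batch size:

```python
from multiprocessing import Pool

import numpy as np

def tau_for_batch(resampled_x):
    """Vectorized work on one small batch of bootstrap resamples.

    Hypothetical stand-in: builds the broadcast comparison matrices for
    one batch at a time, so peak memory scales with the batch size
    rather than with Nboot.
    """
    # resampled_x has shape (batch, N); compare pairs within each resample
    comp = resampled_x[:, :, np.newaxis] > resampled_x[:, np.newaxis, :]
    return comp.sum(axis=(1, 2))  # placeholder reduction per resample

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    x = rng.random(500)
    Nboot, batch = 100, 10  # 10 batches of 10 resamples each
    batches = (x[rng.integers(0, len(x), size=(batch, len(x)))]
               for _ in range(Nboot // batch))
    with Pool() as pool:
        # chunksize controls how many batches each worker takes at once
        results = list(pool.imap(tau_for_batch, batches, chunksize=2))
    print(np.concatenate(results).shape)  # one value per resample: (Nboot,)
```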
The implementation of the Kendall's tau method for censored data is slow, likely due in part to the nested `for` loops. The code should be examined to see whether it can be sped up (by vectorization or other approaches).