privong / pymccorrelation

Correlation coefficients with uncertainties
GNU General Public License v3.0

Speed-up Kendall's tau w/ censoring #8

Open privong opened 4 years ago

privong commented 4 years ago

The implementation of the Kendall's tau method for censored data is slow, likely due in part to the nested for loops. The code should be reviewed to see whether it can be sped up (by vectorization or other approaches).

privong commented 4 years ago

Line-by-line profiling of the Kendall's tau code indicates that approximately 80% of the run time (for Nboot=1000) occurs in the for loops beginning here: https://github.com/privong/pymccorrelation/blob/bb0e31360f8479f9d110b297db227cae90581f76/pymccorrelation/pymccorrelation.py#L129 and continuing through line 149.
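For reference, this kind of per-line measurement can be reproduced with the line_profiler package; the function name and call signature below are placeholders rather than the actual names in pymccorrelation.py:

import numpy as np
from line_profiler import LineProfiler
# Placeholder import: substitute the actual censored Kendall's tau routine
# containing the nested for loops.
from pymccorrelation.pymccorrelation import kendall_IFN86

x = np.random.rand(100)
y = np.random.rand(100)
xlim = np.zeros(100)  # placeholder censoring flags
ylim = np.zeros(100)

lp = LineProfiler()
profiled = lp(kendall_IFN86)  # wrap the function to record per-line timings
profiled(x, y, xlim, ylim)    # placeholder call signature
lp.print_stats()              # per-line hit counts and time fractions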

It might be possible to create a rank matrix using numpy vectorized operations (with broadcasting).

For example, a 2D rank matrix can be set up this way:

import numpy as np

def m_comp(a, b):
    # Comparison matrix: res[i, j] is 1.0 where b[j] > a[i], else 0.0
    res = np.zeros((len(a), len(b)))
    for i in range(len(a)):
        for j in range(len(b)):
            res[i, j] = b[j] > a[i]
    return res

Or it can be vectorized with:

b[:] > a[:,np.newaxis]

The latter is about 10x faster for a and b arrays that are 5 elements long; when N~1000, the vectorized version is about 600x faster.
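That kind of comparison can be reproduced with timeit; the array size below is illustrative:

import numpy as np
from timeit import timeit

def m_comp(a, b):
    # Same loop-based comparison matrix as above
    res = np.zeros((len(a), len(b)))
    for i in range(len(a)):
        for j in range(len(b)):
            res[i, j] = b[j] > a[i]
    return res

rng = np.random.default_rng(0)
a = rng.random(1000)
b = rng.random(1000)

t_loop = timeit(lambda: m_comp(a, b), number=10)
t_vec = timeit(lambda: b[:] > a[:, np.newaxis], number=10)
print(f"loop: {t_loop:.2f} s  vectorized: {t_vec:.4f} s  ratio: {t_loop / t_vec:.0f}x")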

(Figure: speed_scalings — run time scaling of the loop-based vs. vectorized comparison as a function of array size)

Perhaps with some indexing, something like the latter could replace most of the for loops in the code?
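To make the idea concrete, here is a sketch of the plain (uncensored) Kendall's tau computed entirely with broadcasting. This is not the censored generalization implemented in the package; that version would additionally need the detection/limit flags folded into the pairwise matrices, but the same broadcasting pattern applies:

import numpy as np

def kendall_tau_vectorized(x, y):
    # Illustrative uncensored Kendall's tau with no explicit Python loops.
    x = np.asarray(x)
    y = np.asarray(y)
    n = len(x)
    # Pairwise sign matrices: a[i, j] = sign(x[i] - x[j]), b[i, j] = sign(y[i] - y[j])
    a = np.sign(x[:, np.newaxis] - x[np.newaxis, :])
    b = np.sign(y[:, np.newaxis] - y[np.newaxis, :])
    # Every unordered pair appears twice in the full matrix sum, so dividing
    # by n*(n-1) recovers the usual normalization over n*(n-1)/2 pairs.
    return np.sum(a * b) / (n * (n - 1))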

privong commented 4 years ago

Failing a successful implementation of that (or in addition to it), the bootstraps are independent and could easily be parallelized.
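A minimal sketch of that, using scipy.stats.kendalltau as a stand-in for the censored statistic (which would need to be substituted in):

import numpy as np
from multiprocessing import Pool
from scipy.stats import kendalltau

def one_bootstrap(args):
    # One independent bootstrap realization: resample pairs with replacement
    # and compute the statistic on the resampled data.
    x, y, seed = args
    rng = np.random.default_rng(seed)
    idx = rng.integers(0, len(x), len(x))
    tau, _ = kendalltau(x[idx], y[idx])
    return tau

def bootstrap_tau(x, y, nboot=1000, nproc=4):
    # Each realization is independent, so they can be farmed out to a pool.
    args = [(x, y, seed) for seed in range(nboot)]
    with Pool(nproc) as pool:
        taus = pool.map(one_bootstrap, args)
    return np.array(taus)

On platforms that spawn worker processes (e.g. Windows and macOS), the call to bootstrap_tau() would need to sit under an if __name__ == "__main__": guard.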

privong commented 3 years ago

For datasets of around 500 points and Nboot~100, numpy tries to allocate around 30 GB of memory for a vectorized method that is currently in testing, so vectorization alone may not be a sufficient approach. The vectorized job could be split up and executed via multiprocessing.Pool.imap() (with chunksize set) to take advantage of both vectorization and multiple cores.
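A sketch of that combination, again with the plain tau standing in for the censored statistic: each work item is a chunk of bootstrap realizations handled in a vectorized way, so the intermediate arrays scale with the chunk size rather than Nboot, and the chunks are spread across cores with imap():

import numpy as np
from functools import partial
from multiprocessing import Pool

def tau_for_chunk(seeds, x, y):
    # Vectorize over the realizations in this chunk only: the pairwise
    # comparison arrays have shape (chunk, N, N) instead of (Nboot, N, N).
    n = len(x)
    m = len(seeds)
    rng = np.random.default_rng(seeds[0])
    idx = rng.integers(0, n, size=(m, n))  # resampled indices for the whole chunk
    xb, yb = x[idx], y[idx]                # shape (m, n)
    a = np.sign(xb[:, :, np.newaxis] - xb[:, np.newaxis, :])
    b = np.sign(yb[:, :, np.newaxis] - yb[:, np.newaxis, :])
    return np.sum(a * b, axis=(1, 2)) / (n * (n - 1))

def bootstrap_tau_chunked(x, y, nboot=1000, chunk=50, nproc=4):
    # Split the realizations into chunks; each chunk is one work item, so
    # Pool's own chunksize can stay at 1 (each item is already a batch).
    chunks = [list(range(i, min(i + chunk, nboot))) for i in range(0, nboot, chunk)]
    worker = partial(tau_for_chunk, x=x, y=y)
    with Pool(nproc) as pool:
        out = pool.imap(worker, chunks, chunksize=1)
        return np.concatenate(list(out))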