Open apasto opened 1 year ago
Updating on this with some ideas (disregarding that almost 2 years passed since the ones above :sweat_smile: )
Let's consider this (and the serial just above this, where we could also apply the same):
(for the record: we may get rid of `pool.apply` and `kwds`, see #5)
We are passing `'a': A.to_numpy(), 'b': B.to_numpy()` to each worker.
Quite trivially, we may slice first and then pass just the window:
def extract_window(a, e_i, hw_y_i, hw_x_i):
    # rows e_i[0] +/- hw_y_i, columns e_i[1] +/- hw_x_i (both ends included)
    return a[e_i[0] - hw_y_i: e_i[0] + hw_y_i + 1, e_i[1] - hw_x_i: e_i[1] + hw_x_i + 1]

kwds={
    'a': extract_window(A.to_numpy(), element, window_halfwidth_y_i, window_halfwidth_x_i),
    'b': extract_window(B.to_numpy(), element, window_halfwidth_y_i, window_halfwidth_x_i),
    'e_i': element,
    'hw_x_i': window_halfwidth_x_i,
    'hw_y_i': window_halfwidth_y_i
}
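To make the idea concrete, here is a minimal self-contained sketch of "slice first, pass the window". The arrays `a_full` and `b_full` stand in for `A.to_numpy()` and `B.to_numpy()`, and `regress_window` is a hypothetical worker (a plain `np.polyfit` fit, not the project's `wrap_linregress()`). It also assumes the element is far enough from the border that the window indices never go negative.

```python
import numpy as np


def extract_window(a, e_i, hw_y_i, hw_x_i):
    """Return the (2*hw_y_i + 1, 2*hw_x_i + 1) window of `a` centred on e_i."""
    return a[e_i[0] - hw_y_i: e_i[0] + hw_y_i + 1,
             e_i[1] - hw_x_i: e_i[1] + hw_x_i + 1]


def regress_window(a, b):
    """Hypothetical worker: least-squares slope and intercept of b against a."""
    slope, intercept = np.polyfit(a.ravel(), b.ravel(), 1)
    return slope, intercept


# Stand-ins for A.to_numpy() and B.to_numpy() (assumption: 2D float arrays)
a_full = np.arange(100, dtype=float).reshape(10, 10)
b_full = 2.0 * a_full + 1.0

element = (4, 5)      # (row, column) of the window centre
hw_y, hw_x = 1, 2     # window half-widths

# Only the windows go into kwds, not the full arrays
kwds = {
    'a': extract_window(a_full, element, hw_y, hw_x),
    'b': extract_window(b_full, element, hw_y, hw_x),
}
window_shape = kwds['a'].shape

# Serial call; with a pool this would be pool.apply(regress_window, kwds=kwds)
slope, intercept = regress_window(**kwds)
```

Since the worker only sees the window, it no longer needs `e_i`, `hw_x_i` and `hw_y_i` for slicing; they would only be kept if the result has to be tagged with its position.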
As of 77b9571d1f9126a195535799649d8425cfd81cdb (implementing parallel calls to regression), slicing the A, B arrays to extract the rolling window is done inside `wrap_linregress()`, which is then called with `pool.apply()`.
I am under the impression that slicing before and passing the contents of slices to each worker would be less wasteful, memory-wise. This could be implemented by adding a 3rd dimension to two A, B-like arrays and assigning a slice to each vector along this new dimension (thanks to @pogmat for his precious advice - hopefully I have got it right).
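One possible reading of the "3rd dimension" idea, sketched with NumPy's `sliding_window_view` (available since NumPy 1.20). This is only an illustration under my own assumptions: `a_full` and `b_full` stand in for `A.to_numpy()` and `B.to_numpy()`, and border elements without a full window are simply left out.

```python
import numpy as np
from numpy.lib.stride_tricks import sliding_window_view

hw_y, hw_x = 1, 2                      # window half-widths
win = (2 * hw_y + 1, 2 * hw_x + 1)     # full window shape: (3, 5)

# Stand-ins for A.to_numpy() and B.to_numpy() (assumption: 2D float arrays)
a_full = np.arange(100, dtype=float).reshape(10, 10)
b_full = 2.0 * a_full + 1.0

# 4D views: one (3, 5) window per valid centre, no data copied yet
a_win = sliding_window_view(a_full, win)   # shape (8, 6, 3, 5)
b_win = sliding_window_view(b_full, win)

# Flatten each window into a vector along a new third dimension.
# Note: this reshape materialises a copy holding every window in full.
a_stack = a_win.reshape(a_win.shape[0], a_win.shape[1], -1)   # shape (8, 6, 15)
b_stack = b_win.reshape(b_win.shape[0], b_win.shape[1], -1)

# The window centred on element (4, 5) sits at index (4 - hw_y, 5 - hw_x)
vec_a = a_stack[4 - hw_y, 5 - hw_x]
vec_b = b_stack[4 - hw_y, 5 - hw_x]
```

Each `a_stack[i, j]` / `b_stack[i, j]` pair could then be handed to a worker. The trade-off is that the stacked arrays hold roughly window-size times as many elements as the originals, so keeping the 4D views and slicing them per task may be the lighter option.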