dask / dask-image

Distributed image processing
http://image.dask.org/en/latest/

Add support for ndimage.map_coordinates #235

Open jni opened 3 years ago

jni commented 3 years ago

This could be done in two stages:

  1. Add support assuming that the number of coordinates is small and fits in worker memory. This is useful when grabbing a small number of coordinates from a large distributed array (which happens to be my use case! 😊), and should be relatively easy to do (a small illustration follows after this list).
  2. Add support when the coordinates array is also distributed. This is more challenging.
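
For context, an illustrative aside: dask already supports pointwise lookups at integer coordinates across chunks via .vindex, so stage 1 would essentially be the interpolating analogue of that for a small, in-memory coordinate array. A minimal sketch:

import numpy as np
import dask.array as da

image = da.from_array(np.arange(100.).reshape((10, 10)), chunks=5)

# Integer-coordinate pointwise lookups already work across chunks:
image.vindex[[2, 0], [4, 3]].compute()
array([24.,  3.])

# Stage 1 would accept fractional coordinates such as
# [[2.4, 0.1], [4.0, 3.0]] and interpolate between pixels.
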
jakirkham commented 3 years ago

This seems like a map_overlap case. Guess the question is how much overlap (probably a function of spline order)?

Maybe I'm missing something, but what makes the Distributed case harder? It seems like this should be separable by chunks of coordinates. Or am I missing something?

GenevieveBuckley commented 3 years ago

This seems like a map_overlap case. Guess the question is how much overlap (probably a function of spline order)?

Yes, except that this is another case where the output isn't going to be the same shape as the input array, which complicates things. We had planned to have a discussion about this in general - I've sent an email to sort out scheduling for next week.

import numpy as np
import scipy.ndimage as ndi
import dask.array as da

image = da.from_array(np.arange(100.).reshape((10, 10)), chunks=5)

# (`inds` is a 2 x n coordinate array; its definition is not shown here)
ndi.map_coordinates(np.array(image), inds)
array([ 3.75660377, 24.        ])  # correct output from scipy

image.map_blocks(ndi.map_coordinates, coordinates=inds, drop_axis=[0, 1]).compute()
array([ 3.75660377, 24.        ])  # looks ok, but won't be robust to edge effects between array chunks

image.map_overlap(ndi.map_coordinates, depth=3, coordinates=inds, drop_axis=[0, 1]).compute()
array([18.36856823,  1.        ])  # nope!
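
A hedged aside on the snippet above: scipy's default spline order is 3, and its prefilter is global, which is part of what makes chunked evaluation tricky. For order=1, a correct result can be had by slicing a small window around each point out of the dask array (dask resolves the chunk boundaries in the slice) and interpolating locally; point_lookup below is a hypothetical helper, not an existing API:

import numpy as np
import scipy.ndimage as ndi

def point_lookup(image, point, order=1):
    # Slice a window just large enough for the interpolation stencil,
    # then interpolate in the window's local coordinate frame.
    margin = order + 1
    lo = np.maximum(np.floor(point).astype(int) - margin, 0)
    hi = np.minimum(np.floor(point).astype(int) + margin + 2, image.shape)
    window = image[tuple(slice(l, h) for l, h in zip(lo, hi))].compute()
    return ndi.map_coordinates(window, (point - lo)[:, None], order=order)[0]

# Agrees with ndi.map_coordinates(np.array(image), inds, order=1) point by point:
np.array([point_lookup(image, p) for p in np.asarray(inds).T])
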
GenevieveBuckley commented 3 years ago

Maybe I'm missing something, but what makes the Distributed case harder? It seems like this should be separable by chunks of coordinates. Or am I missing something?

I don't think you're missing anything, it seems like it should be achievable.

GenevieveBuckley commented 3 years ago

I think we need to add the overlap depth to the indices we're looking up, but I can't quite wrap my head around what boundary conditions we need.

GenevieveBuckley commented 3 years ago

E.g.:

depth_value = 3
image.map_overlap(ndi.map_coordinates, depth=depth_value, coordinates=inds + depth_value, boundary=0.0, drop_axis=[0, 1]).compute()
array([ 5.09632784, 24.        ])  # still not correct
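
One reading of why a uniform shift can't work: map_overlap calls the function once per padded chunk, so the coordinates would need translating into each chunk's own local frame, and each point should only be evaluated in the chunk that contains it. Illustratively, assuming a 2D image (this is the bookkeeping involved, not a fix):

import numpy as np

# Offsets of each chunk along each axis, from the dask chunk structure:
offsets = [np.cumsum((0,) + c[:-1]) for c in image.chunks]

i, j = 0, 0  # chunk grid position (illustrative)
origin = np.array([offsets[0][i], offsets[1][j]])

# Local frame of that chunk's depth-padded block; a single global
# shift of `inds` cannot be right for every chunk at once:
local_inds = inds - origin[:, None] + depth_value
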
jni commented 3 years ago

@jakirkham the issue I'm preempting (but which might indeed not be super serious) is that the coordinates I want to extract could be in any chunk, and presented in any order. So I can't just chunk my input; I have to somehow partition it according to the chunk of the raw data that I need to access.

Or I guess I could send all chunks from the coordinates to all chunks of the raw data. But that seems wasteful.
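
A sketch of that partitioning step, assuming the coordinates fit in memory (assign_points_to_chunks is a hypothetical helper): the chunk boundaries are known from .chunks, so each point can be assigned to its chunk with one searchsorted per axis, and the points then grouped, in any order, by that key:

import numpy as np

def assign_points_to_chunks(coords, chunks):
    # coords: (ndim, npoints) array; chunks: the dask array's .chunks.
    # Returns the per-axis chunk index of each point.
    bounds = [np.cumsum(c) for c in chunks]
    return np.stack([np.searchsorted(b, coords[d], side='right')
                     for d, b in enumerate(bounds)])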

m-albert commented 3 years ago

I'd also think that "stage 2" is not easy to solve optimally. As for stage 1 (which should cover most use cases for now): the problem I'd see with directly applying map_overlap and dropping the image axes is that the full image is gathered as the input to map_coordinates. So first the coordinates would need to be assigned to the chunks they map into.

However, applying map_overlap between a chunked coordinate array and the image would probably be inefficient, since many unmapped chunks would be loaded unnecessarily. For applications like this it would be nice to be able to mask entire chunks so that dask graphs could be culled smartly (as suggested by @GenevieveBuckley here https://github.com/dask/dask/issues/7652, see also https://github.com/dask/dask-image/issues/217).

Alternatively, for every coordinate the required input slice could be determined (see dask_image.ndinterp.affine_transform for how this depends on the order). Then the coordinates would need to be sent to different tasks, and determining how to do this best is probably a nontrivial optimization problem. A simple (though far from optimal) solution could be to group them by the image chunk they fall into, which seems like a natural choice. Then for each group a task could be defined that precisely slices the input array (compared to map_overlap, this would also avoid constructing overlaps along the dimensions where they're not needed).
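
A rough sketch of such a per-group task (names hypothetical; the implementation that landed in the PR linked at the end of this thread may differ): given the points that fall in one chunk, slice only their bounding box plus an order-dependent margin and interpolate in local coordinates:

import numpy as np
import scipy.ndimage as ndi

def lookup_group(image, pts, order=1):
    # pts: (ndim, k) coordinates that all fall in one image chunk.
    margin = order + 1  # assumption; the exact margin depends on the spline
    lo = np.maximum(np.floor(pts.min(axis=1)).astype(int) - margin, 0)
    hi = np.minimum(np.ceil(pts.max(axis=1)).astype(int) + margin + 1,
                    image.shape)
    region = image[tuple(slice(l, h) for l, h in zip(lo, hi))].compute()
    return ndi.map_coordinates(region, pts - lo[:, None], order=order)

Wrapping each such call in dask.delayed rather than computing eagerly would give one task per non-empty group, with the caveats about nested computation discussed further down the thread.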

jakirkham commented 3 years ago

Yeah, was thinking about this when I wrote that last night. This sounds kind of like a shuffle, but I have been spending a fair bit of time on that use case, so I may just be seeing use cases that are not there.

m-albert commented 3 years ago

Yeah, was thinking about this when I wrote that last night. This sounds kind of like a shuffle, but I have been spending a fair bit of time on that use case, so I may just be seeing use cases that are not there.

True, seems like a similar use case!

A not-too-bad approximation to the "stage 2" case could probably be to take a more or less optimized "stage 1" implementation as described above and apply it with map_blocks over the input coordinate dask array. The scheduler would then take care of grouping tasks working on related image chunks.

m-albert commented 3 years ago

A not-too-bad approximation to the "stage 2" case could probably be to take a more or less optimized "stage 1" implementation as described above and apply it with map_blocks over the input coordinate dask array. The scheduler would then take care of grouping tasks working on related image chunks.

Thinking again, this would probably not be quite as straightforward: if the coordinates are given as a dask array, the coordinates themselves, and how they map onto the input array, wouldn't be known at scheduling time. Maybe launching tasks from within tasks could be useful here.
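
For reference, distributed does allow launching tasks from within tasks via worker_client. A very rough sketch of a per-block task that resolves runtime coordinate values against the image (assuming image has been persisted on the cluster; lookup_block is hypothetical):

import numpy as np
import scipy.ndimage as ndi
from dask.distributed import worker_client

def lookup_block(coord_block, image):
    # Runs on a worker with one in-memory block of coordinates; opens
    # a client so a graph built from runtime values can be computed.
    lo = np.maximum(np.floor(coord_block.min(axis=1)).astype(int) - 2, 0)
    hi = np.minimum(np.ceil(coord_block.max(axis=1)).astype(int) + 2,
                    image.shape)
    region = image[tuple(slice(l, h) for l, h in zip(lo, hi))]
    with worker_client() as client:
        window = client.compute(region).result()
    return ndi.map_coordinates(window, coord_block - lo[:, None], order=1)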

jakirkham commented 3 years ago

Yeah, that's why I'm thinking of shuffle: it also makes decisions based on values that are not known until computed. People have been spending a lot of time optimizing that problem, so we could benefit from that work quite easily if we are able to articulate our problem in those terms.
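
One hedged guess at what that articulation could look like (an assumption, not an agreed design): put the coordinates in a dask dataframe, derive a chunk-id column from the values, and shuffle on it so that points hitting the same image chunk land in the same partition:

import numpy as np
import pandas as pd
import dask.dataframe as dd

# coords_dask: hypothetical (npoints, 2) dask array of point coordinates
df = dd.from_dask_array(coords_dask, columns=['y', 'x'])

bounds = [np.cumsum(c) for c in image.chunks]
nx = len(image.chunks[1])

def chunk_id(part):
    # The shuffle key: flat index of the image chunk each point falls
    # in; like any shuffle key, it is only known at compute time.
    iy = np.searchsorted(bounds[0], part['y'], side='right')
    ix = np.searchsorted(bounds[1], part['x'], side='right')
    return pd.Series(iy * nx + ix, index=part.index)

df['chunk'] = df.map_partitions(chunk_id, meta=('chunk', 'i8'))
shuffled = df.shuffle(on='chunk')  # co-locates points per image chunk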

m-albert commented 3 years ago

@jakirkham Do you have a link to some more details about the shuffle problem and how existing solutions to it can be used?

In the meantime, https://github.com/dask/dask-image/pull/237 suggests an implementation for use case 1.