clEsperanto / pyclesperanto_prototype

GPU-accelerated bio-image analysis focusing on 3D+t microscopy image data
http://clesperanto.net
BSD 3-Clause "New" or "Revised" License

axis auto-transposition between python and opencl #49

Open haesleinhuepf opened 3 years ago

haesleinhuepf commented 3 years ago

Hi future-self, hi @jni , hi fans of axis-transposition,

coming back to your recent comments on https://github.com/clEsperanto/pyclesperanto_prototype/pull/48:

At the moment, I'm often using code like

image = cle.push_zyx(np_arr)

result = cle.gaussian_blur(image, sigma_x=3, sigma_y=4, sigma_z=5)

I do that because I'm heavily confused when it comes to multi-dimensional arrays. You know, in Java we use array[x][y][z], in CLIJ-OpenCL we do it similarly, and in MATLAB and Python it's array[z][y][x]. The explicit terminology helps me with debugging and maintaining functionality. This explicit x,y,z terminology appears to be not so popular in Python code (subjective impression).
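For illustration, here is the NumPy side of that mismatch (standard NumPy behavior, nothing clesperanto-specific): the first axis is the slowest-varying one, which in imaging terms is z.

```python
import numpy as np

# A 3D NumPy array: axis 0 is the slowest ("z" in imaging terms),
# axis 2 the fastest ("x"). The shape therefore reads as (z, y, x).
volume = np.zeros((5, 10, 20))  # 5 planes, 10 rows, 20 columns

print(volume.shape)     # (5, 10, 20) -> (z, y, x)
print(volume[0].shape)  # (10, 20)    -> one z-plane, indexed [y, x]
```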

I see we could change the API to:

result = cle.gaussian_blur(np_arr, [3, 4, 5])

or

result = cle.gaussian_blur(np_arr, [5, 4, 3])

(I'm really not sure!)

I'm just afraid that changing this would cause weird behaviour when people switch between eco-systems (like I do all the time). Thus, I'm wondering how to make things clear and ideally self-explanatory. Maybe even backwards-compatible.

Every best-practice, hint, suggestion is welcome. [How] do other libraries deal with such issues?

Cheers, Robert

jni commented 3 years ago

@haesleinhuepf I agree it's a mess and difficult, but for sure parameter names should not have any kind of implicit order that is not also encoded in the array. Specifically, the naming of axes is not present in any way in the NumPy array, or in the PyOpenCL Image either, I think. The only thing that we do know is the axis order, and so by using a list for the sigmas, which is itself ordered, we can keep track of which sigma corresponds to which axis. By contrast, using sigma_x=, I literally have no idea which axis you are talking about.

The clearest exposition of the arbitrariness of axis ordering I've come across is this one:

https://github.com/scikit-image/scikit-image/pull/3031#issuecomment-390941115

(You can scroll above it for more context, but the context is very long 😂.)
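jni's point can be seen in how SciPy's ndimage handles this: `scipy.ndimage.gaussian_filter` accepts the sigmas as a sequence in the same order as the array's axes, so the pairing between sigma and axis is unambiguous without any axis names (a minimal sketch):

```python
import numpy as np
from scipy import ndimage

image = np.random.rand(5, 10, 20)  # axes ordered (z, y, x) by convention

# One sigma per axis, in the same order as the array's axes:
# sigma[0] applies to axis 0 (the 5-plane axis), and so on.
blurred = ndimage.gaussian_filter(image, sigma=[0.5, 1.0, 2.0])
print(blurred.shape)  # (5, 10, 20)
```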

haesleinhuepf commented 3 years ago

I see. And I would love to agree. My only concern is that we speak so much about z-planes, x/y/z coordinates, rows, columns, image-width and -height. Thus, there should be something corresponding in the API. I'll see if I can make an optional/additional sigma parameter, wherever there are sigma_x, sigma_y and sigma_z. I think it's doable with some python magic 😉

haesleinhuepf commented 3 years ago

I thought about that for a while and came up with a potential solution: With this change in plugin_function (which annotates almost all functions in pyclesperanto), we can call methods like gaussian_blur in different ways:

The classical clij way, a.k.a. the language independent bare minimum:

cle.gaussian_blur(test, result, 1, 1, 0)

The keyword-way:

result = cle.gaussian_blur(test, sigma_x=1, sigma_y=1, sigma_z=0)

The array-way (Note, the array-order is z-y-x):

cle.gaussian_blur(test, result, sigma=[0, 1, 1])

The single number way

cle.gaussian_blur(test, result, sigma=1)

And all this works without changing the signature of gaussian_blur:

def gaussian_blur(source : Image, destination : Image = None, sigma_x : float = 0, sigma_y : float = 0, sigma_z : float = 0):
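A minimal sketch of how such a decorator could work (hypothetical names; the real plugin_function in the linked branch differs): accept an extra `sigma` keyword, scalar or z-y-x sequence, and expand it into the sigma_x/y/z parameters the wrapped function declares.

```python
import functools

def plugin_function(func):
    """Sketch: expand an optional `sigma` keyword (scalar or z-y-x
    sequence) into the sigma_x/y/z parameters of the wrapped function."""
    @functools.wraps(func)
    def wrapper(*args, sigma=None, **kwargs):
        if sigma is not None:
            try:
                sz, sy, sx = sigma       # sequence given in z-y-x order
            except TypeError:
                sx = sy = sz = sigma     # single number applies to all axes
            kwargs.update(sigma_x=sx, sigma_y=sy, sigma_z=sz)
        return func(*args, **kwargs)
    return wrapper

@plugin_function
def gaussian_blur(source, destination=None, sigma_x=0, sigma_y=0, sigma_z=0):
    return (sigma_x, sigma_y, sigma_z)   # stand-in for the real kernel call

print(gaussian_blur(None, sigma=[0, 1, 2]))  # (2, 1, 0)
print(gaussian_blur(None, sigma=1))          # (1, 1, 1)
```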

We would need to update the documentation of all affected methods.

What do you think about that @tlambert03 @jni ?

tlambert03 commented 3 years ago

hmm... while that "works", it's very hard to discover, since the signature won't show it at all:

# this is with your branch checked out
In [9]: cle.gaussian_blur?
Signature:
cle.gaussian_blur(
    source: Union[numpy.ndarray, pyopencl.array.Array, pyopencl._cl.Image],
    destination: Union[numpy.ndarray, pyopencl.array.Array, pyopencl._cl.Image] = None,
    sigma_x: float = 0,
    sigma_y: float = 0,
    sigma_z: float = 0,
)

Yes, you can document it in the docstring, but unless you add additional logic to patch the __signature__ of the function you return from plugin_function, you essentially end up kind of "lying" to python about the function itself (which makes introspection with things like inspect.signature, as shown above when getting help in IPython, kinda broken)

I don't immediately have a suggestion. But I would be very cautious about changing the effective signature of a function without also making sure that inspect.signature still gives a reliable report of the signature.

(also, btw... we should also update this function from getfullargspec to the newer inspect.signature API recommended since Python 3.5)

haesleinhuepf commented 3 years ago

Yeah, I also don't like the shadowed API very much. I nevertheless hope that some python-magic does the trick. I'm not aware of any Java tricks of a similar kind; Python is more powerful in this respect. Hence I vote for fixing this on the python side.

The "correct" fix for this would be changing the root of those issues in all opencl kernels, namely the association of x/y/z with dimension numbers ([e.g. here](https://github.com/clEsperanto/clij-opencl-kernels/blob/master/kernels/absolute_3d_x.cl#L7-L()):

const int x = get_global_id(0);
const int y = get_global_id(1);
const int z = get_global_id(2);

(x and z would have to be exchanged)

After making that change, and after changing all python wrappers, all C++ wrappers and all Java wrappers, users would complain "Why is it now zyx and no longer xyz?" I would also suspect that maximum-z-projections would become faster and maximum-x-projections slower, as the images would be arranged differently in memory. I'm afraid changing this at the root would cause more trouble than it solves. If we find a magic-python solution, that would be great.

tlambert03 commented 3 years ago

I nevertheless hope that some python-magic does the trick.

we can definitely do some magic to improve this here. At the heart of it is the __signature__ attribute, which is what python uses to determine & report the signature of a function (and which can be set dynamically). Here's an example:

from inspect import signature, Signature, Parameter

def f(*args, **kwargs):
    pass

print(signature(f))
# prints "(*args, **kwargs)"

f.__signature__ = Signature(
    [
        Parameter('x', Parameter.POSITIONAL_OR_KEYWORD, annotation=int),
        Parameter('y', Parameter.POSITIONAL_OR_KEYWORD, annotation=str, default='hi'),
    ]
)

print(signature(f))
# prints "(x: int, y: str = 'hi')"

as a sidenote... this string provided by the Signature object is awesome, and should probably be used in your code generation in assistant. More on that in the near future :)

Anyway, I think you're on the right general track here... but I'm not sure we want to go ahead and merge this just yet (and I don't think it's urgent?)

tlambert03 commented 3 years ago

with that trick being said... I'm not sure the ultimate fix here lies in one super flexible signature. I still think some sort of compatibility layer that converts between a few different functions (with non-dynamic signatures) might be the way to go.

haesleinhuepf commented 3 years ago

and I don't think it's urgent?

No, that branch was just a way to express how one could do it. No hurry.

compatibility layer that converts between a few different functions

You mean something like this?

# e.g. in a pyclesperanto.scipy module:
def gauss_filter(input, sigma, output=None):
    return pyclesperanto.gaussian_blur(input, output, sigma[2], sigma[1], sigma[0])

I would favor something like this :-)

tlambert03 commented 3 years ago

exactly something like that :)

jni commented 3 years ago

I don't mind where the conversation has gone but just want to object to:

My only concern is that we speak so much about z-planes, x/y/z coordinates, rows, columns, image-width and -height. Thus, there should be something corresponding in the API.

I don't think this is true, and in fact I think that this is the source of much confusion, because you will find, over and over, that users are saying z when they mean x, or vice-versa, and you just have no way of knowing when they are talking about one or the other. This can be the source of many frustrating bugs, as well as performance bugs where people rearrange their array memory layout for no good reason. The only way to avoid this is if the array object itself has some annotation of which axis is called what, as xarray does.
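The idea behind xarray can be sketched in plain Python (a toy NamedArray for illustration, not xarray's actual API): once axis names travel with the data, "z" always refers to a definite axis, no matter how the memory is laid out.

```python
import numpy as np

# Minimal sketch of the idea behind xarray: carry axis names with the
# data, so "z" always means one specific axis of this array.
class NamedArray:
    def __init__(self, data, dims):
        assert data.ndim == len(dims)
        self.data, self.dims = data, tuple(dims)

    def axis(self, name):
        # Resolve an axis name to its positional index.
        return self.dims.index(name)

vol = NamedArray(np.zeros((5, 10, 20)), dims=("z", "y", "x"))
print(vol.axis("z"), vol.axis("x"))  # 0 2
```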

haesleinhuepf commented 3 years ago

that users are saying z when they mean x

I've never met any user like this. Really. At least routine ImageJ users would never mix those two up. 🤣 But when you switch to MATLAB or Python, that confusion appears. So you see who my target audience is and why I have those terms in the API, literally everywhere. Nevertheless, we should make sure that people coming from the Python side can also use clesperanto conveniently.

jni commented 3 years ago

At least routine ImageJ users would never mix those two up.

Correct me if I'm wrong, but don't ImageJ/ImagePlus objects have a strict definition of what xyz are? Or is it at least consistent throughout the library? If your array object is itself well-annotated, all is well, hence my reference to xarray. But in the case of clesperanto, you have as input an array, which has no concept of axis names, and then sigma_z etc., which, seriously, how am I supposed to know whether it refers to the first or the last axis?

haesleinhuepf commented 3 years ago

Correct! In ImageJ, the axis order is typically fixed as XYCZT. However, there are no 5D filters, or at least no easy-to-use ones, in ImageJ/Fiji. Everything is basically fixed 2D or 3D and you always enter sigmas in X, Y and Z as shown here: (screenshot)

In CLIJ I adopted the same concept so that users have no issues making the transition: (screenshot)

And so it is in clesperanto / napari: (screenshot)

I'm pretty simple minded in this aspect. ;-)

If your array object is itself well-annotated

I think you're indirectly answering a question I've had in mind for quite some time: How can I deal with voxel size in Python? skimage.io.imread apparently doesn't read it from tif files. But xarray does? I can only speculate, because xarray doesn't install on Windows, with errors similar to pyopencl's. But if you say it's worth exploring, I'd like to invest some time and go down that road.

how am I supposed to know whether this refers to the first or last axis?

clij/clesperanto's opencl-kernels are built on the concept of XYZ, almost all of them contain this piece of code specifying the dimension order:

  const int x = get_global_id(0);
  const int y = get_global_id(1);
  const int z = get_global_id(2);

Thinking of this in the context of xarray: If xarray allows me to determine from an array object which axis corresponds to X, Y and Z, we could build in some logic into the push-command that transposes axes if necessary. That doesn't solve our x=1, y=2, z=3 versus shape=[3, 2, 1] problem, but it might be a step in the right direction. What do you think @jni?
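A minimal sketch of what such an axis-aware push could look like (hypothetical function name; assumes the axis names are provided alongside the array, e.g. taken from xarray's .dims):

```python
import numpy as np

def push_with_dims(data, dims, target=("z", "y", "x")):
    """Sketch of an axis-aware push: reorder the array so its axes
    match the z-y-x layout the OpenCL kernels expect."""
    order = [dims.index(d) for d in target if d in dims]
    return np.transpose(data, order)

arr = np.zeros((20, 10, 5))                      # stored as (x, y, z)
pushed = push_with_dims(arr, dims=("x", "y", "z"))
print(pushed.shape)  # (5, 10, 20) -> now (z, y, x)
```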

tlambert03 commented 3 years ago

I think you're indirectly answering a question I have in mind for quite some time: How can I deal with voxel size in python?

just addressing this side question... I think there are two questions here: one is simply how to retrieve the voxel size from the tiff header, and the other is how it is incorporated into some model or API. For the former, I don't think you can get it from skimage? (please correct me if wrong @jni) You can get it directly with the TiffFile class in tifffile (look through examples starting here), or, if you're interested, I have a well-typed pure-python implementation of the OME data model (ome-types) that gives you nice objects to work with in the python world:

# after `pip install ome-types`:
from ome_types import from_tiff

meta = from_tiff("path/to/file.ome.tiff")
print(meta.images[0].pixels.physical_size_x)
print(meta.images[0].pixels.physical_size_x_unit)

for the second issue... there really is no universally agreed upon metadata model in python (xarray doesn't provide it, they just let you annotate your axes as you see fit). It's something we struggle with, since we obviously need some sort of internal metadata model to use in napari, but I don't think anybody wants to limit ourselves to the 5D XYCZT model. So the answer for at least the short term is probably going to be "grab the metadata from your files however you like, and put them where you like"

jni commented 3 years ago

I'm pretty simple minded in this aspect. ;-)

Well, as I mentioned, to me that is more confusing. One way to think about it though, is that z should be the least-contiguous axis. So maybe introspecting arrays is the "best" way to think about auto-transposition in this context?
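The stride-based introspection hinted at above could look like this (a sketch: treat the least-contiguous axis, i.e. the one with the largest stride, as z):

```python
import numpy as np

def guess_zyx_order(arr):
    """Sketch of stride-based introspection: sort axes from least to
    most contiguous, so the slowest-varying axis plays the role of z."""
    return tuple(np.argsort(arr.strides)[::-1])

a = np.zeros((5, 10, 20))   # C-contiguous: axis 0 is slowest
print(guess_zyx_order(a))   # (0, 1, 2) -> axis 0 acts as z

b = np.asfortranarray(a)    # Fortran order flips the contiguity
print(guess_zyx_order(b))   # (2, 1, 0)
```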

But I still think it is not something that translates well between ecosystems, and, to paraphrase @tlambert03, we should aim to speak snake in snake-land. ;)

How can I deal with voxel size in python?

@tlambert03 is right, there's no way to read it in skimage. (Some functions take in a spacing= keyword argument, but it's not many.) There is no unified way across the ecosystem and I think the best bet right now is to watch/participate in (1) the OME-NGFF spec: https://github.com/ome/ngff, and (2) the new imageio API and metadata discussions (https://github.com/imageio/imageio/issues/362, https://github.com/imageio/imageio/issues/569, https://github.com/imageio/imageio/pull/574, and links therein). The goal is that skimage.io will be a tiny shim over the next imageio API.

FirefoxMetzger commented 3 years ago

A random comment from a random person passing by.

For a generic image processing library, the only suitable way to name axes and - by extension - parameters related to these axes is to not name axes. Any naming convention that implies meaning beyond in-memory order is opinionated. Being opinionated will exclude some users from the library, and is not what a generic library tries to do.

For a highly specialized library - that aims to do one thing really, really well - it makes perfect sense to be opinionated because it often means considerable performance gain from further optimizations. For example, cuDNN has a quite strong assumption about image dimensionality: NCHW ([batch_dims, channel, height, width]). The reasoning here is better memory alignment, which is even more crucial on the GPU than it is on the CPU. Especially since GPUs are I/O bottlenecked thanks to the availability of tensor cores (which apparently even speed up FMA ... crazy times).

That said, I tried to work out what this library is trying to accomplish (which is actually pretty ambitious) and couldn't quite figure out where to place it. Indeed, the discussion here reflects that, since there are some arguments towards being a generic library and others towards being a biomedical image processing library (with a focus on volumetric data?). The former camp seems to argue "generic image processing, but with GPU acceleration", the latter "same syntax across commonly used biomedical tools".

Both aims differ on this particular point: a generic library would avoid suggestive names to stay generic, whereas a specialized library would use them to assimilate its syntax to neighboring tools. I guess a good question to ask here is where the focus lies, as being clear on that would make it clear which option to prefer.