Auto-guessing conversion of numpy arrays to scalar image vs. to tensor image seems fragile

anntzer commented 2 years ago

Component pydip 3.3.0

Describe the bug When given a 3D numpy array, diplib can convert it either to a scalar image or to a tensor image depending on the exact shape; the logic is at https://github.com/DIPlib/diplib/blob/7600ba4519432e176cc14327cf4d92b0d0f63b57/pydip/src/image.cpp#L142-L151 Unfortunately that means that some inputs become unsuitable for downstream functions, e.g. (the actual case I ran into) diplib.EuclideanSkeleton(three_dim_array) will fail iff the first or last dim of the array is <10, which is rather confusing (AFAICT, the current workaround is to write diplib.EuclideanSkeleton(diplib.Image(three_dim_array, None))).

If you really want to keep that guessing behavior (it's hard for me to judge whether it's appropriate), perhaps(?) the python bindings for diplib.EuclideanSkeleton (and similar functions) could go through a small wrapper function with a custom type declaration for in that always converts to scalar images, not to tensor images.

To Reproduce If applicable, include complete, runnable code to reproduce the behavior.

import diplib as d, numpy as np
d.EuclideanSkeleton(np.ones((9, 9, 9), bool))  # fails
d.EuclideanSkeleton(np.ones((10, 10, 10), bool))  # works

System information:

What OS do you have? fedora37
If PyDIP, what version of Python do you have? 3.10
If DIPimage, what version of MATLAB do you have? (type ver at the MATLAB command prompt, and copy-paste the result here).
If compiled from sources, what compiler and version did you use?

crisluengo commented 2 years ago

Yes, this is awkward. We have discussed it earlier in #25.

The thing is that without this guessing, you'd get unexpected behavior when working with color images. Fact is, putting a multi-valued image into a plain array leaves us with an ambiguity.

I don't know where the <10 limit came from, there could be an argument for this to be <=4 (color images typically have 3 or 4 channels). We could also make this into a global variable that the user can set (and set to 0 to always produce scalar images).

To have functions that work only on scalar images do a different casting would be a lot of work, and it would introduce confusion (sometimes conversion is different) and uncertainty (does this function convert this way or the other?). I'm not keen on implementing that.

Arguably the better solution is to explicitly cast arrays to images before using a DIPlib function. Because the DIPlib function will return a dip.Image anyway, it might be worth while to separate the sections of code using DIPlib and those not using DIPlib by using explicit casts at the start and end of the section.

crisluengo commented 1 year ago

I hope this is a better compromise.

If you wish, you can now do dip.SetTensorConversionThreshold(0) to always convert arrays to scalar images.

anntzer commented 1 year ago

Thanks for providing such a knob, it should be helpful here.

anntzer commented 1 year ago

@crisluengo I still gave a try at tagging EuclideanSkeleton so that when given a numpy array it never treats it as a tensor image (as that interpretation can never be correct). The following patch seems to work for me:

diff --git a/pydip/src/image.cpp b/pydip/src/image.cpp
index 80219bf6..1a6e10b9 100644
--- a/pydip/src/image.cpp
+++ b/pydip/src/image.cpp
@@ -577,6 +577,13 @@ void init_image( py::module& m ) {
    img.def( "__lt__", []( dip::Image const& lhs, py::object const& rhs ) { return dip::Lesser( lhs, ImageOrPixel( rhs )); }, py::is_operator() );
    img.def( "__le__", []( dip::Image const& lhs, py::object const& rhs ) { return dip::NotGreater( lhs, ImageOrPixel( rhs )); }, py::is_operator() );

+   auto si = py::class_< dip::detail::ScalarImage >( m, "_ScalarImage", "A thin wrapper class for scalar images" );
+   // Constructor that takes a Python raw buffer
+   si.def( py::init( []( py::buffer& buf ) { return dip::detail::ScalarImage{ BufferToImage( buf, false ) }; } ) );
+   // If given a diplib Image, we take it as is without forcing it to be scalar.
+   si.def( py::init( []( dip::Image& img ) { return dip::detail::ScalarImage{ img }; } ) );
+   py::implicitly_convertible< py::buffer, dip::detail::ScalarImage >();
+
    // Some new functions useful in Python
    m.def( "Create0D", []( dip::Image::Pixel const& in ) -> dip::Image {
              return dip::Image( in );
diff --git a/pydip/src/morphology.cpp b/pydip/src/morphology.cpp
index c7cfc1a8..c1bba307 100644
--- a/pydip/src/morphology.cpp
+++ b/pydip/src/morphology.cpp
@@ -346,9 +346,13 @@ void init_morphology( py::module& m ) {
    m.def( "BinaryAreaClosing", py::overload_cast< dip::Image const&, dip::Image&, dip::uint, dip::uint, dip::String const& >( &dip::BinaryAreaClosing ),
           "in"_a, py::kw_only(), "out"_a, "filterSize"_a = 50, "connectivity"_a = 0, "edgeCondition"_a = dip::S::BACKGROUND );

-   m.def( "EuclideanSkeleton", py::overload_cast< dip::Image const&, dip::String const&, dip::String const& >( &dip::EuclideanSkeleton ),
+   m.def( "EuclideanSkeleton",
+          [](dip::detail::ScalarImage const& in, dip::String const& endPixelCondition, dip::String const& edgeCondition )
+          { return dip::EuclideanSkeleton(in.image, endPixelCondition, edgeCondition); },
           "in"_a, "endPixelCondition"_a = dip::S::NATURAL, "edgeCondition"_a = dip::S::BACKGROUND );
-   m.def( "EuclideanSkeleton", py::overload_cast< dip::Image const&, dip::Image&, dip::String const&, dip::String const& >( &dip::EuclideanSkeleton ),
+   m.def( "EuclideanSkeleton",
+          [](dip::detail::ScalarImage const& in, dip::Image& out, dip::String const& endPixelCondition, dip::String const& edgeCondition )
+          { return dip::EuclideanSkeleton(in.image, out, endPixelCondition, edgeCondition); },
           "in"_a, py::kw_only(), "out"_a, "endPixelCondition"_a = dip::S::NATURAL, "edgeCondition"_a = dip::S::BACKGROUND );

    m.def( "CountNeighbors", py::overload_cast< dip::Image const&, dip::uint, dip::String const&, dip::String const& >( &dip::CountNeighbors ),
diff --git a/pydip/src/pydip.h b/pydip/src/pydip.h
index 3bfe87b7..76fd73df 100644
--- a/pydip/src/pydip.h
+++ b/pydip/src/pydip.h
@@ -46,6 +46,13 @@ namespace py = pybind11;
  *
  */

+namespace dip {
+   namespace detail {
+      struct ScalarImage {
+         dip::Image image;  // NOTE: I am not sure about diplib's ownership semantics.
+      };
+   }
+}

 // Declaration of functions in other files in the interface
 void init_image( py::module& m );

Does that seem reasonable to you? If yes I can (slowly...) give a try at tagging all relevant functions to use ScalarImage at the numpy->diplib conversion stage.

crisluengo commented 1 year ago

That is a clever way of changing that behavior.

But I'm not thrilled by the idea of the divergent behavior. You put some array as input to the skeleton, and get a 3D scalar image out. Then you put that same array as input to something else, say a Gaussian filter, and you get a 2D multi-channel image out. I think that's surprising behavior.

I'd rather encourage people to explicitly cast their arrays to dip.Image objects. Making the conversion explicit will remove all ambiguity and all surprises.

anntzer commented 1 year ago

I would say that what's more surprising behavior is that EuclideanSkeleton works perfectly fine when given a (10, 10, 10) array and not when given a (3, 3, 3) array. (In my case these were volume patches generated by some other code, and in fact during initial testing I never put in such a small array and only ran into the crash when a colleague tried the code on a bigger dataset.) Obviously now I know better and can rely on explicit conversion and/or call SetTensorConversionThreshold, but I would guess that I won't be the only person tripped by this. Different behaviors between different functions (EuclideanSkeleton vs. Gaussian) seem less problematic to me (although not perfect) than different behaviors at a non-trivial size cutoff.

OTOH, coming from the Python world, I certainly don't have much intuition on the diplib data model, so I'm OK with just leaving the discussion at that.

crisluengo commented 1 year ago

Yes, I agree, it's surprising behavior when an array is converted to a color image if you don't expect it. But it's also surprising if an array is not converted to a color image if you do expect it.

In your example, of a 3x3x3 patch being passed to a function, the same problem would have happened if you passed it to dip.Gauss() and expected smoothing along all three dimensions, but got smoothing only along two. Actually, you should be happy that you ran into this problem with a function that takes only scalar images, because you got an error message instead of subtly wrong data!

I think the best solution is to be consistent with how arrays are converted to images, and describe that behavior in the documentation (which we're already done). I might be wrong, I'm not a novice user, and probably not a typical user either. I'm sure I'll change my mind if more people complain about this. ;)

DIPlib / diplib

Auto-guessing conversion of numpy arrays to scalar image vs. to tensor image seems fragile #125