image-rs / imageproc

Image processing operations

Refactor Gradients [Discussion] #507

Closed rljacobson closed 4 months ago

rljacobson commented 2 years ago

This PR is intended to facilitate discussion about a proposed refactor. It isn't a complete PR—probably won't even compile.

I suggest the gradient filter module & API be refactored to achieve the following:

A summary of the suggested design

  1. Have a GradientKernel (alt. named GradientMethod) enum with variants Sobel, Scharr, etc.
  2. Change the pub static HORIZONTAL_* and pub static VERTICAL_* kernels to be actual Kernel<i32> struct instances rather than [i32; 9] arrays.
  3. Member functions of GradientKernel return references to the appropriate pub static Kernel<i32> instance.
  4. Move the filtering functions into member functions of GradientKernel. Optionally have free function aliases that take the GradientKernel variant as a parameter. (The Sobel kernel would no longer be distinguished.)
  5. Change the function and purpose of gradient_map so that it applies a function to the gradient vector of each point to produce a single pixel result. Consequently, the member functions gradient_L1_norm, gradient_angle, gradient_L2_norm (with alias gradient_magnitude) trivially reduce to calls to gradient_map.
  6. Gradient operations only work on single channel images. Do away with the functionality currently provided by gradient_map. (Justified by replacing the multichannel functionality with a better, more general imageproc feature.)

More Details

The suggestion is to have a GradientKernel (alt. name could be GradientMethod) enum:

pub enum GradientKernel {
    Sobel,
    Scharr,
    Prewitt,
    Roberts
}
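
A minimal sketch of points 2 and 3, assuming the borrowing Kernel type from imageproc::filter (which carries a lifetime parameter in recent versions). If Kernel::new cannot run in a static initializer, the statics can stay as raw arrays and the accessor can wrap them on demand, as sketched here; Roberts is omitted since it uses 2x2 kernels. All names are illustrative:

use imageproc::filter::Kernel;

// Raw kernel data, unchanged from the current module.
pub static HORIZONTAL_SOBEL: [i32; 9] = [-1, 0, 1, -2, 0, 2, -1, 0, 1];
pub static HORIZONTAL_SCHARR: [i32; 9] = [-3, 0, 3, -10, 0, 10, -3, 0, 3];
pub static HORIZONTAL_PREWITT: [i32; 9] = [-1, 0, 1, -1, 0, 1, -1, 0, 1];

impl GradientKernel {
    // Point 3: each variant hands back its kernel wrapped as a Kernel.
    pub fn horizontal_kernel(&self) -> Kernel<'static, i32> {
        match self {
            GradientKernel::Sobel => Kernel::new(&HORIZONTAL_SOBEL, 3, 3),
            GradientKernel::Scharr => Kernel::new(&HORIZONTAL_SCHARR, 3, 3),
            GradientKernel::Prewitt => Kernel::new(&HORIZONTAL_PREWITT, 3, 3),
            GradientKernel::Roberts => unimplemented!("Roberts uses 2x2 kernels"),
        }
    }
}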

4. Move the filtering functions into member functions of GradientKernel.

The free functions in gradients.rs would be brought into GradientKernel member functions:

impl GradientKernel {
    pub fn horizontal_gradient(&self, image: &GrayImage) -> Image<Luma<i16>> {…}
    pub fn vertical_gradient(&self, image: &GrayImage) -> Image<Luma<i16>> {…}
    pub fn gradient_map<P, F, Q>(&self, image: &GrayImage, f: F) -> Image<Q> where … {…}
    pub fn gradient_magnitude(&self, image: &GrayImage) -> Image<Luma<i16>> {…}
    ⋮
}

5. Change the function and purpose of gradient_map

I'll argue for the existence of a new function in this section. I will argue for the deletion of the current version of gradient_map in the next section, as these are two distinct questions.

I want a function called gradient_map, but I do not want it to do what the current definition of gradient_map does. Instead, it makes sense to me to have a gradient_map function that maps a function over the gradient vector at each point to produce an output image. This would literally map a function onto the gradient, which strikes me as linguistically more accurate. Such a function would be useful to client code and would reduce the computations of gradient_angle and gradient_magnitude to the composition of gradient_map with the appropriate function of the gradient vector, which is satisfyingly orthogonal.
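
A minimal sketch of what that could look like, assuming the horizontal_gradient and vertical_gradient members from the impl block above. The signatures here are illustrative, not the current API; the closure receives the gradient vector (gx, gy) at each point:

use image::{GrayImage, Luma, Pixel};
use imageproc::definitions::Image;

impl GradientKernel {
    // Apply f to the gradient vector (gx, gy) at each point.
    pub fn gradient_map<Q, F>(&self, image: &GrayImage, f: F) -> Image<Q>
    where
        Q: Pixel + 'static,
        F: Fn(i16, i16) -> Q,
    {
        let gx = self.horizontal_gradient(image);
        let gy = self.vertical_gradient(image);
        Image::from_fn(image.width(), image.height(), |x, y| {
            f(gx.get_pixel(x, y).0[0], gy.get_pixel(x, y).0[0])
        })
    }

    // gradient_magnitude then reduces to a gradient_map call;
    // gradient_angle would be the same composition with atan2.
    pub fn gradient_magnitude(&self, image: &GrayImage) -> Image<Luma<i16>> {
        self.gradient_map(image, |gx, gy| {
            Luma([f32::from(gx).hypot(f32::from(gy)) as i16])
        })
    }
}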

6. Gradient operations only work on single channel images

The current version of gradient_map enables an arbitrary function of an n-channel pixel to produce an m-channel pixel for the output. I speculate that this function exists because gradient computations are often needed to produce a "magnitude of rate of change at a point" from a color source image, and since kernel convolution is a single channel operation, there needed to be a way to go from three channels to one. But if magnitude of perceptual rate of change is what is of interest, then the color source image should be converted to luminance before a gradient operation is ever applied. (Note also that any linear function of channels commutes with a channel-wise gradient operation.) In my view, this functionality doesn't belong in the gradients module at all, and in any case it isn't well described by the function name gradient_map.
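
For example, the luminance-first workflow against the existing API looks like this (a sketch; input.png is a placeholder, and sobel_gradients is the current magnitude helper in imageproc::gradients):

use image::imageops::grayscale;
use imageproc::gradients::sobel_gradients;

fn main() -> Result<(), image::ImageError> {
    let rgb = image::open("input.png")?.to_rgb8();
    // Reduce to a single luminance channel first...
    let gray = grayscale(&rgb);
    // ...then apply the single channel gradient operation.
    let magnitude = sobel_gradients(&gray);
    println!("{}x{}", magnitude.width(), magnitude.height());
    Ok(())
}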

This refactor is agnostic about whether, and to where, the existing functionality should move. But I'll say a few words about that anyway. Producing an image via a function of another image's channels can be done by client code. If it is useful functionality and/or is cumbersome for client code to implement, it should be implemented in a generic way in some other module, or in its own module. In fact, I would like to draft a second feature proposal, independent of this one, that implements fast, generic, high-level elementwise mathematical operations on images: add two images, multiply two images, find the max of two images (see the sketch below). But this is a digression.
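
To gesture at what such a module might contain, a hypothetical elementwise combinator could look like this (zip_map is an illustrative name, not an existing imageproc function):

use image::{ImageBuffer, Luma};

type GrayI16 = ImageBuffer<Luma<i16>, Vec<i16>>;

// Hypothetical: apply f to corresponding pixels of two equally-sized images.
fn zip_map<F: Fn(i16, i16) -> i16>(a: &GrayI16, b: &GrayI16, f: F) -> GrayI16 {
    assert_eq!(a.dimensions(), b.dimensions());
    ImageBuffer::from_fn(a.width(), a.height(), |x, y| {
        Luma([f(a.get_pixel(x, y).0[0], b.get_pixel(x, y).0[0])])
    })
}

// e.g. the pixelwise max of two images: zip_map(&a, &b, i16::max)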

A gradient operation is fundamentally a single channel operation. Splitting and combining channels and images is a distinct collection of operations from gradient operations.

ripytide commented 4 months ago

It has been a while since this was first created and some of the points here may no longer be relevant, but I'll try to respond to as many as I can.

The suggestion is to have a GradientKernel (alt. name could be GradientMethod) enum: pub enum GradientKernel { Sobel, Scharr, Prewitt, Roberts }

That sounds like an excellent idea. Having vertical_prewitt, vertical_scharr, and vertical_sobel is quite unruly, as they are differently named methods for different strategies for the same fundamental operation. This would be much more flexible and extensible if all strategies were contained in an enum like your suggested GradientKernel. Then users could simply do vertical_filter(&image, GradientKernel::Sobel) with an impl Into<Kernel> parameter, or vertical_filter(&image, GradientKernel::Sobel.as_array()) without one.
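
A sketch of that shape, repeating the enum so the example is self-contained (vertical_gradient here is a hypothetical free function, named to avoid clashing with the existing imageproc::filter::vertical_filter, which takes a separable 1-d kernel; the kernel values are the module's current VERTICAL_* statics, and Roberts is left out since it uses 2x2 kernels):

use image::{GrayImage, Luma};
use imageproc::definitions::Image;
use imageproc::filter::filter3x3;

pub enum GradientKernel { Sobel, Scharr, Prewitt }

impl GradientKernel {
    // Raw 3x3 vertical kernel for each strategy.
    pub fn as_array(&self) -> &'static [i32; 9] {
        match self {
            GradientKernel::Sobel => &[-1, -2, -1, 0, 0, 0, 1, 2, 1],
            GradientKernel::Scharr => &[-3, -10, -3, 0, 0, 0, 3, 10, 3],
            GradientKernel::Prewitt => &[-1, -1, -1, 0, 0, 0, 1, 1, 1],
        }
    }
}

// Free-standing function style: the strategy is a parameter, not the receiver.
pub fn vertical_gradient(image: &GrayImage, kernel: GradientKernel) -> Image<Luma<i16>> {
    filter3x3(image, kernel.as_array())
}

// Usage: let g = vertical_gradient(&image, GradientKernel::Sobel);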

The free functions in gradients.rs would be brought into GradientKernel member functions:

This I strongly disagree with. It is not consistent with the rest of the library, which mostly uses free-standing functions that take parameters for configuring different options. See my comment above for how I think GradientKernel should be used with the regular free-standing functions.

theotherphil commented 4 months ago

@ripytide your suggestion sounds good to me. I’ll close this PR.