JuliaImages / ImageBinarization.jl

A Julia package of algorithms for analyzing images and automatically binarizing them into background and foreground.
MIT License
35 stars 8 forks source link

Swap arguments? #23

Closed juliohm closed 5 years ago

juliohm commented 5 years ago

The package looks great, thanks for the contribution!

Would it make sense to swap the arguments? Most Julia APIs I've came across adopt the following convention:

binarize(img, alg)

as opposed to

binarize(alg, img)

I read the first version as "binarize the image with algorithm", the second version doesn't have a clear "spelling", at least for me.

I adopted the above mentioned convention in ImageInpatining.jl for example, and in the internals of GeoStats.jl in multiple places.

zygmuntszpak commented 5 years ago

Thanks for the warm feedback Julio! I've been trying to follow the convention originally outlined here: https://github.com/JuliaImages/Images.jl/issues/772

I actually incorporated your suggestion

Thank you @zygmuntszpak for the thoughtful and exciting proposal. I have a quick question. Are the params in the line

adjust_histogram(op::Equalization, params...)

the parameters of the Equalization operation? Couldn't the parameters be encapsulated in the struct?

struct Equalization <: AbstractHistogramOperation param1 param2 end

Specifically, I've moved any parameters that control the binarization as fields of the struct.

I was intending to do the same in the ImageConstratAdjustment (https://github.com/zygmuntszpak/ImageContrastAdjustment.jl) package.

My vision for the Julia API was that we have a few key operational words, such as binarize or adjust_histogram and that the first function argument would specify which particular algorithm you wanted to select.

If you want to put the img argument first, does that mean you would also prefer:

adjust_histogram(img, Equalization(nbins = 256))

instead of

adjust_histogram(Equalization(nbins = 256),img)

Would love to hear your thoughts as well: @timholy @johnnychen94

juliohm commented 5 years ago

Thank you @zygmuntszpak for the initiative, I am really looking forward to this general API. Answering the question, I believe that having the img as the first argument reads better. Also, if we decide to incorporate ComputationalResources.jl as it is done in other packages like ImageFiltering.jl, we could do:

some_func(resource, img, algo)

So the computational resource comes first, then image, then algorithm. By default, no computational resource is specified, and we could fallback to

some_func(img, algo)

without any possible confusion between "resource" types and "algorithm" types. It may be that choosing the other order can confuse users of what is an algorithm and what is a resource? It would be nice if we could keep these two far away in the list of arguments.

zygmuntszpak commented 5 years ago

I was planning on having the computational resource be specified as a keyword argument. If we follow your suggested convention, then it would look like this:

some_func(img, algo; resource = GPU())

If left unspecified then a default value of resource = CPU() would be utilised.

juliohm commented 5 years ago

Do we have dispatch on keyword arguments in Julia v1.0? In general is it good for performance?

I'm not familiar with the internals of the language. Looking forward to hearing from other members what their opinions are.

On Sat, Mar 9, 2019, 21:21 Dr. Zygmunt L. Szpak notifications@github.com wrote:

I was planning on having the computational resource be specified as a keyword argument. If we follow your suggested convention, then it would look like this:

some_func(img, algo; resource = GPU())

If left unspecified then a default value of resource = CPU() would be utilised.

— You are receiving this because you authored the thread. Reply to this email directly, view it on GitHub https://github.com/zygmuntszpak/ImageBinarization.jl/issues/23#issuecomment-471234539, or mute the thread https://github.com/notifications/unsubscribe-auth/ADMLbcUJAwgrWgTk-ApeHTlu1aCDzEDcks5vVFAHgaJpZM4bm7-_ .

johnnychen94 commented 5 years ago

@juliohm No we only have method dispatch on positional parameters.

https://docs.julialang.org/en/v1/manual/methods/#Note-on-Optional-and-keyword-Arguments-1

Keyword arguments behave quite differently from ordinary positional arguments. In particular, they do not participate in method dispatch. Methods are dispatched based only on positional arguments, with keyword arguments processed after the matching method is identified.


I'm not very familiar with GPU computing details. But if both works as I expect, I vote for the second method -- juliohm's.

For cases we need delicate hybrid CPU and GPU conversion, we could always implement using the first method.


Method 1

zygmuntszpak's idea:

some_func(img, algo; resource = GPU())

requires a manual check on types of resource to make dispatch, e.g.,

function some_func(img, algo; resource = CPU())
    if resource isa GPU
        img = GPUArray(img)
    end

    # implementation details

    if resource isa GPU
        rst_img = collect(rst_img)
    end    
end

Method 2

juliohm's idea:

some_func(resource, img, algo)

doesn't, e.g.,

some_func(resource::GPUArray, img, algo) = some_func_general(GPUArray(img),algo) |> collect
some_func(resource, img, algo) = some_func_general(img,algo)
function some_func_general(img,algo)
    # implementation details
end

zygmuntszpak commented 5 years ago

I'd like to amend Johnny's description of Method 1. I would probably implement it as follows

abstract type AbstractComputationalDevice end
struct CPU <: AbstractComputationalDevice end
struct GPU <: AbstractComputationalDevice end
some_function(img, algorithm; resource::AbstractComputationalDevice = CPU())
    an_implementation(img, algorithm, resource)
end
function an_implementation(img, algorithm, resource::CPU)
 # CPU specific implementation of algorithm
end
function an_implementation(img, algorithm, resource::GPU)
 # GPU specific implementation of algorithm
end

With regard to Method 2, shouldn't the resource be the very last argument to the function and not the first? My understanding of Julio's "spelling" request is that he would like to read it aloud as something like: "binarize the image with algorithm running on CPU" or "binarize the image with algorithm running on GPU".

@evizero Would love to hear your input on this as well.

juliohm commented 5 years ago

I like this approach as well @zygmuntszpak :+1: we could even standardize the name of these implementation functions:

function binarize(img, algo; resource=CPU())
  binarize_impl(img, algo, resource)
end

Initially I thought of putting the resource as the first argument following the recommendations in ComputationalResources.jl, but I think your approach with *_impl functions reads well, is clear and performant.

Evizero commented 5 years ago

With regard to Method 2, shouldn't the resource be the very last argument to the function and not the first?

i think to remember having this discussion with Prof. Holy somewhere a long time ago, and the argument he brought up (if memory serves me) was that the mental model behind having the resource first was that it had conceptual similarities with a function. Also there was the issue of different functions needing a different number of additional arguments, and then the question of ordering comes up.

Personally I think its worth rethinking this. Especially the second point is not as convincing now that keyword arguments for parameters you don't need to dispatch on don't penalize your performance anymore.

but I think your approach with *_impl functions reads well

I think in your example I would prefer the same function name being used. Throughout JuliaML i often allow things to be specified either as keyword or as positional argument and I found this to be reasonably intuitive if its done in moderation.

johnnychen94 commented 5 years ago

If we following this principle of priority:

The validness of this principle comes from the intention of specifying computational resource, otherwise we don't need the AbstractComputationalDevice argument.


One problem with method 1 proposed by @zygmuntszpak is

some_function(gpu(img), algo) will call the cpu version implementation, since kwargs are not counted in method dispatch. This conflicts with the principle of priority -- we didn't pass CPU argument

But the impl* idea is great 👍. With this idea, method 2 could be

some_function(img::GPUArray, algo::AbstractAlgorithm) = some_function_impl(GPU(), img, algo)
some_function(img::AbstractArray, algo::AbstractAlgorithm) = some_function_impl(img, algo)
some_function(resource::CPU, img::AbstractArray, algo::AbstractAlgorithm) 
    = some_function_impl(resource, gather(img), algo)
some_function(resource::GPU, img::AbstractArray, algo::AbstractAlgorithm) 
    = some_function_impl(resource, gpu(img), algo)

some_function_impl(resource::CPU, img, algorithm) = some_function_impl(img, algorithm)
function some_function_impl(resource::GPU, img, algorithm)
 # GPU specific implementation of algorithm
end
function some_function_impl(img, algorithm)
 # generic implementation -- in most cases it can be CPU implementation
end

Users now have two ways to call the function

and this follows the principle of priority


For this reason, I still vote for some_function(resource, img, algo) since this also meets the convention according to words from Juliohm and Evizero.

juliohm commented 5 years ago

Thank you for your thoughts @johnnychen94 , it is really nice to see these alternatives carefully compared. I would like to hear from Prof. @timholy what his opinion is on this general API we are discussing. He will definitely enlighten us.

The *_impl suffix I proposed comes from my background contributing to some Boost C++ libraries. There you can find this _impl pattern everywhere. Of course Julia != C++ so we need to think carefully about it. :+1:

johnnychen94 commented 5 years ago

If removing the *_impl, it looks more concise.

some_function(img::GPUArray, algo::AbstractAlgorithm) = some_function(GPU(), img, algo)
some_function(img::AbstractArray, algo::AbstractAlgorithm) = some_function(CPU(), img, algo)

some_function(resource::GPU, img::AbstractArray, algorithm::AbstractAlgorithm) = 
    some_function(resouce, gpu(img), algorithm)
some_function(resource::CPU, img::GPUArray, algorithm::AbstractAlgorithm) = 
    some_function(resource, Array(img), algorithm)
function some_function(resource::GPU, img::GPUArray, algorithm::AbstractAlgorithm)
 # GPU specific implementation of algorithm
end
function some_function(resource::CPU, img::AbstractArray, algorithm::AbstractAlgorithm)
# generic CPU implementation
end
johnnychen94 commented 5 years ago

Some crazy thoughts, since we pack arguments into the concrete algo, we could even support arbitrary argument order:

Potential problems of this:

Implementation code The following is my experiments, we could create a macro to do all these trivial work, and focus on implementing the last two(or four) methods. Note: * in this implementation, RHS of each method follows the `where`-`what`-`how` order * in this implementation, one-argument is supported, they're cases that this shouldn't be supported ```julia using ImageCore, Colors using CuArrays, GPUArrays abstract type AbstractAlgorithm end #implementation independent abstract type AbstractDenoiseAlgorithm <: AbstractAlgorithm end struct BM3D <: AbstractDenoiseAlgorithm end struct TVL1 <: AbstractDenoiseAlgorithm end struct TVL2 <: AbstractDenoiseAlgorithm end DEFAULT_ALGORITHM = BM3D() ### begin of implementation independent abstract type AbstractComputationalDevice end struct CPU <: AbstractComputationalDevice end struct GPU <: AbstractComputationalDevice end # 1 argument - img is alway required denoise(img::AbstractArray{T,2}) where T = denoise(img, DEFAULT_ALGORITHM) # 2 arguments - resource missing denoise(algo::AbstractAlgorithm, img::AbstractArray{T,2}) where T = denoise(img, algo) denoise(img::GPUArray{T,2}, algo::AbstractAlgorithm) where T = denoise(GPU(), img, algo) denoise(img::AbstractArray{T,2}, algo::AbstractAlgorithm) where T = denoise(CPU(), img, algo) # 2 arguments - algo missing denoise(img::AbstractArray{T,2}, resource::AbstractComputationalDevice) where T = denoise(resource, img) denoise(resource::AbstractComputationalDevice, img::AbstractArray{T,2}) where T = denoise(resource, img, DEFAULT_ALGORITHM) # 3 arguments denoise(algo::AbstractAlgorithm, resource::AbstractComputationalDevice, img::AbstractArray{T,2}) where T = denoise(resource, img, algo) denoise(algo::AbstractAlgorithm, img::AbstractArray{T,2}, resource::AbstractComputationalDevice) where T = denoise(resource, img, algo) denoise(img::AbstractArray{T,2}, algo::AbstractAlgorithm, resource::AbstractComputationalDevice) where T = denoise(resource, img, algo) denoise(img::AbstractArray{T,2}, resource::AbstractComputationalDevice, algo::AbstractAlgorithm) where T = denoise(resource, img, algo) denoise(resource::AbstractComputationalDevice, algo::AbstractAlgorithm, img::AbstractArray{T,2}) where T = denoise(resource, img, algo) ### end of implementation independent # actual implementation denoise(resource::CPU, img::GPUArray{T,2}, algo::BM3D) where T = denoise(resource, Array(img), algo) denoise(resource::GPU, img::AbstractArray{T,2}, algo::BM3D) where T = denoise(resource, CuArray(img), algo) function denoise(resource::CPU, img::AbstractArray{T,2}, algo::BM3D) where T println("denoise noise using CPU with BM3D") end function denoise(resource::GPU, img::GPUArray{T,2}, algo::BM3D) where T println("denoise noise using GPU with BM3D") end denoise(resource::CPU, img::GPUArray{T,2}, algo::TVL1) where T = denoise(resource, Array(img), algo) denoise(resource::GPU, img::AbstractArray{T,2}, algo::TVL1) where T = denoise(resource, CuArray(img), algo) function denoise(resource::CPU, img::AbstractArray{T,2}, algo::TVL1) where T println("denoise noise using CPU with TVL1") end function denoise(resource::GPU, img::GPUArray{T,2}, algo::TVL1) where T println("denoise noise using GPU with TVL1") end ```
all possible way of calling this function works as expected: test code: ```julia cpu_img = ones(3,3) gpu_img = ones(3,3) |> CuArray # 1 argument println("1 argument with Array") denoise(cpu_img) # cpu with BM3D println("1 argument with GPUArray") denoise(gpu_img) # gpu with BM3D println() # 2 arguments println("2 arguments with unspecified resource") println("CPUArray") denoise(cpu_img, TVL1()) # cpu with TVL1 denoise(TVL1(), cpu_img) # cpu with TVL1 println("GPUArray") denoise(gpu_img, TVL1()) # gpu with TVL1 denoise(TVL1(), gpu_img) # gpu with TVL1 println() println("2 arguments with CPU resource") denoise(cpu_img, CPU()) # cpu with BM3D denoise(CPU(), cpu_img) # cpu with BM3D denoise(CPU(), gpu_img) # cpu with BM3D denoise(gpu_img, CPU()) # cpu with BM3D println("2 arguments with GPU resource") denoise(cpu_img, GPU()) # gpu with BM3D denoise(GPU(), cpu_img) # gpu with BM3D denoise(GPU(), gpu_img) # gpu with BM3D denoise(gpu_img, GPU()) # gpu with BM3D println() # 3 arguments println("3 arguments") println("cpu_img with CPU resource:") denoise(cpu_img, TVL1(), CPU()) # cpu with TVL1 denoise(cpu_img, CPU(), TVL1()) # cpu with TVL1 denoise(CPU(), cpu_img, TVL1()) # cpu with TVL1 denoise(CPU(), TVL1(), cpu_img) # cpu with TVL1 denoise(TVL1(), CPU(), cpu_img) # cpu with TVL1 denoise(TVL1(), cpu_img, CPU()) # cpu with TVL1 println() println("cpu_img with GPU resource:") denoise(cpu_img, TVL1(), GPU()) # gpu with TVL1 denoise(cpu_img, GPU(), TVL1()) # gpu with TVL1 denoise(GPU(), cpu_img, TVL1()) # gpu with TVL1 denoise(GPU(), TVL1(), cpu_img) # gpu with TVL1 denoise(TVL1(), GPU(), cpu_img) # gpu with TVL1 denoise(TVL1(), cpu_img, GPU()) # gpu with TVL1 println() println("gpu_img with CPU resource:") denoise(gpu_img, TVL1(), CPU()) # cpu with TVL1 denoise(gpu_img, CPU(), TVL1()) # cpu with TVL1 denoise(CPU(), gpu_img, TVL1()) # cpu with TVL1 denoise(CPU(), TVL1(), gpu_img) # cpu with TVL1 denoise(TVL1(), CPU(), gpu_img) # cpu with TVL1 denoise(TVL1(), gpu_img, CPU()) # cpu with TVL1 println() println("gpu_img with GPU resource:") denoise(gpu_img, TVL1(), GPU()) # gpu with TVL1 denoise(gpu_img, GPU(), TVL1()) # gpu with TVL1 denoise(GPU(), gpu_img, TVL1()) # gpu with TVL1 denoise(GPU(), TVL1(), gpu_img) # gpu with TVL1 denoise(TVL1(), GPU(), gpu_img) # gpu with TVL1 denoise(TVL1(), gpu_img, GPU()) # gpu with TVL1 println() ``` output: ```text 1 argument with Array denoise noise using CPU with BM3D 1 argument with GPUArray denoise noise using GPU with BM3D 2 arguments with unspecified resource CPUArray denoise noise using CPU with TVL1 denoise noise using CPU with TVL1 GPUArray denoise noise using GPU with TVL1 denoise noise using GPU with TVL1 2 arguments with CPU resource denoise noise using CPU with BM3D denoise noise using CPU with BM3D denoise noise using CPU with BM3D denoise noise using CPU with BM3D 2 arguments with GPU resource denoise noise using GPU with BM3D denoise noise using GPU with BM3D denoise noise using GPU with BM3D denoise noise using GPU with BM3D 3 arguments cpu_img with CPU resource: denoise noise using CPU with TVL1 denoise noise using CPU with TVL1 denoise noise using CPU with TVL1 denoise noise using CPU with TVL1 denoise noise using CPU with TVL1 denoise noise using CPU with TVL1 cpu_img with GPU resource: denoise noise using GPU with TVL1 denoise noise using GPU with TVL1 denoise noise using GPU with TVL1 denoise noise using GPU with TVL1 denoise noise using GPU with TVL1 denoise noise using GPU with TVL1 gpu_img with CPU resource: denoise noise using CPU with TVL1 denoise noise using CPU with TVL1 denoise noise using CPU with TVL1 denoise noise using CPU with TVL1 denoise noise using CPU with TVL1 denoise noise using CPU with TVL1 gpu_img with GPU resource: denoise noise using GPU with TVL1 denoise noise using GPU with TVL1 denoise noise using GPU with TVL1 denoise noise using GPU with TVL1 denoise noise using GPU with TVL1 denoise noise using GPU with TVL1 ```