JuliaImages / juliaimages.github.io

Documentation For JuliaImages
https://juliaimages.org
33 stars 56 forks source link

Towards consistent style, part 1: a naming guide #229

Open timholy opened 5 years ago

timholy commented 5 years ago

JuliaImages has grown over several years through the efforts of many people; now that Julia is itself stable, I anticipate new waves of growth in the coming years. To help ensure its health and vibrancy, I think we need to establish some guidelines that will help give such a large, distributed project a unified feel. To me, step 1 of this process is to develop a naming guide and to rename some of our functions or objects accordingly. The sooner we do this the better.

A naming guide should not be arbitrary, but be based on principles. I propose we use this issue to hash out the principles. Here are a few thoughts to get the discussion rolling:

More examples of current names that are consistent with these guidelines:

Some examples of current names that would change due to these guidelines:

Tokazama commented 5 years ago

Should all functions that are only meant to support higher level API begin with an underscore?

Tokazama commented 5 years ago

Should indices_spatial or timeaxis be renamed since they do something similar but for space or time? Something like indices_spatial->spatialaxes or timeaxis -> indices_time.

johnnychen94 commented 5 years ago
zygmuntszpak commented 5 years ago

When dealing with Mathematical Morphology operations we could define

abstract type MorphologicalOperation end

struct Erosion <: MorphologicalOperation end
struct Dilation <: MorphologicalOperation end
struct Opening <: MorphologicalOperation end
struct Closing <: MorphologicalOperation end
struct BottomHat <: MorphologicalOperation end
struct TopHat <: MorphologicalOperation end
struct Laplacian <: MorphologicalOperation end
struct HitAndMiss <: MorphologicalOperation end

and then dispatch on analyze_morphology(::MorphologicalOperation(), img, ...).

This is analogous to what we are starting to do with our histogram operations, i.e. adjust_histogram(...) and, in general, I think we could refactor a lot of the codebase to follow this pattern to reduce the vocabulary of different function names.

timholy commented 5 years ago

Worth pointing out that most people seem to prefer sum(A) to reduce(+, A). Nevertheless, I agree that might be the best solution.

zygmuntszpak commented 5 years ago

I would count myself among the group that would prefer sum(A) to reduce(+,A). I will open another issue where I will go over the current API and propose some alternatives. We could then get community feedback and hopefully strike the right balance.

johnnychen94 commented 5 years ago

As a quick note, I don't mean to vote for im*

Drawbacks of "start with a verb" convention:

good point of im* naming style:


Is there any reference saying that we must use the "start with a verb" convention?

Python packages uses the package name as the prefix to TAB (e.g., plt.TAB), and they have _* and __* naming styles to filter out those package details. So they don't need another im prefix.

But for Julia we don't necessarily do using Images: imrotate or Images.imrotate, just using Images and imTAB gives us all we want as a user

I think im* prefix for those exported methods is important to users. -- Images.TAB get a list of everything.

A list of names with im* prefix -- thinking of im* as a variant of *image but with TAB completion supported, e.g.,:


one reason for not using im* prefix is that images are just arrays. But since images are just arrays, we can argue in the same way that *image suffix is not good naming practice, and in the same way that showimage should be show.

johnnychen94 commented 5 years ago

Which names should be taken?

add noise to an image:

update the image tracker status:

apply_noise looks like redundant since the type n::Noise already indicates this. For this reason, I prefer apply and update!

Deepank308 commented 5 years ago

I think that apply(n::noise, img) and update!(tracker::Tracker, img) are better. This removes redundancy and code will look a lot cleaner. Also, when read aloud - "Apply this noise to this image" and "Update this tracker with this image" sounds good as pointed out here.

So as general convention when we have a function with img and algorithm as its arguments, some_function_name(img, algorithm) sounds good. While if we happen to have img and some_struct as the arguments, some_function(some_struct, img) sounds better.

Just an idea, please guide me if I'm going wrong.

johnnychen94 commented 5 years ago

So as general convention when we have a function with img and algorithm as its arguments, some_function_name(img, algorithm) sounds good. While if we happen to have img and some_struct as the arguments, some_function(some_struct, img) sounds better.

Can you explain this? what are algorithm and some_struct here and why they make a difference in the argument order?

Deepank308 commented 5 years ago

For example :

I think I got it wrong, the agrument order for consistent reading depends more on the function name rather than algo or some_struct

johnnychen94 commented 5 years ago

My understanding is:

They're quite different here. Hypothetically, if there're many methods applying noise:

all these examples still follow the where-what-how pattern

Deepank308 commented 5 years ago

Ok, I got what you are trying to say. But apply!(n::Noise, img) also follows the same pattern and is more clear to read as "apply noise n to image" as opposed to apply!(img, n::noise)

johnnychen94 commented 5 years ago

yes, I have to admit that apply(n::Noise, img) reads better than apply(img, n::Noise)

But this case is quite special in the sense that we can support in-place modification, i.e., apply!(img, n::Noise), where conventionally the first argument will be modified. Not supporting apply(img, n::Noise) will surprise users.

Considering the existence of in-place modification methods, we might not follow the where-what-how pattern; we need follow what-how-where or what-where-how pattern.

Tokazama commented 5 years ago

Recent ideas about a shared functions API may be worth considering for solidifying naming (brought up here and most recently implemented here.

Pros:

Cons:

timholy commented 5 years ago

Isn't that focused on how you handle the problem of several different frameworks needing to collaborate? AFAICT JuliaImages is the one-and-only contender for native image processing in Julia, though I could be wrong. Or were there naming guidelines in that thread? (Sorry, I just looked at the link, the thread as a whole is very long.)

Tokazama commented 5 years ago

I believe the original motivation was something with JuliaData and JuliaStats sharing a lot of functions, but the actual problem being addressed is any sort of shared function naming between packages. We already have this in a lot of ways in ImageCore, ImageMetadata, and ImageAxes. It would seem a lot "cleaner" to have one package that implements these and is the focus of API discussions. Perhaps the easiest way is to just move a couple things over to ImageCore and make it the central hub for all API.

However, I think the ability to work with non-image focused communities is what makes this idea most appealing. Specifically, I would love it if there was a common interface between ImageFiltering and NNlib (kind of like Flux's version of ImageFiltering). A lot of work has gone into both and I think this may be an easy start to introducing compatibility between the two. I'm sure there are other similar situations to this that would be beneficial to multiple communities.

johnnychen94 commented 5 years ago

One use case:

I was keeping thinking of the possibilities of including/incorporating deep learning related methods into/with JuliaImages. They are implemented quite different from the current codebase, but they can share the same API.

We can hold something like ImagesAPI.jl in JuliaImages I guess.

timholy commented 5 years ago

To my thinking, that's basically what ImageCore is. It provides a lot of stub traits that then get fleshed out by ImageAxes and ImageMetadata. Of course it also provides colorview and channelview functionality, and maybe you're arguing that's too much for one package?

johnnychen94 commented 5 years ago

Some further explanation. I think it's quite different from the purpose of ImageCore.jl, it helps connects different packages together by providing a unified API.

I can imagine a set of APIs such as

enhance!(::GenericImage, ::Algorithm [, ::ComputingResource])
enhance([::Type, ] ::GenericImage, ::Algorithm [, ::ComputingResource])

restore!(::GenericImage, ::Algorithm [, ::ComputingResource])
restore([::Type, ] ::GenericImage, ::Algorithm [, ::ComputingResource])

binarize!(::GenericImage, ::BinarizationAlgorithm [, ::ComputingResource])
binarize([::Type, ] ::GenericImage, ::BinarizationAlgorithm [, ::ComputingResource])

denoise!(::GenericImage, ::NoiseReductionAlgorithm [, ::ComputingResource])
denoise([::Type, ] ::GenericImage, ::NoiseReductionAlgorithm [, ::ComputingResource])

Assume there are DenoisingConvolutionalNeuralNetworks.jl and DenosingGenerativeAdversarialNetworks.jl packages, as long as they adopt the same denoise API, they can work seamlessly with ImageDenoise.jl

Another promising feature here is both enhance and binarize can share the same implementation. enhance, restore and denoise can share the same implementation as well. After all noise removal is just one image restoration/enhancement method.

Tokazama commented 5 years ago

I think the stub traits would make the bulk of what would go in something like ImagesAPI.jl. But I think things like timedim from ImageAxes and potentially even properties from ImageMetadata could be added (which as far as I understand it are currently not inherited from ImageCore).

I'm not as concerned about the amount of functionality in one package as much as I am about simplifying interoperability with other ecosystems and easing maintenance. I mention the use of NNlib because I think it's a good example of where a package may not want to even depend on ImageCore but it would be mutually beneficial to share an API.

Tokazama commented 5 years ago

@johnnychen94 , thanks for providing those examples. That's pretty much what I was thinking of. Although, I think there are cases where we could make it as simple as

function binarize! end
timholy commented 5 years ago

Yes, ImageCore is essentially limited to traits, and deliberately avoids algorithms. If there is need for "sharing" algorithms across many packages, that sounds reasonable.

johnnychen94 commented 5 years ago

@timholy can you create such a repository under JuliaImages as an experimental place to see how this idea works in general? And we can move the detailed discussions there.

But I think we need further and comprehensive thoughts before starting this work.

timholy commented 5 years ago

Done: https://github.com/JuliaImages/ImagesAPI.jl

Tokazama commented 4 years ago

Would it be worth discussing how/if we should implement the BlueStyle in JuliaImages? I'd personally rather use something that others have already actively discussed and implemented than reinventing everything.

johnnychen94 commented 4 years ago

Since currently there aren't many contributors in JuliaImages. I think the focus of this issue is more on names as a part of APIs of JuliaImages.jl, instead of coding styles. For me, package-wise consistency would be acceptable.

timholy commented 4 years ago

Replying to https://github.com/JuliaImages/Images.jl/issues/767#issuecomment-475911504, which was posted at around the time I was wrapping up my work on the debugger and therefore not paying much attention.

Drawbacks of "start with a verb" convention:

  • images are just arrays, and some image-manipulate functions such as rotate resize will likely cause a conflict with other packages.

I think that's OK. Ideally they'd be consistent in what they do, and we can always scope by module.

  • if we want to avoid potential naming conflicts, we need to add image suffix to it, e.g., showimage, rotateimage, resizeimage. But it looks lengthy for these "trivial" functions. -- of course lengthy isn't a real problem.

I don't like adding image unnecessarily, especially since images are just arrays. But showimage has a special meaning: to display as an image. You can also show the same array as text using the normal array display machinery.

good point of im* naming style:

  • From a user's viewpoint, im* naming makes it easy to use TAB to filter out image-related functions.

That's the best thing about it. But since we have color types to indicate "this is an image," the line between arrays and images is otherwise blurry (and should be). The flip side of your argument is to look at how many people avoid using ImageFiltering just because of the "image" part of the name.

Is there any reference saying that we must use the "start with a verb" convention?

There's no must about it, but see:

Also linking https://github.com/JuliaLang/julia/issues/20402 as one of the best "cleanup" issues I know of.

Finally, if you just do

for n in names(Base)
    println(n)
end

I think you'll be impressed at what a large fraction of operations that compute something use a verb for their name. (Again, I'm not proposing a verb for everything, I'm saying it makes sense specifically for operations that compute something .)

Python packages uses the package name as the prefix to TAB (e.g., plt.TAB), and they have _* and __* naming styles to filter out those package details. So they don't need another im prefix.

As you probably know by now, that works in Julia too. ImageCore.[TAB][TAB] returns a lot of stuff now that we reexport FixedPointNumbers and Colors. (The first TAB is for completion-up-to-the-next-ambiguous-point, the second is "show me my options.")

But for Julia we don't necessarily do using Images: imrotate or Images.imrotate, just using Images and imTAB gives us all we want as a user

I think im* prefix for those exported methods is important to users. -- Images.TAB get a list of everything.

I think the im* prefix comes from a language (probably Matlab?) that lacks namespaces. If a new user wants to rotate an image, wouldn't rotate be the first thing to try?

Tokazama commented 4 years ago

I finally took some time to really evaluate my original comment concerning consistency with methods that refer to axes in some way. I've put together a small table with some proposed changes. The proposed changes are motivated by:

  1. Consistency in naming (name of dimension is always the first part of a method name)
  2. A more clear description of what the method does (e.g., axis should return the axis).
Where name is the label for a given dimension(s) (e.g., time or spatial) Current Syntax Current Behavior Proposed Syntax Proposed Behavior
namedim Returns Int corresponding to dimension with same name same same
nameaxis Returns keys of axis corresponding to namedim name_keys
indices_name Returns values of axis corresponding to namedim name_values or name_indices same
nothing nothing name_axis Return axis (e.g., axes(x, namedim(x))

There are some other relevant methods (like coords_spatial, size_spatial) but I thought these would be a good starting point.

Edit: I may be over analyzing this at the risk of just making unhelpful breaking changes. Just down vote if this is totally off base.

Tokazama commented 4 years ago

I've thrown together some of this here https://github.com/Tokazama/NamedIndicesMeta.jl in case people are interested. If people like the idea then I can polish it up and make a PR for the image related stuff.