Evizero / Augmentor.jl

A fast image augmentation library in Julia for machine learning.
https://evizero.github.io/Augmentor.jl/

Does Augmentor.jl support multispectral images? #61

Open AlfTetzlaff opened 4 years ago

AlfTetzlaff commented 4 years ago

Hi, I frequently work with multispectral images (say 4-8 bands) and need to augment them. These usually come in the form of 3D arrays. In the source code I saw that the augmentation functions require img::AbstractMatrix, so at first glance I cannot plug in plain arrays. Did I overlook something? Or how can I mangle my arrays into a form that can be processed? Tell me if you think this is better moved to Discourse :)

johnnychen94 commented 4 years ago

If each slice is processed separately, as if it were a 2D image, then you could do this:

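using Augmentor

# img is assumed to be a 3D array with the bands along dims=3.
# Augment each band as a 2D image, then stack the results back along dims=3.
reduce((x, y) -> cat(x, y; dims=3),
       augment.(eachslice(img, dims=3), Ref(FlipX())))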

Ref is used here to shield FlipX() from broadcasting, so that it is treated as a scalar.

I'll find some time to support this when I get back to working on this package. I was too busy this semester.
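
The per-slice idea above can be sketched without Augmentor at all. In the sketch below, `flipx` is a hypothetical stand-in for `augment(slice, FlipX())`, and the band axis is assumed to be the third dimension. One caveat: if the operation were random, calling it once per slice would presumably draw new parameters for every band, so this pattern is safest with deterministic operations.

```julia
# Hypothetical stand-in for a deterministic 2D augmentation (a horizontal flip).
flipx(slice::AbstractMatrix) = reverse(slice; dims=2)

# Toy multispectral image: height 2, width 3, 4 bands (band axis last, by assumption).
img = reshape(collect(1:24), 2, 3, 4)

# Process every band as an independent 2D image ...
bands = [flipx(s) for s in eachslice(img; dims=3)]

# ... and stack the results back along the band axis.
out = cat(bands...; dims=3)
```

Splatting into `cat` does the same job as the pairwise `reduce` above; for a handful of bands either form should be fine.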

AlfTetzlaff commented 4 years ago

Hi, thank you for the help! In the meantime I also came up with a solution, namely reinterpreting to SVector instead of some RGB:

using StaticArrays
# N is the number of image bands, T the element type.
# The bands must lie along the first dimension of raw_img.
img = reinterpret(SVector{N,T}, raw_img)

However, it feels limiting to always have to reinterpret things. Most DL libraries need image data in the form NCHW or NHWC (N: batch size, C: number of channels). It would be convenient to just plug in an array or batch and specify the channel axis. In theory it should also be possible to handle an arbitrary number of dimensions, to support e.g. 3D images; Julia's metaprogramming capabilities should allow this. However, I'm not an expert in that.
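
As a rough illustration of the reinterpret round trip for channel-last data, here is a minimal sketch that uses only Base. It swaps `SVector` for a plain `NTuple` so that no package is needed, assumes an H × W × C layout, and relies on `reinterpret(reshape, ...)`, which requires Julia 1.6 or later. The horizontal flip is just a placeholder for any 2D operation.

```julia
# Toy data in H × W × C layout (assumed): 5 × 6 pixels, 4 bands.
hwc = rand(Float32, 5, 6, 4)

# reinterpret needs the bands of one pixel to be contiguous in memory,
# so move the channel axis to the front first.
chw = permutedims(hwc, (3, 1, 2))                    # 4 × 5 × 6

# View the data as a 5 × 6 matrix whose elements are 4-band pixels.
pix = reinterpret(reshape, NTuple{4,Float32}, chw)

# Any 2D operation now sees one element per pixel; a horizontal flip as an example.
flipped = reverse(pix; dims=2)

# Back to a plain numeric array in H × W × C layout.
out = permutedims(reinterpret(reshape, Float32, flipped), (2, 3, 1))
```

The two `permutedims` calls copy the data, which is the price of starting from a channel-last layout; with channel-first data the reinterpret alone would do.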

johnnychen94 commented 4 years ago

> namely reinterpreting to SVector instead of some RGB:

I'm not sure if I understand this, can you share a complete example?

> It would be convenient to just plug in an array or batch and to specify the channel axis.

There's an "experimental" package, https://github.com/JuliaImages/ImageAxes.jl, based on https://github.com/JuliaArrays/AxisArrays.jl. However, making this work seamlessly would take far more effort than we can spare at the moment, so it is a low priority for us.

AlfTetzlaff commented 4 years ago

> I'm not sure if I understand this, can you share a complete example?

Is the one from #62 enough?

> There's an "experimental" package, https://github.com/JuliaImages/ImageAxes.jl, based on https://github.com/JuliaArrays/AxisArrays.jl. However, making this work seamlessly would take far more effort than we can spare at the moment, so it is a low priority for us.

That's a pity. I'll try and hack around a bit and in case I arrive at something useful I'll come back to you.

timholy commented 4 years ago

I've always imagined that multispectral images would be handled with custom "color" types.

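using ColorTypes  # provides the abstract type Color{T,N}

# One named field per spectral band (8 in total).
struct Hyper8{T} <: Color{T,8}
    uvc::T
    uvb::T
    uva::T
    violet::T
    ...
end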

although you could alternatively define it as

struct Hyper{T,N} <: Color{T,N}
    channels::NTuple{N,T}
end

It's essentially the same as the SVector approach but perhaps more united with the rest of JuliaImages.

@AlfTetzlaff, the reason that the "color axis as an axis of the array" isn't my favorite approach is that it makes it extraordinarily difficult to write generic code. JuliaImages is probably the best suite out there for uniting 2D, 3D, 2D+time, and 3D+time images: basically everything in JuliaImages works seamlessly on all of them. Color channels make that much more difficult, because they break a key abstraction: one array element is one pixel. JuliaImages started out not insisting on that abstraction, and things got really interesting and productive when we changed to the current structure.

AlfTetzlaff commented 4 years ago

Thanks for the reply! I will try to get going using a custom color type - maybe it turns out to be much easier than with plain arrays. I have to see how the conversions to RGB for plotting work and also which dimension ordering is fastest for Flux & CuArrays in the end; I certainly have something to learn here.

If you consider the headline question as answered, feel free to close :-)

timholy commented 4 years ago

We definitely should make more of an effort to unite the ML world with JuliaImages. It's one of my priorities for the next six months.

AlfTetzlaff commented 4 years ago

Good to hear that somebody (more professional than me) takes care of that! Initially I thought that AxisArrays + a convention on using :n, :c, :w/x, :h/y, :z, :t would do the job, but maybe you have even better ideas :)

timholy commented 4 years ago

That can definitely help (and indeed is key to our handling of, say, time), but the issue is that then algorithms have to be written specially for the case when there is a channel dimension. When one array element == one pixel, then it's handled automatically by dispatch on the element type (which can be numeric, grayscale, or color) and the algorithm itself only rarely needs to be customized.
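
To make the "one array element == one pixel" point concrete, here is a minimal sketch in plain Julia. `Bands`, `mirror`, and `brightness` are hypothetical names invented for this illustration and are not part of JuliaImages; the real packages would use `Color` subtypes instead.

```julia
# Hypothetical multispectral pixel: N bands of element type T.
struct Bands{N,T}
    channels::NTuple{N,T}
end

# A generic "algorithm": mirrors along the second axis and knows nothing about channels.
mirror(img::AbstractArray) = reverse(img; dims=2)

# Per-pixel behaviour is chosen by dispatch on the element type.
brightness(x::Real) = float(x)
brightness(p::Bands) = sum(p.channels) / length(p.channels)

gray   = rand(Float32, 4, 5)                                                  # 2D, numeric pixels
multi  = [Bands((Float32(i), Float32(j), 0f0, 1f0)) for i in 1:4, j in 1:5]   # 2D, 4 bands
volume = [Bands((Float32(i + j + k), 0f0)) for i in 1:2, j in 1:3, k in 1:4]  # 3D, 2 bands

# The same function handles every case without a channel-specific code path.
mirror(gray); mirror(multi); mirror(volume)

# Element-wise functions broadcast over pixels, whatever the pixel type is.
b = brightness.(multi)
```

With a separate channel axis, `mirror` would instead need to be told which axis holds the channels, which is the kind of special casing described above.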

If there are lots of factors you have to consider, it's easy to miss one or more of them. I don't know OpenCV well at all, but from those who do I've heard a mixture of admiration (for how much stuff it has in it) and frustration (for how many "holes" there are in coverage of many different image types for many different algorithms). We have far less stuff than OpenCV but far more consistent coverage.