CarloLucibello opened 3 years ago
Would compatibility with CuArrays create a dependency on CUDA?
Typically no, `CuArray`s just follow the abstract array interface, although scalar indexing leads to horrible performance. To address such cases a CUDA kernel is needed (and with it the CUDA.jl dependency).
Having just completed the upsampling code: I would now rather use KernelAbstractions.jl for such work, since it saves you from writing almost the same code twice. Plus, I think it's a much lighter dependency than CUDA. In the end I imagine having super-portable (CPU, GPU (Nvidia & AMD!)) code with gradients, which could be recycled in JuliaMath, JuliaImages, and NNlib. It could even be used to write specialized methods for different dimension orderings and dispatch via NamedDims.
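To make the "write once, run anywhere" idea concrete, here is a minimal CPU-only sketch of nearest-neighbour upsampling whose loop body is the per-element work a KernelAbstractions `@kernel` would perform. The function name and integer-scale signature are illustrative only, not NNlib's actual API:

```julia
# Illustrative sketch: nearest-neighbour upsampling by an integer scale per
# dimension. The loop body is exactly what a per-element GPU kernel would do,
# so one definition could in principle serve CPU and GPU alike.
function upsample_nearest_sketch(x::AbstractArray{T,N}, scale::NTuple{N,Int}) where {T,N}
    y = similar(x, size(x) .* scale)
    for I in CartesianIndices(y)
        # each output index maps back to source index cld(i, s) in every dimension
        J = CartesianIndex(cld.(Tuple(I), scale))
        @inbounds y[I] = x[J]
    end
    return y
end
```

With KernelAbstractions, that body would sit inside a `@kernel` with `I = @index(Global, Cartesian)`, and the same kernel could be launched on a `CuArray`.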
Related to the gradient computation, it would be helpful to have `rrule` and `frule` functions defined (via the ChainRulesCore package) to be able to use Interpolations with AD, e.g. Zygote. A `gradient` function is already defined. Thus, if I understand it correctly, the only missing part that needs to be implemented is a gradient with respect to the fields of Interpolations objects, see here
Could you point to the gradient function?
I don't know too much about the code, but simple debugging tells me there are 6 `gradient` functions:
- These have special definitions
- These call the general gradient functions

I suspect the one in indexing.jl is the one that needs to be used here.
Now that #414 is finished, how do we rewrite the upsampling functions? Going through the NNlib code, it seems that now that we have ChainRules integration we can just rewrite stuff like `upsample_nearest` and its gradient. If it's just that, I will open a PR there.
Have you taken a look at http://juliamath.github.io/Interpolations.jl/latest/devdocs/ yet?
I have read that page and I think I get it, a bit. But I was wondering if we should rather write them as simple functions using interpolation objects. For example, `upsample_nearest` would be something like

```julia
itp = interpolate(x, (NoInterp(), NoInterp()))
```

We already have parent axes, which can be used to determine size.
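Assuming Interpolations.jl is available, the bilinear case written against an interpolation object could look roughly like this. The function name and integer-scale signature are made up for illustration; NNlib's `upsample_bilinear` works on 4-D arrays and handles corner alignment differently, so this only sketches the idea:

```julia
using Interpolations

# Sketch: 2-D bilinear upsampling expressed through an interpolation object
# instead of a handwritten kernel.
function upsample_bilinear_itp(x::AbstractMatrix, scale::Int)
    itp = interpolate(x, BSpline(Linear()))
    h, w = size(x)
    rows = range(1, h, length = h * scale)  # sample positions along dim 1
    cols = range(1, w, length = w * scale)  # sample positions along dim 2
    return [itp(i, j) for i in rows, j in cols]
end
```

Whether the comprehension here can match a fused kernel, and whether Zygote differentiates it efficiently, is exactly the comparison discussed below.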
Would it work similarly to the kernels we have now in NNlib? Do we get the correct gradients out etc?
Worth doing a comparison imo
I don't know what the kernel system is in NNlib (I'm not familiar with that codebase at all). But I was thinking I could replace the code for `upsample_nearest` and its gradient function with the Zygote call. They have tests for these functions, so rewriting should be easier.
Hi! @DhairyaLGandhi this serves as a reply to https://github.com/FluxML/NNlibCUDA.jl/pull/5#issuecomment-828444750, but I think it might be better to keep the discussion in one place :)
Yes, I would find it cool if it works and we can get rid of duplicate code (of which I really wrote a lot, sorry for that). But I don't know if the performance can match the handwritten kernels I ported from PyTorch (at least for now). For example, I benchmarked the CPU implementation of bilinear upsampling against `imresize!` in ImageTransformations.jl (which uses Interpolations.jl under the hood) and NNlib was 2.2x faster, even single-threaded and with sub-optimal memory layout. Furthermore, the ImageTransformations.jl code doesn't work on GPUs when scalar `getindex` is disallowed. So I don't really know what the best way forward is. Maybe enhance Interpolations.jl and then use it? But how should all the specialized GPU code be handled, like the setup for the kernel calls (setting thread & block size)? Create a CUDAinterpolations.jl so that Interpolations.jl doesn't have to depend on CUDA? Or simply proceed with collecting specialized code for neural networks / DL in NNlib and just pick the stuff which really makes things faster and simpler from other packages? Note that at least for upsampling / interpolations, things can be simplified a lot by smarter dispatch, which I can maybe hand in two PRs down the line or so.
@rick2047

> I don't know what the kernel system is in NNlib
I don't know either :smile: I just ported PyTorch code brute force and it turned out to be quite fast, but also a bit unwieldy, which can probably be improved by properly leveraging dispatch. The nearest-neighbour part was written by @mcabbott.
Lastly, maybe someone else can also benchmark the interpolations to check my results?
Pretty confused as to the current state of this. I'm trying to run the same example from #414 but get the following error:
```julia
using Interpolations
using Zygote

y = sin.(1:10)
itp = interpolate(y, BSpline(Cubic(Reflect(OnCell()))))
Zygote.gradient(itp, 2)
```
```
ERROR: ArgumentError: unable to check bounds for indices of type Interpolations.WeightedAdjIndex{4, Ratios.SimpleRatio{Int64}}
```
Did something change in this package or Zygote?
Ah I see, that merge hasn't made it to a release yet. Disregard me.
Sorry to revive this, but the above example does not work when I run it despite the previous merge. Further, when I run the package tests in a fresh Julia installation

```julia
(@v1.7) pkg> add Interpolations
(@v1.7) pkg> test Interpolations
```

on the latest version (0.13.4) I find the error:
```
ChainRulesCore: Error During Test at /Users/jsull/.julia/packages/Interpolations/3gTQB/test/chainrules.jl:12
  Test threw exception
  Expression: (Zygote.gradient(itp, 1))[1] == Interpolations.gradient(itp, 1)
  MethodError: no method matching (::ChainRulesCore.ProjectTo{Float64, NamedTuple{(), Tuple{}}})(::SVector{1, Float64})
  Closest candidates are:
    (::ChainRulesCore.ProjectTo{T})(::ChainRulesCore.AbstractZero) where T at ~/.julia/packages/ChainRulesCore/sHMAp/src/projection.jl:120
    (::ChainRulesCore.ProjectTo{<:Number})(::ChainRulesCore.Tangent{<:Number}) at ~/.julia/packages/ChainRulesCore/sHMAp/src/projection.jl:192
    (::ChainRulesCore.ProjectTo{T})(::ChainRulesCore.Tangent{<:T}) where T at ~/.julia/packages/ChainRulesCore/sHMAp/src/projection.jl:142
    ...
```
for the test related to the previously discussed Zygote gradients issue. It seems like the problem is ChainRulesCore?
(I also find a many-times repeated error when running the tests in test/gradient.jl:216, but the place for discussing this is probably another issue.)
Have you tried this on 1.6? There are known inference issues on 1.7
The example from above reproduces this error.
```julia
julia> using Interpolations, Zygote

julia> y = sin.(1:10);

julia> itp = interpolate(y, BSpline(Cubic(Reflect(OnCell()))));

julia> z, back = Zygote.pullback(itp, 2.0);

julia> z
0.769963450028415

julia> back(1.0)
([-0.35017548837401463],)

julia> Zygote.gradient(itp, 2.0);
ERROR: MethodError: no method matching (::ChainRulesCore.ProjectTo{Float64, NamedTuple{(), Tuple{}}})(::StaticArrays.SVector{1, Float64})

(jl_hY83L3) pkg> st
Status `/private/var/folders/yq/4p2zwd614y59gszh7y9ypyhh0000gn/T/jl_hY83L3/Project.toml`
  [a98d9a8b] Interpolations v0.13.4
  [e88e6eb3] Zygote v0.6.32
```
The error is coming from ChainRulesCore, but only because it's performing a sanity check on the gradient returned by the rule. Above, `back(1.0)` avoids this check, and it produces a 1-element vector as the gradient of a number, which doesn't make sense. Where the bug is, or whether this used to work, I don't know.
Oh, the problem is just that there has not been a release since #465. On master this works.
Just switching to master worked for me - thanks for pointing that out and for the explanation!
I just released a new version 0.13.5
I'd love to have the gradient with respect to the original array as well, and at that point I believe we could close this. Is there some interest in putting together a PR to that end?
@DhairyaLGandhi, me too, but implementing that seems quite a bit harder, since it has to be done separately for each type of interpolation. One simple case that might be a good starting point would be to support linear gridded interpolation.
@DhairyaLGandhi @stevengj I would also love to have the gradient with respect to the data being interpolated. Especially for cubic splines, because I need smoothness... The problem is that I have no clue where to start implementing it. Does the interpolation involve solving equations, or are there some constant weights pre-computed? Also there is very little online about how cubic spline interpolation works in 2D or 3D. Any pointer to resources would be helpful. Is there any other package that could do this?
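To partly answer the "solving equations vs. pre-computed weights" question: for cubic B-splines, Interpolations.jl first prefilters the data (a tridiagonal solve, done once per dataset, so that the resulting spline passes through the samples), and each evaluation then combines only 4 coefficients per dimension with local weights; in 2-D/3-D the weights are simply tensor products of the 1-D ones. Here is a transcription of the standard uniform cubic B-spline weights (textbook formulas, not Interpolations.jl internals):

```julia
# Uniform cubic B-spline basis weights for a fractional offset t in [0, 1).
# An evaluation at grid position i + t combines the 4 (prefiltered)
# coefficients c[i-1], c[i], c[i+1], c[i+2] with these weights.
function cubic_bspline_weights(t)
    return ((1 - t)^3 / 6,
            (3t^3 - 6t^2 + 4) / 6,
            (-3t^3 + 3t^2 + 3t + 1) / 6,
            t^3 / 6)
end
```

The weights always sum to 1, and at `t = 0` they reduce to `(1/6, 2/3, 1/6, 0)`, which is why the data must be prefiltered for the curve to interpolate the samples exactly.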
@DhairyaLGandhi @stevengj : Gradient with respect to the image is after all not that complicated. The interpolation linearly depends on the image, so this gradient is in fact independent of the image.
Internally, the gradient is calculated by the `weightedindexes` function and then multiplied with the image data. This has a sparse format, because the interpolation result depends only locally on the image. This is the same internal API throughout the various interpolations. The only task is to convert the tuple of `WeightedAdjIndex` to a digestible format, which I can do.

I would be happy if `weightedindexes` were exported as an API; then I could use it as the gradient. Could someone make that happen?
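Since the evaluation is linear in the data, the gradient with respect to the data is exactly the (sparse) weight pattern. For 1-D linear interpolation the whole idea fits in a few lines (a hand-rolled illustration of the principle, not the `weightedindexes` internals):

```julia
# Gradient of 1-D linear interpolation at point x with respect to the n data
# values: a sparse pattern holding just the two interpolation weights,
# independent of the data itself.
function linear_itp_grad_wrt_data(n::Int, x::Real)
    g = zeros(n)
    i = clamp(floor(Int, x), 1, n - 1)
    t = x - i
    g[i]     = 1 - t   # weight of the left neighbour
    g[i + 1] = t       # weight of the right neighbour
    return g
end
```

For cubic splines the same linearity holds, but the weights act on the prefiltered coefficients, so the chain rule must additionally pass through the (linear) prefiltering step.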
Hi, this is to enquire about the possibility of using Interpolations.jl to build image upsampling or downsampling layers in Flux.jl. We recently added a bilinear upsampling function (https://github.com/FluxML/NNlib.jl/pull/262), but I was wondering if we could instead leverage some of the code here. The requisites would be
I'm not familiar with the codebase here, so maybe this is a long shot, but it's worth making an attempt and creating awareness about the possibility of this kind of interaction. Moreover, GPU- and automatic-differentiation-friendly interpolations would generally benefit the ML ecosystem.
Best, CL
cc @johnnychen94