Closed by Non-Contradiction 6 years ago
Note that the ticks at
ForwardDiff.derivative!
ForwardDiff.gradient!
ForwardDiff.jacobian!
ForwardDiff.hessian!
ForwardDiff.HessianConfig(f, result::DiffResult, x::AbstractArray, chunk::Chunk = Chunk(x))
are because of the "mutating" problem: these methods modify their result argument in place, which does not fit R's copy-on-modify semantics. I am still thinking about the problem, and for now wrappers for these will not be implemented.
Finished "all" the basic implementations of wrappers for the ForwardDiff API. Next, they need to be exported, documented, and tested.
Finished the exporting and documentation of the wrapper functions, and added some basic tests for them. The tests at https://github.com/JuliaDiff/ForwardDiff.jl/tree/master/test and https://github.com/JuliaDiff/DiffTests.jl/blob/master/src/DiffTests.jl still need to be adapted to close this issue.
Hi there,
Please forgive me if my question is a stupid one. My understanding is that forward-mode differentiation requires evaluating functions on dual numbers, and DiffRules provides the necessary derivative rules for Julia. How is this supposed to work with R functions?
Currently in JuliaCall, Julia types like Dual numbers are just wrapped in a simple wrapper, JuliaObject, and in R most calculations on the JuliaObject wrapper are dispatched back to Julia. So it is just like dealing with the Julia function, but the overhead is definitely heavier than in the pure Julia case.
So it won’t work with R functions exported to Julia? (I haven’t tried the package to see if it works)
I think it won't work with R functions exported to Julia unless JuliaCall is loaded or methods of the whole function are defined on Dual numbers. I will try it.
It doesn't work merely with JuliaCall loaded, but loading JuliaCall actually helps.
One solution that I can think of is to inspect the body of an R function via body(f) and translate it to a Julia function.
Yes, this should work, and it should also have a performance advantage over the current method, but it requires a lot of work, considering that the function could call other functions... It feels like doing a kind of "compiler" work, "compiling" R to Julia...
For example, there are techniques that allow "compiling" a subset of R code (which needs to follow certain rules) to C++ functions, which then have better performance than plain R and can also be used with other C++ packages, like C++ libraries for automatic differentiation.
I believe we will need to do similar operations (at least for simple, pure R functions). But I agree that it's not simple. Some code in that direction can be found here. Basically, we need to translate R expressions to Julia expressions.
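At its core, such a translator walks a parse tree and emits the target language's syntax. The sketch below illustrates the idea language-agnostically in Python; the nested-tuple tree format and the to_julia helper are made up for illustration, and a real translator would start from R's actual parse tree (via body(f)) and map R operators and functions to their Julia counterparts:

```python
# Minimal sketch of expression translation: render a tiny R-like
# arithmetic expression tree as Julia source text. The tree format
# here is hypothetical; a real implementation would walk body(f).

def to_julia(expr):
    """Recursively render a nested-tuple expression as Julia code."""
    if isinstance(expr, tuple):
        op, lhs, rhs = expr
        return f"({to_julia(lhs)} {op} {to_julia(rhs)})"
    return str(expr)  # a variable name or a literal

# R's function(x) x^2 + 1, already parsed into a toy tree:
tree = ("+", ("^", "x", 2), 1)
print(to_julia(tree))  # prints ((x ^ 2) + 1)
```

The hard part, as noted above, is not this recursion but handling calls to other functions, R-specific semantics, and the parts of R that have no direct Julia equivalent.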
I think one reason that even loading JuliaCall can't make ForwardDiff work for R functions exported to Julia is the asymmetry between R functions exported to Julia and Julia functions exported to R.
If you call Julia functions exported to R with R objects, you still get R objects.
JuliaCall::julia_eval("f(x) = x^2")(2)
# 4
But if you call R functions exported to Julia with Julia objects, you get wrapped R objects.
julia> using RCall
julia> R"function(x)x^2"(2.0)
## RCall.RObject{RCall.RealSxp}
## [1] 4
By the way, it seems buggy to use RCall and load JuliaCall in embedded R; maybe there are some problems with standard output... The last time I checked this was rather long ago, and it worked okay then... It seems that in the setup of JuliaCall I need to skip some steps related to display (maybe?) if Julia is already running.
Actually, the behavior depends on whether it is a Julia function or an RObject.
julia> R"function(x)x^2"(2)
RCall.RObject{RCall.RealSxp}
[1] 4
julia> rcopy(R"function(x)x^2")(2)
4.0
I actually like the idea that julia_eval should always return a JuliaObject and not convert the return value implicitly, unless we do something similar to reticulate::import, which converts Julia functions to R functions.
This makes sense.
Then with the help of JuliaCall, ForwardDiff actually can deal with R functions exported to Julia, although it is a little weird...
julia> using RCall
julia> Rfunc = rcopy(R"function(x)x^2")
(::#33) (generic function with 1 method)
julia> R"library(JuliaCall)"
RCall.RObject{RCall.StrSxp}
[1] "JuliaCall" "stats" "graphics" "grDevices" "utils" "datasets"
[7] "methods" "base"
julia> R"julia_setup()"
WARNING: RCall.jl: Julia version 0.6.2 at location /Applications/Julia-0.6.app/Contents/Resources/julia/bin will be used.
Loading setup script for JuliaCall...
WARNING: RCall.jl: Finish loading setup script for JuliaCall.
julia> using ForwardDiff
julia> ForwardDiff.derivative(Rfunc, 2.0)
RCall.RObject{RCall.EnvSxp}
<environment: 0x7fdecda00ab8>
julia> rcopy(ForwardDiff.derivative(Rfunc, 2.0))
4.0
Somehow the returned result is a JuliaObject wrapper, so rcopy is needed.
Oh, I originally thought that ForwardDiff only works for functions defined in Julia because of its use of DiffRules. It seems that it has some magical way to make Rfunc understand Dual.
This is the aforementioned JuliaObject trick: the Dual is wrapped in a JuliaObject, and when R wants to carry out operations like +, -, *, exp, ... on the JuliaObject, it dispatches the operator to the Julia side, and then DiffRules works as it does with Julia functions.
It's a little like "interpretation" vs. "compilation" compared to your idea above of transforming R functions into Julia ones: instead of transforming the expression all at once, it dispatches to Julia operators as necessary.
I think it is somewhat similar to Dual's approach to automatic differentiation: it dispatches the methods it needs for the corresponding basic operators, vs. the "source code transformation" method, which takes the whole expression and outputs the expression for the derivative.
Although in ForwardDiff the dispatch on Dual can be efficient because of Julia's amazing magic, the dispatch on JuliaObject in R is not that efficient, and it carries non-negligible overhead.
Thanks for the explanation; I had overlooked your earlier comment. I like your idea of wrapping JuliaObjects and dispatching operations on them. Perhaps we should merge some of the related code into RCall.
Did you ever benchmark the difference between the wrapper and the pure Julia implementation? I would like to know the amount of overhead.
I just noticed that the returned value is an environment rather than a numeric. Is it actually the JuliaObject? It seems the automatic conversion is bugged somewhere. The returned value should be rcopy'ed.
Merging some of the JuliaObject-related code into RCall would be cool. The most relevant pieces are JuliaObject.R and JuliaObject.jl, plus some related dispatching functions scattered around...
I haven't done any serious benchmarking. I will try to measure the overhead using JuliaCall.
Yes, the environment is in fact the JuliaObject; it stores the index of the actual object in a dictionary. The current implementation is not that sophisticated and in fact lacks some important features.
And I just noticed something interesting: the problem only occurs the first time... Maybe some world-age issue?
Update: the problem in fact doesn't exist at all. The returned object is always the number, but somehow the JuliaObject, which is an environment, gets displayed.
julia> using RCall
julia> Rfunc = rcopy(R"function(x)x^2")
(::#33) (generic function with 1 method)
julia> R"library(JuliaCall)"
RCall.RObject{RCall.StrSxp}
[1] "JuliaCall" "stats" "graphics" "grDevices" "utils" "datasets"
[7] "methods" "base"
julia> R"julia_setup()"
WARNING: RCall.jl: Julia version 0.6.2 at location /Applications/Julia-0.6.app/Contents/Resources/julia/bin will be used.
Loading setup script for JuliaCall...
WARNING: RCall.jl: Finish loading setup script for JuliaCall.
julia> using ForwardDiff
julia> ForwardDiff.derivative(Rfunc, 2.0)
RCall.RObject{RCall.EnvSxp}
<environment: 0x7fe6560b70a8>
julia> ForwardDiff.derivative(Rfunc, 2.0)
4.0
It seems that JuliaCall has captured the output.
julia> using RCall
julia> R"library(JuliaCall)";
julia> R"julia_setup()";
WARNING: RCall.jl: Julia version 0.6.2 at location /Applications/Julia-0.6.app/Contents/Resources/julia/bin will be used.
Julia initiation...
Finish Julia initiation.
Loading setup script for JuliaCall...
WARNING: RCall.jl: Finish loading setup script for JuliaCall.
julia> a = 1
julia> a # print nothing
julia>
Yes, JuliaCall brings its own display system to capture Julia's display, but it seems to be problematic when used from Julia together with RCall. I opened a new issue at https://github.com/Non-Contradiction/JuliaCall/issues/54 to track the problem. I think it boils down to checking whether Julia is embedded or not. @randy3k Do you have any idea?
From http://www.juliadiff.org/ForwardDiff.jl/stable/user/api.html
Note that since native R functions are not supposed to modify the contents of their inputs, methods in the ForwardDiff.jl API with f! (a common notation in Julia indicating that the function modifies the contents of its input) are omitted here. For similar reasons, methods like derivative! will also not be wrapped. (??)

Derivatives of f(x::Real)::Union{Real,AbstractArray}
[x]
ForwardDiff.derivative(f, x::Real)
Return df/dx evaluated at x, assuming f is called as f(x). This method assumes that isa(f(x), Union{Real,AbstractArray}).

[x]
ForwardDiff.derivative!(result::Union{AbstractArray,DiffResult}, f, x::Real)
Compute df/dx evaluated at x and store the result(s) in result, assuming f is called as f(x). This method assumes that isa(f(x), Union{Real,AbstractArray}).

Gradients of f(x::AbstractArray)::Real
[x]
ForwardDiff.gradient(f, x::AbstractArray, cfg::GradientConfig = GradientConfig(f, x), check=Val{true}())
Return ∇f evaluated at x, assuming f is called as f(x). This method assumes that isa(f(x), Real). Set check to Val{false}() to disable tag checking. This can lead to perturbation confusion, so should be used with care.

[x]
ForwardDiff.gradient!(result::Union{AbstractArray,DiffResult}, f, x::AbstractArray, cfg::GradientConfig = GradientConfig(f, x), check=Val{true}())
Compute ∇f evaluated at x and store the result(s) in result, assuming f is called as f(x). This method assumes that isa(f(x), Real).

Jacobians of f(x::AbstractArray)::AbstractArray
[x]
ForwardDiff.jacobian(f, x::AbstractArray, cfg::JacobianConfig = JacobianConfig(f, x), check=Val{true}())
Return J(f) evaluated at x, assuming f is called as f(x). This method assumes that isa(f(x), AbstractArray). Set check to Val{false}() to disable tag checking. This can lead to perturbation confusion, so should be used with care.

[x]
ForwardDiff.jacobian!(result::Union{AbstractArray,DiffResult}, f, x::AbstractArray, cfg::JacobianConfig = JacobianConfig(f, x), check=Val{true}())
Compute J(f) evaluated at x and store the result(s) in result, assuming f is called as f(x). This method assumes that isa(f(x), AbstractArray). Set check to Val{false}() to disable tag checking. This can lead to perturbation confusion, so should be used with care.

Hessians of f(x::AbstractArray)::Real
[x]
ForwardDiff.hessian(f, x::AbstractArray, cfg::HessianConfig = HessianConfig(f, x), check=Val{true}())
Return H(f) (i.e. J(∇(f))) evaluated at x, assuming f is called as f(x). This method assumes that isa(f(x), Real). Set check to Val{false}() to disable tag checking. This can lead to perturbation confusion, so should be used with care.

[x]
ForwardDiff.hessian!(result::AbstractArray, f, x::AbstractArray, cfg::HessianConfig = HessianConfig(f, x), check=Val{true}())
Compute H(f) (i.e. J(∇(f))) evaluated at x and store the result(s) in result, assuming f is called as f(x). This method assumes that isa(f(x), Real). Set check to Val{false}() to disable tag checking. This can lead to perturbation confusion, so should be used with care.

[x]
ForwardDiff.hessian!(result::DiffResult, f, x::AbstractArray, cfg::HessianConfig = HessianConfig(f, result, x), check=Val{true}())
Exactly like ForwardDiff.hessian!(result::AbstractArray, f, x::AbstractArray, cfg::HessianConfig), but because isa(result, DiffResult), cfg is constructed as HessianConfig(f, result, x) instead of HessianConfig(f, x). Set check to Val{false}() to disable tag checking. This can lead to perturbation confusion, so should be used with care.

Preallocating/Configuring Work Buffers
For the sake of convenience and performance, all "extra" information used by ForwardDiff's API methods is bundled up in the ForwardDiff.AbstractConfig family of types. These types allow the user to easily feed several different parameters to ForwardDiff's API methods, such as chunk size, work buffers, and perturbation seed configurations.

ForwardDiff's basic API methods will allocate these types automatically by default, but you can drastically reduce memory usage if you preallocate them yourself.
Note that for all constructors below, the chunk size N may be explicitly provided, or omitted, in which case ForwardDiff will automatically select a chunk size for you. However, it is highly recommended to specify the chunk size manually when possible (see Configuring Chunk Size).

Note also that configurations constructed for a specific function f cannot be reused to differentiate other functions (though they can be reused to differentiate f at different values). To construct a configuration which can be reused to differentiate any function, you can pass nothing as the function argument. While this is more flexible, it decreases ForwardDiff's ability to catch and prevent perturbation confusion.

[x]
ForwardDiff.GradientConfig(f, x::AbstractArray, chunk::Chunk = Chunk(x))
Return a GradientConfig instance based on the type of f and the type/shape of the input vector x. The returned GradientConfig instance contains all the work buffers required by ForwardDiff.gradient and ForwardDiff.gradient!. If f is nothing instead of the actual target function, then the returned instance can be used with any target function. However, this will reduce ForwardDiff's ability to catch and prevent perturbation confusion (see https://github.com/JuliaDiff/ForwardDiff.jl/issues/83). This constructor does not store/modify x.

[x]
ForwardDiff.JacobianConfig(f, x::AbstractArray, chunk::Chunk = Chunk(x))
Return a JacobianConfig instance based on the type of f and the type/shape of the input vector x. The returned JacobianConfig instance contains all the work buffers required by ForwardDiff.jacobian and ForwardDiff.jacobian! when the target function takes the form f(x). If f is nothing instead of the actual target function, then the returned instance can be used with any target function. However, this will reduce ForwardDiff's ability to catch and prevent perturbation confusion (see https://github.com/JuliaDiff/ForwardDiff.jl/issues/83). This constructor does not store/modify x.

[x]
ForwardDiff.HessianConfig(f, x::AbstractArray, chunk::Chunk = Chunk(x))
Return a HessianConfig instance based on the type of f and the type/shape of the input vector x. The returned HessianConfig instance contains all the work buffers required by ForwardDiff.hessian and ForwardDiff.hessian!. For the latter, the buffers are configured for the case where the result argument is an AbstractArray. If it is a DiffResult, the HessianConfig should instead be constructed via ForwardDiff.HessianConfig(f, result, x, chunk). If f is nothing instead of the actual target function, then the returned instance can be used with any target function. However, this will reduce ForwardDiff's ability to catch and prevent perturbation confusion (see https://github.com/JuliaDiff/ForwardDiff.jl/issues/83). This constructor does not store/modify x.

[x]
ForwardDiff.HessianConfig(f, result::DiffResult, x::AbstractArray, chunk::Chunk = Chunk(x))
Return a HessianConfig instance based on the type of f, the types/storage in result, and the type/shape of the input vector x. The returned HessianConfig instance contains all the work buffers required by ForwardDiff.hessian! for the case where the result argument is a DiffResult. If f is nothing instead of the actual target function, then the returned instance can be used with any target function. However, this will reduce ForwardDiff's ability to catch and prevent perturbation confusion (see https://github.com/JuliaDiff/ForwardDiff.jl/issues/83). This constructor does not store/modify x.

Update: since we have finished all the basic implementations, the next step will be: