JuliaDiff / ForwardDiff.jl

Forward Mode Automatic Differentiation for Julia

Matrix-valued DualNumber: coding problems #5

Closed bsxfan closed 11 years ago

bsxfan commented 11 years ago

Hi All

I would love to contribute to the autodiff tools, but I have hit a hiccup that is making me so despondent that I am seriously considering giving up on Julia for a while until it has matured.

I have put a lot of effort into developing a dual-number type (similar to my existing MATLAB implementation) that can take matrix-valued fields. This allows one to get forward-mode AD to work smoothly through all kinds of matrix operations, which are executed efficiently in BLAS/LAPACK. This has worked really well for me in MATLAB.
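
The core idea, as a rough sketch (the MatDual name here is just illustrative; my actual type is called DualNum, with st/di fields):

# Matrix-valued dual number: `st` holds the value ("standard part"),
# `di` the directional derivative ("differential part").
struct MatDual{T<:AbstractMatrix}
    st::T
    di::T
end

# Product rule: d(X*Y) = dX*Y + X*dY. Each `*` on the right-hand side
# is an ordinary matrix product, so the heavy lifting stays in BLAS.
Base.:*(x::MatDual, y::MatDual) = MatDual(x.st * y.st, x.di * y.st + x.st * y.di)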

I have gone to the trouble of translating most of the functionality I have in MATLAB, but now I find I have a piece of code that I simply cannot debug.

If anybody is interested, my code is here: https://github.com/bsxfan/ad4julia/blob/master/DualNumbers.jl

There are certainly many bugs and missing features remaining in this code, but at the moment I am stuck on the cat() function, which overloads vertical and horizontal concatenation of dual-number matrices.

This function behaves really weirdly. It is supposed to return a DualNum object. It manages to construct that object and can display it via show() inside my cat() function, but when I return that value, it has disappeared!

julia> require("DualNumbers.jl")
julia> using(DualNumbers)
julia> a = dualnum(1)
standard part: 1.0
differential part: 0.0

julia> b = dualnum(2)
standard part: 2.0
differential part: 0.0

julia> c = [a b]
here in cat
ST:
1x2 Float64 Array:
1.0  2.0
DI:
1x2 Float64 Array:
0.0  0.0
D:
standard part: 1x2 Float64 Array:
1.0  2.0
differential part: 1x2 Float64 Array:
0.0  0.0    

Note that the stuff that gets displayed is my debug info from cat(). Nothing is returned:

julia> typeof(c)
Nothing

I don't know how to fix this.

papamarkou commented 11 years ago

Cheers bsxfan. Although I am not one of the Julia developers, my modest view is that Julia is already mature enough to be used for scientific computing. It still needs to advance in terms of graphics and ease of installation on various platforms, but I think it is reasonable to claim it is usable for technical computing today.

From my short experience reading about and coding forward-mode AD: if you want to use dual arithmetic, the efficient approach is to store as few values as possible. For univariate functions it suffices to define plain dual numbers; see for instance the Dual type in the DualNumbers package

https://github.com/scidom/DualNumbers.jl

which does what you were intending to do via your DualNum type. For functions defined on a Cartesian domain (several input variables), plain dual arithmetic is not efficient; instead you can store the value of the function once alongside all of its partial derivatives, which leads to dual-alike types such as the ones in src/gradual.jl or src/fad_hessian.jl. Check how the relevant types are defined there; a rough sketch follows.
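
The gist, with hypothetical names (not the actual types from those files):

# "Dual-alike" number for a function of N variables: the value is
# stored once, together with all N partial derivatives.
struct GraDualLike{T<:Real,N}
    v::T            # function value
    g::NTuple{N,T}  # partial derivatives with respect to the N inputs
end

# Product rule applied to the whole gradient at once.
Base.:*(a::GraDualLike{T,N}, b::GraDualLike{T,N}) where {T,N} =
    GraDualLike(a.v * b.v, ntuple(i -> a.g[i] * b.v + a.v * b.g[i], N))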

There is an alternative route if you want to make use of BLAS: complex step differentiation. Although complex step differentiation is conceptually very similar to forward-mode AD, the implementations differ. You take a step h of tiny magnitude along the imaginary axis and read the derivative off the imaginary part of the result, so all the work is done by ordinary complex arithmetic (and hence complex BLAS). If that is what you have in mind, it would be a nice contribution.
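
For concreteness, a minimal sketch of the trick (the standard formula, nothing specific to your code):

# Complex step differentiation: f'(x) ≈ imag(f(x + im*h)) / h.
# There is no subtraction of nearby values, so h can be made extremely
# small without the cancellation error of finite differences.
complex_step(f, x; h = 1e-20) = imag(f(x + im * h)) / h

complex_step(sin, 1.0)  # ≈ cos(1.0) ≈ 0.5403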

As a final note, I would suggest not investing too much time in forward-mode AD :) It is rather slow and mostly useful for testing and benchmarking :) For instance, I am currently focusing my effort on source code transformation, which is in general much faster and can hopefully compete against state-of-the-art AD tools.

tshort commented 11 years ago

That is one weird bug. I can't figure it out.

bsxfan commented 11 years ago

I have since found at least two similar issues that were recently reported and have since been fixed:

https://github.com/JuliaLang/julia/issues/2161
https://github.com/JuliaLang/julia/issues/2562

What those two have in common with mine is that construction of a parametrized type is involved.

tshort commented 11 years ago

I managed to get these definitions to work:

# Concatenate the standard and differential parts separately, then rewrap:
vcat(x::DualNum, y::DualNum) = dualnum([x.st, y.st], [x.di, y.di])
hcat(x::DualNum, y::DualNum) = dualnum([x.st  y.st], [x.di  y.di])
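
With those defined, the [a b] example from the issue goes through, e.g.:

a = dualnum(1)
b = dualnum(2)
c = [a b]     # now dispatches to the hcat method above
typeof(c)     # a DualNum, no longer Nothing
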
bsxfan commented 11 years ago

Thanks Tom!

Those work as long as you don't stress them too much, but if for example x::DualNum{Float64} and y::DualNum{Float32}, then we get the same problem again. The two components are concatenated successfully, via automatic type promotion inside the existing cat functions, but it seems that when the call stack gets too deep, something breaks.
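
One possible workaround might be to promote to a common element type up front; an untested sketch (assuming DualNum{T} is parametrized by the element type of its fields):

# Untested sketch: convert both arguments to a common element type R
# before concatenating, so the existing same-type path is taken.
function Base.hcat(x::DualNum{T}, y::DualNum{S}) where {T,S}
    R = promote_type(T, S)
    dualnum(hcat(R.(x.st), R.(y.st)), hcat(R.(x.di), R.(y.di)))
end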

papamarkou commented 11 years ago

bsxfan, I looked into your code more carefully and it seems it offers an alternative way of using BLAS, which may be more efficient than I imagined; sorry for my first, erroneous impression. I hope the bug will be sorted out so that we can check how it performs in Julia :)

bsxfan commented 11 years ago

@tshort thanks very much for your effort in reporting this bug! The speed at which some Julia bugs get fixed is impressive. For me, at GMT+2, it feels like being the shoemaker (http://en.wikipedia.org/wiki/The_Elves_and_the_Shoemaker) who wakes up in the morning to find his problem solved by the work of nocturnal elves :-)

So now I'll continue to work on my DualNum implementation and let you know in this forum when I think it has reached a usable state. One big chunk of work I still need to do is to add bsxfun capability.

@scidom

papamarkou commented 11 years ago

Thanks bsxfan, this is all very useful. I hope you will contribute even more; it is good to have a general forward-mode AD tool. I haven't tried to implement complex step differentiation, so I didn't know that it can lead to complications.

In the meantime, working with expressions to do source code transformation involves a steep learning curve; once I have made some progress I will share it here.

Thanks Tom for helping, really appreciated.