gdalle opened 7 months ago
What performance problems are you seeing in the FastDifferentiation extension? Will definitely help make this better once I know what the problems are.
Could you add some benchmarks for computing Jacobians as well? That's where FastDifferentiation shines, especially for sparse Jacobians. Even for dense Jacobians FastDifferentiation seems to scale asymptotically better than other algorithms.
It's not really problems I'm seeing; it's problems I have introduced by coding the easiest version first.
For instance, I compile executables that don't include the primal value, so the extension is rather fast for `pushforward` but slower for `value_and_pushforward` due to the additional call to `f`. Ideally I should have one executable without the primal, and one where the primal is concatenated with the tangent. I also need to make all executables in-place where I can, and handle the case of functions that mutate their input.
All of this I mostly understand how to do; it's just a question of how much code I can manage to write without egregious duplication.
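A minimal sketch of the two-executable idea, assuming FastDifferentiation's `make_variables` / `jacobian` / `make_function` API; the concatenation layout is illustrative, not the actual extension code:

```julia
using FastDifferentiation

# Trace a toy function symbolically: f(x) = sum(x .^ 2)
x = make_variables(:x, 3)
y = [sum(x .^ 2)]
dy = jacobian(y, x)

# One executable for the derivative alone (enough for `pushforward`)...
deriv_exe = make_function(dy, x)

# ...and one where the primal is concatenated with the derivative,
# so `value_and_pushforward` needs a single call instead of two.
both_exe = make_function(vcat(y, vec(dy)), x)

v = rand(3)
deriv_exe(v)  # the 1×3 Jacobian
both_exe(v)   # primal followed by the Jacobian entries
```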
I'll try a sparse Jacobian benchmark tonight to show you what it looks like. Any function you fancy? I really don't know what to benchmark on.
There are a few benchmarks here that I have used in the past: https://github.com/brianguenter/Benchmarks
Typically, this is what you would do if you wanted to use DI to benchmark (see the docs for more details):
```julia
using DifferentiationInterface
using DifferentiationInterfaceTest
using FastDifferentiation: FastDifferentiation
using ForwardDiff: ForwardDiff
using ReverseDiff: ReverseDiff
using DataFrames

function rosenbrock(x)
    a = one(eltype(x))
    b = 100 * a
    result = zero(eltype(x))
    for i in 1:(length(x) - 1)
        result += (a - x[i])^2 + b * (x[i + 1] - x[i]^2)^2
    end
    return result
end

backends = [
    AutoFastDifferentiation(),
    AutoSparseFastDifferentiation(),
    AutoForwardDiff(),
    AutoReverseDiff(; compile=true),
]

scenarios = [
    GradientScenario(rosenbrock; x=rand(10)),
    GradientScenario(rosenbrock; x=rand(100)),
    HessianScenario(rosenbrock; x=rand(10)),
    HessianScenario(rosenbrock; x=rand(100)),
]

data = benchmark_differentiation(backends, scenarios; logging=true)
df = DataFrame(data)
```
The object `df` contains all the info you could need.
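For the sparse Jacobian case, a scenario along the same lines might look like this; `JacobianScenario` is assumed to mirror `GradientScenario`, and the banded `diff`-based function is just one possible sparse test case:

```julia
# Hypothetical sparse Jacobian benchmark in the same style.
f_banded(x) = diff(x .^ 2)  # Jacobian is banded, hence very sparse

jac_scenarios = [
    JacobianScenario(f_banded; x=rand(100)),
    JacobianScenario(f_banded; x=rand(1000)),
]

jac_data = benchmark_differentiation(
    [AutoSparseFastDifferentiation(), AutoForwardDiff()], jac_scenarios; logging=true
)
```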
Note that Chairmarks.jl uses a fairly short default time budget, which is why we see that most of the time is spent compiling for FastDifferentiation.jl (only one or two samples fit inside the allotted time, at least on my laptop). I should probably increase the `seconds` parameter, but at least you get the gist.
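For reference, the Chairmarks.jl budget can be raised per benchmark like this (a standalone example, not the DI internals):

```julia
using Chairmarks

# Chairmarks defaults to a ~0.1 s sampling budget; raising `seconds`
# lets slow-to-compile backends collect more than one or two samples.
b = @be rand(100) sum seconds=1
```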
I think the awful performance on `gradient` is due to https://github.com/gdalle/DifferentiationInterface.jl/issues/131 anyway, so I'll be grateful for your help in debugging that.
Hey @brianguenter, just to let you know that my WIP package also includes some nice utilities for benchmarking AD backends against each other, as demonstrated in the tutorial:
https://gdalle.github.io/DifferentiationInterface.jl/dev/tutorial/#Benchmarking
My FastDifferentiation extension does not yet have optimal performance (you're welcome to help!), and neither do the extensions for other AD backends, but I thought it might be useful for you to sell your package :)