BattModels / ElectrochemicalKinetics.jl

Electrochemical reaction rate modeling and nonequilibrium phase maps (via AD)
MIT License

Add `onlyfj!` API to `fit_overpotential` #38

Closed · DhairyaLGandhi closed 2 years ago

DhairyaLGandhi commented 2 years ago

This avoids a separate calculation of the primal in the case where we want both the primal and the gradients. Helpful for `IntegralModel`, where the calculation of the primal can be expensive.
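
For context, here is a minimal sketch of the NLsolve `only_fj!` pattern this adds, with a made-up toy residual in place of the real loss (the actual code lives in `src/fitting.jl`). The point is that when both F and J are requested, `Zygote.pullback` returns the primal and the backward pass together, so the primal is not computed twice:

using NLsolve, Zygote

# Toy residual standing in for loss(k, compute_k(V, model)); purely illustrative.
resid(V) = sum(V .^ 2 .- 1.0)

function fj!(F, J, V)
    if !isnothing(F) && isnothing(J)
        # only the residual is needed
        F .= resid(V)
    elseif isnothing(F) && !isnothing(J)
        # only the Jacobian is needed
        J .= Zygote.gradient(resid, V)[1]
    else
        # both are needed: pullback gives the primal and the gradient in one pass
        y, back = Zygote.pullback(resid, V)
        F .= y
        J .= back(one(y))[1]
    end
    return nothing
end

nlsolve(only_fj!(fj!), [0.1])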

DhairyaLGandhi commented 2 years ago

ref #30

rkurchin commented 2 years ago

(rebased this so it has the benchmarks)

rkurchin commented 2 years ago

Ran pretty exhaustive versions of just the `fit_overpotential` benchmarks on this branch like so (only on the relevant sections of the suite, with `time_mult=5`):

# on this branch...
using PkgJogger, BenchmarkTools
@jog ElectrochemicalKinetics
suite = JogElectrochemicalKinetics.suite()[@tagged "overpotential fitting" && "AD"]
tune!(suite)
results_fj = run(suite)

# then switch to master branch...
suite2 = JogElectrochemicalKinetics.suite()[@tagged "overpotential fitting" && "AD"]
loadparams!(suite2, params(suite)) # so we don't have to retune
results_current = run(suite2, verbose=true)

and then compared them:

julia> PkgJogger.judge(results_fj, results_current)
1-element BenchmarkTools.BenchmarkGroup:
  tags: []
  "bench_suite.jl" => 1-element BenchmarkTools.BenchmarkGroup:
          tags: []
          "overpotential fitting" => 2-element BenchmarkTools.BenchmarkGroup:
                  tags: []
                  "non-integral models" => 1-element BenchmarkTools.BenchmarkGroup:
                          tags: []
                          "AD" => 3-element BenchmarkTools.BenchmarkGroup:
                                  tags: []
                                  "ElectrochemicalKinetics.Marcus" => TrialJudgement(+176.12% => regression)
                                  "ElectrochemicalKinetics.AsymptoticMarcusHushChidsey" => TrialJudgement(+2.39% => invariant)
                                  "ElectrochemicalKinetics.ButlerVolmer" => TrialJudgement(+109.36% => regression)
                  "integral models" => 1-element BenchmarkTools.BenchmarkGroup:
                          tags: []
                          "AD" => 2-element BenchmarkTools.BenchmarkGroup:
                                  tags: []
                                  "MarcusHushChidseyDOS: Cu111" => TrialJudgement(+6.68% => regression)
                                  "MarcusHushChidseyDOS: flat dos" => TrialJudgement(+7.79% => regression)

...so I want to dig into this more before merging, e.g. actually looking at solve traces etc. Will hopefully have time to start playing with that tomorrow.

DhairyaLGandhi commented 2 years ago

Interesting. And this is just with the changes in the PR, which should technically not change the code paths at all. What about running `fit_overpotential` manually? Does it reflect the same regression?

DhairyaLGandhi commented 2 years ago

I think I know what's happening. It's NLsolve forcing us to use the vector code path by default, even for a real input.

On master we have https://github.com/BattModels/ElectrochemicalKinetics.jl/blob/9ade0a92769586ba5350584f11ccaf01c6eed799/src/fitting.jl#L25 to get around this, but this is going to be wrong for vector inputs. In the PR, it is generic.
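
(Purely as a toy illustration of that distinction, not the package code: with a length-1 `V`, differentiating with respect to the unwrapped scalar `V[1]` and with respect to the vector `V` give the same number, but only the vector form stays correct once `V` has more than one entry.)

using Zygote

f(V) = sum(V .^ 2)  # toy stand-in for the fitting loss

V = fill(0.1, 1)
Zygote.gradient(v -> f([v]), V[1])  # (0.2,)   -- scalar-style path (roughly what master's workaround does)
Zygote.gradient(f, V)               # ([0.2],) -- generic vector path (what the PR does)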

Accounting for that by applying the same patch to this PR gives:

# PR
julia> @btime fit_overpotential($bv, $(I_vals[1]), $(fill(0.1, 1)), verbose = false);
  40.149 μs (736 allocations: 104.19 KiB)

# master
julia> @btime fit_overpotential($bv, $(I_vals[1]), $(fill(0.1, 1)), verbose = false);
  109.627 μs (1302 allocations: 151.39 KiB)

DhairyaLGandhi commented 2 years ago

Here is the diff I am applying

diff --git a/src/fitting.jl b/src/fitting.jl
index 2c24d23..f843eaa 100644
--- a/src/fitting.jl
+++ b/src/fitting.jl
@@ -25,14 +25,14 @@ function fit_overpotential(model::KineticModel, k, guess = fill(0.1, length(k));
                 F .= loss(k, compute_k(V, model; kT = kT, kwargs...))
             elseif F == nothing && !isnothing(J)
                 # TODO: we could speed up the case of scalar k's by dispatching to call gradient here instead
-                gs = Zygote.gradient(V) do V
+                gs = Zygote.gradient(V[1]) do V
                     Zygote.forwarddiff(V) do V
                         loss(k, compute_k(V, model; kT = kT, kwargs...)) |> sum
                     end
                 end[1]
                 J .= gs
             else
-                y, back = Zygote.pullback(V) do V
+                y, back = Zygote.pullback(V[1]) do V
                     Zygote.forwarddiff(V) do V
                         loss(k, compute_k(V, model; kT = kT, kwargs...)) |> sum
                     end

DhairyaLGandhi commented 2 years ago

I am slightly worried that cached code can still pollute the results. Further, it seems to only do 1 eval for some reason. I noticed that PkgJogger.jl allows saving benchmarks, which means we can avoid the slightly opaque issue of cached functions. However, that would mean we can no longer make use of Revise effectively and have to pay the tuning price (which tbf should be done separately for every benchmarking suite anyway). Finally, these files can be passed to judge directly. Could we try that method once, please?
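
Something like the workflow below is what I have in mind. The saving call is from my memory of the PkgJogger docs, so treat the exact names and signatures as assumptions to check against the current version:

# In a fresh Julia session on each branch, so no cached code can leak between runs:
using PkgJogger
@jog ElectrochemicalKinetics
results = JogElectrochemicalKinetics.benchmark()  # tunes and runs the full suite
# assumed saving API; check the PkgJogger docs for the exact call
PkgJogger.save_benchmarks("results_fj.json.gz", results)

# Later, from any session, compare the saved files directly:
PkgJogger.judge("results_fj.json.gz", "results_master.json.gz")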

rkurchin commented 2 years ago

> it seems to only do 1 eval for some reason.

I was confused by that too, but that's actually the default: it's doing at least hundreds of samples (each with that single eval) with the timings I've set, so I think that should be fine.

> Could we try that method once, please?

For sure. For my own sanity, I first tried it the way I had been doing it, but with the diff you showed applied on your branch. Results there (I've edited the outputs for readability):

julia> PkgJogger.judge(results_fj, results_master)
             "ElectrochemicalKinetics.Marcus" => TrialJudgement(+170.24% => regression)
             "ElectrochemicalKinetics.AsymptoticMarcusHushChidsey" => TrialJudgement(+184.76% => regression)
             "ElectrochemicalKinetics.ButlerVolmer" => TrialJudgement(+163.78% => regression)
             "MarcusHushChidseyDOS: Cu111" => TrialJudgement(-4.42% => invariant)
             "MarcusHushChidseyDOS: flat dos" => TrialJudgement(-3.78% => invariant)

So that would seem to suggest there is some caching or something happening.

So then I ran with new REPL sessions and separate re-tuning for each branch:

julia> PkgJogger.judge("../results_fj.json.gz", "../results_master.json.gz")
             "ElectrochemicalKinetics.Marcus" => TrialJudgement(-0.13% => invariant)
             "ElectrochemicalKinetics.AsymptoticMarcusHushChidsey" => TrialJudgement(+0.21% => invariant)
             "ElectrochemicalKinetics.ButlerVolmer" => TrialJudgement(+0.29% => invariant)
             "MarcusHushChidseyDOS: Cu111" => TrialJudgement(-2.19% => invariant)
              "MarcusHushChidseyDOS: flat dos" => TrialJudgement(-1.77% => invariant)

Not really sure what the deal is here... is doing "restart REPL" in VS Code not enough to totally clear things? Like, here I almost wonder if it's running exactly the same thing over... I guess I could actually quit out of VS Code in between as well, I haven't tried that...

Anyway, for a third check, I (again in separate REPLs) explicitly ran the `@benchmark` command on these `fit_overpotential` expressions. To save space, just reporting median times here, on this PR vs. on master:

Also, to be explicit about this, all these benchmarks are for fitting a single scalar value. Probably, given what you note above, we'll eventually need separate dispatches for scalars and vectors, but that can be handled. I do want to add vector benchmarks for these eventually, obviously.
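
(For the record, the kind of thing I'm imagining is the hypothetical sketch below; the method names, default guess, and the assumption that the vector method returns a length-1 vector for scalar inputs are all illustrative, not settled.)

# Hypothetical sketch, not actual package code: keep the vector method generic and
# add a thin scalar method that wraps the input and unwraps the result.
fit_overpotential(model::KineticModel, k::Real, guess = 0.1; kwargs...) =
    only(fit_overpotential(model, [k], [guess]; kwargs...))
# (A real scalar path could instead call Zygote.gradient directly, per the TODO in fitting.jl.)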

My inclination is to trust the last check and say we may as well merge this, and leave the broader benchmarking reproducibility puzzle for another time. Your thoughts?

DhairyaLGandhi commented 2 years ago

I would trust the last ones too. There's at least a trend with the integral models, and the kinetic models are within my estimate of run-to-run variance for 2 of them, which is expected.

> could actually quit out of VS Code in between as well

I doubt that's necessary.

> leave the broader benchmarking reproducibility puzzle for another time

Agree