Multiple properties via SVectors

cortner commented 2 years ago

This is the alternative approach to moving everything to matrices. So far I actually quite like it. It does make things less transparent, but not as badly as I had feared, and definitely gives more flexibility.

There are two problems left to solve:

[ ] The val operation doesn't have correct adjoints. Somehow we need to figure out how to produce adjoints that enforce that gradients have the right "units". E.g. a gradient of an invariants w.r.t. params should be invariant. a gradient of a scalar w.r.t. params should be scalar again.
[ ] For some reason after my "rewrites" and unifications, the second-order gradient for a multi-property model now returns a Vector{Vector} instead of Vector{SVector}. I vaguely remember there is a Flux issue about this. Can we solve it? Ignore it? I'm worried it could cause a severe performance degradation. To be discussed.

CC @andresrossb

cortner commented 2 years ago

I'm now guessing this is exactly a place where matrices would have come in handy. No need to worry about how Zygote handles a Vector{SVector}

andresrossb commented 2 years ago

For the val operation I am thinking of two options:

using something like a typeof function to enforce the type we return inside an rrule.
making a function inside an rrule like we did in ACEflux adjoint() and create multiple instances of it that dispatch according to the dp they recieve.

For the vector{svector} issue is there a way I can reproduce the problem. Is it still persistent in the current commit, and if so is it a problem with adjoint_eval_D? or do you mean the second derivative according to parameters?

andresrossb commented 2 years ago

I'm now guessing this is exactly a place where matrices would have come in handy. No need to worry about how Zygote handles a Vector{SVector}

Here you mean the Vector{SVector} issue mentioned above?

cortner commented 2 years ago

A little update - this turned out to be easier and clearer than I expected. All tests are again passing and here are the performance results: I would say that's within a factor 2 of what we can realistically achieve within our group.

gits/ACE.jl/profile ∙ j17 --project=.. profile_linearmodel.jl                                                                                            1 ↵ ᚴ 𝘮 11 ⇄ 340 multisvec 1M
length(basis) = 2566
[ Info: Time evaluate incl allocation
  58.000 μs (0 allocations: 0 bytes)
[ Info: Time grad_config with and without allocation
[ Info: This looks like a factor 4.5 of evaluate, so probably more we can do
  257.334 μs (6 allocations: 266.61 KiB)
  257.041 μs (5 allocations: 266.31 KiB)
[ Info: grad_config vs rrule_evaluate
g1 ≈ g2 = true
  260.916 μs (6 allocations: 266.61 KiB)
  257.458 μs (6 allocations: 266.61 KiB)
[ Info: Time grad_params with and without allocation
[ Info: a little surprising we dont get closer to factor 1?
  140.291 μs (2 allocations: 20.11 KiB)
  138.916 μs (0 allocations: 0 bytes)
[ Info: adjoint_EVAL_D - single property
[ Info: This may be the single-most important function, and it looks fantastic!
  302.959 μs (8 allocations: 459.83 KiB)
[ Info: Multi-property
[ Info:  - evaluate
  59.792 μs (0 allocations: 0 bytes)
[ Info:  - grad_params
  193.667 μs (2568 allocations: 280.70 KiB)
[ Info:  - grad_config
  338.125 μs (6 allocations: 275.31 KiB)
[ Info:  - _rrule_evaluate
  275.708 μs (5 allocations: 266.59 KiB)
[ Info:  - adjoint_EVAL_D
  363.791 μs (9 allocations: 661.69 KiB)

cortner commented 2 years ago

I'm still not 100% committed to this path and would still consider going the matrix path. But for now my suggestion is to merge this, update ACEflux. and try your tests again to see whether you can now run the regression more efficiently.

cortner commented 2 years ago

@andresrossb If you are happy with this approach, then I'll merge the current branch into you PR. We then update the benchmark tests, confirm that they run and then can merge that too and tag a new version.

andresrossb commented 2 years ago

I'm ok with this path as well, it looks like it's performing way better than before. I don't expect to run into any problems updating ACEflux and trying out the models again.

Let's merge it!

cortner commented 2 years ago

thank you, thank I'll merge and we go from there.

cortner commented 2 years ago

Regarding your comments above:

I'm now guessing this is exactly a place where matrices would have come in handy. No need to worry about how Zygote handles a Vector{SVector}

Here you mean the Vector{SVector} issue mentioned above?

yes. I was able to resolve it without too much difficulty though. It was a bug at my end, but highlights again how matrices is conceptually much simpler.

cortner commented 2 years ago

For the val operation I am thinking of two options:

using something like a typeof function to enforce the type we return inside an rrule.

I agree - this is still missing. I had an idea about that, but it turned out too complex to carry out quickly. We can discuss.

making a function inside an rrule like we did in ACEflux adjoint() and create multiple instances of it that dispatch according to the dp they recieve.

actually I think this is the idea I mention above. I think there is a lot of flexibility in this and we can explore it. We may have to as we move to other properties.

For the vector{svector} issue is there a way I can reproduce the problem. Is it still persistent in the current commit, and if so is it a problem with adjoint_eval_D? or do you mean the second derivative according to parameters?

no need anymore, bug is fixed.

ACEsuit / ACE.jl

Multiple properties via SVectors #98