ACEsuit / ACE.jl

Parameterisation of Equivariant Properties of Particle Systems
66 stars 15 forks source link

Revisit Multi-Property Management #96

Closed cortner closed 2 years ago

cortner commented 2 years ago

@andresrossb -- I would like us to revisit how we manage multiple properties. To write generic code without lots of duplication, I now believe that SVector{N, Prop} was the right choice, but we need to carry this through to gradients; in particular, grad_config should return a Vector{SVector{N, DState}}. Access would simply change from g[iX, iP] to g[iX][iP] but one can always choose to reinterpret this to access it like a matrix but it would be g[iP, iX].

Can you remind me why we decided to go with returning a matrix? If I change this, what is the knock-on effect for you?

cortner commented 2 years ago

.... related - I just got rid of adjoint_EVAL_D1. My tests so far show that we can also get rid of grad_config and just call _rrule_evaluate with w = 1. Now the only thing left is to remove the multi-property version of adjoint_EVAL_D.

andresrossb commented 2 years ago

We went with matrices since it was compatible with Flux and at the time there wasn't a clear advantage on using svectors. I don't think we would have a problem switching, at least not in principle. We could always add another svector2matrix if needed.

cortner commented 2 years ago

I think I will at least try. I never liked the fact that we had 3 implementations of adjoint_EVAL_D, 3 implementation of genmul, and and 2 implementations of _rrule_evaluate. I'll start it in a separate branch maybe and see how far I get.

E.g. I would now have to optimize a second code-path for multiple properties. (the one for a single property is now fairly fast...)

cortner commented 2 years ago

As I'm looking at the codes, I now wonder whether this is overcomplicating a simple problem. I'm really not sure yet ... I think our two options are

(1) As we do it now - n properties means parameters are Vector{SVector}

(2) There are always n properties, but n could be 1, and we use Matrix for parameters as well as gradients.

Option (1) seems much more flexible, but makes like very complicated. Lots of special contract functions must be implemented, and pre-allocation becomes hellish. At the moment it even looks as if we would beed to implement a new -Vector-of-Propertiestype and then re-implement all functionality we need for it, which will be equally tedious as theStates` and properties interface.

Option (2) - I actually cannot see a reason why this would ever be a bad idea. If one has a single property one can still index into matrix as if it were a vector. And if we make it an nP x ? matrix, then one can always convert to a vector of vectors. On the other Hand, if we make it a ? x nP matrix then we can convert into several vectors. Maybe this is a separate discussion and easy to decide later. And in the end unlikely to be a bottleneck... Another advantage will be that it interfaces more nicely with Flux, AD in general.

cortner commented 2 years ago

@andresrossb Was there any reason we went with SVector{Properties} other than is seemed to fit well into the existing implementation?

andresrossb commented 2 years ago

The initial reason was that it would "just work" by using SVectors instead. It proved a little more complicated, but I remember there being another reason since I initially wanted to use matrices. I don't remember the reason thought.

ACEflux uses matrices anyways. It changes from svectors to matrices and at the end goes back to svectors. When using optimizers we also are switching them to matrices or vectors before taking a step. For example for optim.jl we input a flattened vector that we convert into an SVector that is then fed into ACEflux. It is then converted into a matrix so that it cooperates with Flux and the derivative is then converted into an SVector which is then converted into a flattened vector for optim to update.

cortner commented 2 years ago

Thank you - so at least trying to rewrite everything for matrices would seem natural to you?

andresrossb commented 2 years ago

It wouldn't change how ACEflux works, other than a few conversions. I don't see any strong advantages or disadvantages from my side. I would say whatever seems more natural for other applications. We can always convert within the rrules for anything derivative related. I don't know how much of an overhead this would cause, since SVectors are not mutable. If it's not too much, then I don't think it matters.

andresrossb commented 2 years ago

I do think matrices are more general, so we wouldn't need to implement anything extra, so if there is not strong advantages of SVectors I think these will be easier to deal with in the future. Not only for us, but for new developers and maybe even users. I found SVectors less intuitive and cooperative. Other packages might cooperate better too.

On the other hand SVectors offer a speedup when doing linear algebra, so it could be helpful for our linear models to keep SVectors. We are currently not using this speedup when plugging into optim or flux, but theoretically we could try and leverage it by writing some of these codes our selves, for example the optimiers or dense layers. This could prove to be a lot of work that wouldn't be maintained (since optim and flux are), but maybe the speedup is worth it?

cortner commented 2 years ago

That’s an interesting point.

cortner commented 2 years ago

My guess is we it depends on the order of loops. If the nprop loop is inside this is where we might get that speed up. But we can just make nprop a type parameter and that should automatically unroll the loop.

cortner commented 2 years ago

I think for now I will opt for simplicity...

cortner commented 2 years ago

So the only question left now is whether to make the coefficients nprop x nbasis or nbasis x nprop.

This should be a relatively easy change to make later but still good if we can make a sensible choice now.

cortner commented 2 years ago

I now remember why I didn't want to use the Matrix construction. I always wanted the output of a LinearACEModel to be a property and not a vector containing a property. This still bugs me badly and I can only think of hacky ways to get around it with the Matrix implementation.

cortner commented 2 years ago

I'll need to mull this over and play with both options to understand the consequences of this choice.

cortner commented 2 years ago

given the recent decision, I'll close this, but we can re-open if we decide to come back to the alternative PR.