Roadmap New ACE Models

cortner commented 3 months ago

This issue follows up on #195: it lists the remaining open issues and collects any further work needed to bring the new implementation to a stage where (1) it reproduces the old ACE1 results (within some error bar); (2) it matches some of the nonlinear ACE results, e.g. from Ralf's group; and (3) it can finally go beyond all of these.

Open issues

[ ] Think about interfacing with the established user-facing ACEpotentials workflow (@wcwitt ??)
[ ] The radial basis currently has maxn = maxq, which is of course ridiculous, but it doesn't prevent functionality testing so I'll leave it for now.
[ ] The pair potential basis currently uses random trace-like mixing. Should implement a one-hot initialization of the rbasis weights and this could be taken as the default for the pair basis.
[ ] Give the Rnlzrr basis the option of initializing in different ways, in particular just using the polynomials (which only makes sense for a single species...)
[ ] purify 2b (@CheukHinHoJerry)
[ ] Outer nonlinearities such as FS, MLP, or a BOP-like thing.
[ ] general rational transforms, both frozen and learnable, to unify everything we have. Explore Polynomials and StaticPolynomials packages to manage the algebra. (@TSGut ??)
[ ] Implement sparse_ace_spec as commented in models/utils.jl
[ ] How to incorporate smoothness priors. Ideally we should just incorporate them through a rescaling of the A basis (or possibly the Rnl, Ylm bases)
[ ] pushforwards of main kernels for faster jacobians and basis_derivatives
[ ] weird stuff going on with units. To be discussed and cleaned up. Should we enforce usage of units? Could prevent many bugs!
[ ] writing models to and from files...
[ ] rewrite E0s to allow general Vref, probably best by utilizing a parameterized stacked potential rather than doing that stacking internally. That could increase flexibility and reduce boilerplate code.
[ ] discuss whether we ever want to allow complex SH/real SH again?
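As an illustration of the one-hot initialization item above, here is a minimal Python sketch (not the ACEpotentials.jl implementation). It assumes the radial basis is formed as R_n = sum_q W[n][q] * P_q from some polynomials P_q; the names `one_hot_weights` and `contract` are hypothetical. With one-hot weights, each R_n starts out as exactly one polynomial P_n instead of a random mixture.

```python
def one_hot_weights(n_out, n_in):
    """Return an n_out x n_in matrix with W[n][q] = 1.0 if n == q, else 0.0.

    Hypothetical helper: n_out plays the role of maxn, n_in of maxq.
    """
    if n_out > n_in:
        raise ValueError("one-hot init needs n_out <= n_in")
    return [[1.0 if q == n else 0.0 for q in range(n_in)] for n in range(n_out)]


def contract(weights, poly_values):
    """Evaluate R_n = sum_q W[n][q] * P_q for one set of polynomial values."""
    return [sum(w * p for w, p in zip(row, poly_values)) for row in weights]


# Assumed example values of P_0..P_3 at some distance r (made up for the demo).
poly_values = [0.5, -0.25, 0.125, 2.0]

weights = one_hot_weights(3, 4)
radial_values = contract(weights, poly_values)
print(radial_values)
```

With this starting point the learnable basis initially coincides with the first `n_out` polynomials, so any mixing has to be learned during fitting rather than imposed at random.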

List of test datasets

3BPA : Jerry and Yangshuai
FeCrNi : James K

cortner commented 3 months ago

@jameskermode -- can we avoid spins for now please? I am happy to extend the model to allow this. But one thing at a time ... I think for now the big question is whether or not there is still a space for slim ACE-like models.

If the other alloy (despite the Fe in it) is fine then I'd love to have a test set up.

Can we collect datasets we are working with somewhere easily accessible for all?

cortner commented 3 months ago

@CheukHinHoJerry can you please document here what alloy datasets you are playing with?

jameskermode commented 3 months ago

> @jameskermode -- can we avoid spins for now please? I am happy to extend the model to allow this. But one thing at a time ... I think for now the big question is whether or not there is still a space for slim ACE-like models.

No model extension is required, we could repeat our trick of representing Fe up and Fe down as two separate species (e.g. with Fe and F).

> If the other alloy (despite the Fe in it) is fine then I'd love to have a test set up.

But we can start by fitting a no-spin model to the same FeCrNi dataset and comparing to our trSOAP results from the paper.

> Can we collect datasets we are working with somewhere easily accessible for all?

Yes, we can do this. Would ACEworkflows be a good place or should we start a new repo?
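The species trick mentioned in this comment (treating Fe up and Fe down as two separate species, e.g. Fe and F) could be sketched as a simple relabelling step before fitting. This is only an illustration under assumptions: the per-atom magnetic moments as input, the sign threshold at zero, and the function name `split_species_by_spin` are all hypothetical and not taken from the thread or from any existing package.

```python
def split_species_by_spin(symbols, moments, target="Fe", down_label="F"):
    """Relabel spin-down atoms of `target` with a placeholder species.

    symbols: list of chemical symbols, one per atom.
    moments: list of per-atom magnetic moments (assumed input), same length.
    Atoms of `target` with a negative moment get `down_label`;
    everything else keeps its original symbol.
    """
    if len(symbols) != len(moments):
        raise ValueError("symbols and moments must have the same length")
    relabelled = []
    for symbol, moment in zip(symbols, moments):
        if symbol == target and moment < 0:
            relabelled.append(down_label)
        else:
            relabelled.append(symbol)
    return relabelled


# Made-up configuration, for illustration only.
symbols = ["Fe", "Cr", "Fe", "Ni"]
moments = [2.2, -0.5, -2.1, 0.3]
print(split_species_by_spin(symbols, moments))
```

The model then sees an ordinary multi-species system, which is why no model extension is needed; the placeholder label must of course not clash with a species actually present in the dataset.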

cortner commented 3 months ago

I don't mind very much. Are we thinking about working towards a publication? several? In that case a separate repo might be good. the best examples could then be moved into ACEworkflows afterwards?

cortner commented 3 months ago

> FeCrNi dataset and comparing to our trSOAP results

I think this is ideal for initial tests. If we cannot match that, then there is not much point trying something harder.

CheukHinHoJerry commented 3 months ago

I am first playing with the AgPd dataset (https://www.nature.com/articles/s41524-020-00477-2#Sec9) and also 3BPA.

jameskermode commented 3 months ago

> I don't mind very much. Are we thinking about working towards a publication? several? In that case a separate repo might be good. the best examples could then be moved into ACEworkflows afterwards?

Yes, this sounds good. If you create a private repo we have some other not-yet-published datasets that could be added.

cortner commented 3 months ago

> I am first playing with the AgPd dataset (https://www.nature.com/articles/s41524-020-00477-2#Sec9) and also 3BPA.

Sounds good, thank you. I wonder whether MD17r is easier though for initial tests? Others should advise.

cortner commented 3 months ago

> Yes, this sounds good. If you create a private repo we have some other not-yet-published datasets that could be added.

Maybe that will be easiest for the time being.

YangshuaiWang commented 3 months ago

> > I am first playing with the AgPd dataset (https://www.nature.com/articles/s41524-020-00477-2#Sec9) and also 3BPA.
>
> Sounds good, thank you. I wonder whether MD17r is easier though for initial tests? Others should advise.

Yes, we actually collected various published datasets, and e.g. aspirin (C,H,O) from MD17 should be easier than 3BPA (C,H,O,N).

cortner commented 3 months ago

created repo. Let me know if I should add anybody else to it. Please look at the issue about how to manage the data. I don't generally like datasets in github repos but maybe the rules have changed and I need to adapt...

jameskermode commented 3 months ago

> created repo. Let me know if I should add anybody else to it.

from my group please add @thomas-rocke

CheukHinHoJerry commented 3 months ago

After merging this PR https://github.com/ACEsuit/ACEpotentials.jl/pull/200#issue-2329024400 I think the next thing to do is to look at the weight initialization. With the experience of playing with the new ACEmodel up to now, I think this is important.

> The pair potential basis currently uses random trace-like mixing. Should implement a one-hot initialization of the rbasis weights and this could be taken as the default for the pair basis.

We should probably have another issue discussing weight initialization? I believe a multilevel strategy will definitely be helpful.

cortner commented 3 months ago

Yes please open it. It is a big enough topic to discuss separately.

ACEsuit / ACEpotentials.jl

Roadmap New ACE Models #196
