ACEsuit / ACE.jl

Parameterisation of Equivariant Properties of Particle Systems
66 stars 15 forks source link

WIP: Generic Discrete 1p Basis #54

Closed cortner closed 2 years ago

cortner commented 2 years ago

CC @zhanglw0521 @MatthiasSachs

cortner commented 2 years ago

not at all tested, likely lots of bugs still

cortner commented 2 years ago

this might now be ready for use. Please take a look at test_discrete1pbasis.jl

cortner commented 2 years ago

The way I envision you can use this for bonds is to specify

Rn = ... 
Ylm = ...
Denv = Onehot1pBasis([:bond, :env], :be, :i)   # or replace :be, :i with anything you like
B1p = Rn * Ylm * Penv
cortner commented 2 years ago

If you think this is ready to be tried out please confirm. If there is something missing, then please let me know.

MatthiasSachs commented 2 years ago

Makes all sense to me. I am only wondering what the precise function of the parameter ISYM is.

cortner commented 2 years ago

it is the basis function index symbol, e.g.,

b = (n = 4, l = 3, m = -2, i = :env)
cortner commented 2 years ago

but maybe :i is not a good choice here. e.g. could use :be for "bond or environment"; in fact it would be no problem using :be for both valsym and isym.

cortner commented 2 years ago

@MatthiasSachs do you also agree that calling it a onehot basis fits? I.e. it is really a one hot embedding but used as a 1p basis?

cortner commented 2 years ago

thanks for catching the typo btw, probably means we need more tests.

cortner commented 2 years ago

Would Maybe ISYM be better called BSYM?

cortner commented 2 years ago

ok - some final updates, now with documentation and the testse maybe better show now what those variables mean.

cortner commented 2 years ago

last question before I merge: am I using the term categories correctly here? A variable u may take values [:a, :b, :c]. I'm calling these three values the categories, since u is called a categorical variable. It seems the more common terms is "levels" but I like this even less. @MatthiasSachs - do you have any thoughts??

zhanglw0521 commented 2 years ago

Ah, thank you so much for the latest tests! I can now understand how Onehot1pBasis is to be used for bonds. Maybe I have only one question - why do we call it Onehot1pBasis, or says, what does the term "hot" mean here?

ettersi commented 2 years ago

I think "one hot encoding" is just a programmer idiom that has fairly little to do with its literal meaning. See https://en.wikipedia.org/wiki/One-hot

zhanglw0521 commented 2 years ago

I think "one hot encoding" is just a programmer idiom that has fairly little to do with its literal meaning. See https://en.wikipedia.org/wiki/One-hot

I see, thanks a lot. I shouldn't have separated the word one-hot, sorry about that... Now I think OnehotBasis is a good name to explain what this basis is used for.

cortner commented 2 years ago

I’m happy to rename. CategoricalBasis?

MatthiasSachs commented 2 years ago

Sorry for my late reply. I was somehow expecting to get email notifications on any activity... I really don't have a strong opinion on the naming of the basis. I find that CategoricalBasis best hints at the purpose of this basis while we use a one-hot representation in the implementation of this basis.

MatthiasSachs commented 2 years ago

last question before I merge: am I using the term categories correctly here? A variable u may take values [:a, :b, :c]. I'm calling these three values the categories, since u is called a categorical variable. It seems the more common terms is "levels" but I like this even less. @MatthiasSachs - do you have any thoughts??

Yes, I think that categories rather than levels would be the appropriate term here, since in general there is no natural ordering of the symbols [:a, :b, :c, :someweirdsymbol :anotherweirdsymbol, ... ] that is relevant in the context of the application, right?

cortner commented 2 years ago

But you are happy with Onehot1pBasis or should we switch to Categorical1pBasis?

MatthiasSachs commented 2 years ago

But you are happy with Onehot1pBasis or should we switch to Categorical1pBasis?

I think I would prefer Categorical1pBasis, since I think that for the end-user the form of implementation is not relevant. But again, I don't have a strong opinion on that since "Categorical Basis" seems also a bit weird. But I don't have a better suggestion either...

cortner commented 2 years ago

Unless I hear a dissenting opinion I will rename to Categorical1pBasis, then merge and tag a new version