Closed cortner closed 2 years ago
not at all tested, likely lots of bugs still
this might now be ready for use. Please take a look at test_discrete1pbasis.jl
The way I envision you can use this for bonds is to specify
Rn = ...
Ylm = ...
Denv = Onehot1pBasis([:bond, :env], :be, :i) # or replace :be, :i with anything you like
B1p = Rn * Ylm * Penv
If you think this is ready to be tried out please confirm. If there is something missing, then please let me know.
Makes all sense to me. I am only wondering what the precise function of the parameter ISYM is.
it is the basis function index symbol, e.g.,
b = (n = 4, l = 3, m = -2, i = :env)
but maybe :i
is not a good choice here. e.g. could use :be
for "bond or environment"; in fact it would be no problem using :be
for both valsym and isym.
@MatthiasSachs do you also agree that calling it a onehot basis fits? I.e. it is really a one hot embedding but used as a 1p basis?
thanks for catching the typo btw, probably means we need more tests.
Would Maybe ISYM
be better called BSYM
?
ok - some final updates, now with documentation and the testse maybe better show now what those variables mean.
last question before I merge: am I using the term categories
correctly here? A variable u
may take values [:a, :b, :c]
. I'm calling these three values the categories, since u
is called a categorical variable. It seems the more common terms is "levels" but I like this even less. @MatthiasSachs - do you have any thoughts??
Ah, thank you so much for the latest tests! I can now understand how Onehot1pBasis
is to be used for bonds. Maybe I have only one question - why do we call it Onehot1pBasis
, or says, what does the term "hot" mean here?
I think "one hot encoding" is just a programmer idiom that has fairly little to do with its literal meaning. See https://en.wikipedia.org/wiki/One-hot
I think "one hot encoding" is just a programmer idiom that has fairly little to do with its literal meaning. See https://en.wikipedia.org/wiki/One-hot
I see, thanks a lot. I shouldn't have separated the word one-hot, sorry about that... Now I think OnehotBasis is a good name to explain what this basis is used for.
I’m happy to rename. CategoricalBasis?
Sorry for my late reply. I was somehow expecting to get email notifications on any activity... I really don't have a strong opinion on the naming of the basis. I find that CategoricalBasis best hints at the purpose of this basis while we use a one-hot representation in the implementation of this basis.
last question before I merge: am I using the term
categories
correctly here? A variableu
may take values[:a, :b, :c]
. I'm calling these three values the categories, sinceu
is called a categorical variable. It seems the more common terms is "levels" but I like this even less. @MatthiasSachs - do you have any thoughts??
Yes, I think that categories
rather than levels
would be the appropriate term here, since in general there is no natural ordering of the symbols [:a, :b, :c, :someweirdsymbol :anotherweirdsymbol, ... ]
that is relevant in the context of the application, right?
But you are happy with Onehot1pBasis
or should we switch to Categorical1pBasis
?
But you are happy with
Onehot1pBasis
or should we switch toCategorical1pBasis
?
I think I would prefer Categorical1pBasis
, since I think that for the end-user the form of implementation is not relevant. But again, I don't have a strong opinion on that since "Categorical Basis" seems also a bit weird. But I don't have a better suggestion either...
Unless I hear a dissenting opinion I will rename to Categorical1pBasis, then merge and tag a new version
CC @zhanglw0521 @MatthiasSachs