scikit-learn-contrib / scikit-matter

A collection of scikit-learn compatible utilities that implement methods born out of the materials science and chemistry communities
https://scikit-matter.readthedocs.io/en/v0.2.0/
BSD 3-Clause "New" or "Revised" License
70 stars 18 forks source link

Actually use the copy parameter in StandardFlexibleScaler #138

Closed Luthaf closed 2 years ago

Luthaf commented 2 years ago

Something strange is hapening here: even with this change, I get the following:

# compute per_atom_soap ...
print(np.linalg.norm(per_atom_soap))
scaler = skcosmo.preprocessing.StandardFlexibleScaler(copy=True)

for _ in range(10):

    X = scaler.fit_transform(per_atom_soap)
    print(np.linalg.norm(per_atom_soap))

output:

34.25140803519057
8.889482668045723
48.13522618617981
48.13522618617981
738.8063370659426
48.135226186224045
48.135226186224045
1697.475122564128
48.13522618623458
48.13522618623458
2156.5896572889874

I expect the output to always be the same

Luthaf commented 2 years ago

Same thing happens with separate fit/transform calls:

# compute per_atom_soap ...
print(np.linalg.norm(per_atom_soap))
scaler = skcosmo.preprocessing.StandardFlexibleScaler(copy=True)

for _ in range(10):
    scaler.fit(per_atom_soap)
    X = scaler.transform(per_atom_soap, copy=True)
    print(np.linalg.norm(per_atom_soap))
Luthaf commented 2 years ago

Forget the two previous comments, there is a bug in equistore python bindings & I was using invalid memory. This change is still required but should be fine!