wiskott-lab / sklearn-sfa

This project provides Slow Feature Analysis as a scikit-learn-style package.

Is there a way to reconstruct input from `n` components? #4

Closed · diegoasua closed this 2 years ago

diegoasua commented 2 years ago

Hi there. I am trying to perform blind signal separation on time series data via dimensionality reduction, back-projecting a latent representation built from only the first `n` components. In scikit-learn, matrix factorization modules have a `components_` attribute that can be used for this purpose. For example, with `sklearn.decomposition.PCA`, signal separation can be achieved by back-projecting the high-variance components with something like:

import numpy as np
from sklearn.decomposition import PCA

estimator = PCA(n_components=COMPONENTS)  # COMPONENTS: total number of components to fit
model = estimator.fit(data_array)         # data_array: training data, shape (n_samples, n_features)
cutoff = 2                                # keep only the first two (highest-variance) components
# Back-project the truncated latent representation into input space
clean_signal = np.dot(model.transform(new_data)[:, :cutoff], model.components_[:cutoff, :])
clean_signal += model.mean_               # add back the training mean that PCA subtracted
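
(For reference, a minimal sketch of the same idea: fitting PCA directly with `n_components=cutoff` makes scikit-learn's built-in inverse_transform perform exactly this truncated back-projection, since it computes Z @ components_ + mean_. Uses the same data_array/new_data as above.)

from sklearn.decomposition import PCA

# PCA fitted with only `cutoff` components; inverse_transform then
# computes Z @ components_ + mean_, i.e. the truncated back-projection
pca_cut = PCA(n_components=cutoff).fit(data_array)
clean_signal_alt = pca_cut.inverse_transform(pca_cut.transform(new_data))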

I do not see a `components_` property in sklearn-sfa. Is there an equivalent property in `sksfa.SFA`?

MoritzLange commented 2 years ago

Hey Diego, yes, you can do this (at least if you have not used partial_fit for fitting SFA).

Similar to the `components_` attribute of `sklearn.decomposition.PCA`, you can extract the transformation matrix W (and a bias b) with the `affine_parameters()` method of an `sksfa.SFA` object. The only difference is that while `components_` contains all singular vectors, W contains only the vectors for the first `n` components. For reconstruction, those first `n` are of course all you need.

You can calculate the latent representation z and then the reconstruction as follows:

import numpy as np
import sksfa

sfa_model = sksfa.SFA(n_components=n)
fitted_model = sfa_model.fit(data_array)
W, b = fitted_model.affine_parameters()  # W: (n, n_features), b: (n,)
z = np.dot(new_data, W.T) + b            # equivalent to: z = fitted_model.transform(new_data)
reconstruction = np.dot(z - b, W)        # back-project the n slow components into input space

Some justification: the SFA procedure basically performs two consecutive PCA operations, first a whitening step and then a PCA based on temporal differences. The product of those two orthogonal transformation matrices (i.e. the two `components_` matrices) is again an orthogonal matrix, which is then truncated to get W, so you can still reconstruct the original data the usual way known from PCA.
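
As a quick end-to-end check, here is a minimal sketch with synthetic data (the sinusoid mixture, array shapes, and random seed are illustrative assumptions, not part of sksfa). It verifies that the affine map reproduces transform() and then back-projects the slow components:

import numpy as np
import sksfa

# Two slow sinusoidal sources mixed into 10 observed channels
t = np.linspace(0, 8 * np.pi, 2000)
sources = np.stack([np.sin(t), np.sin(0.3 * t)], axis=1)
rng = np.random.default_rng(0)
data_array = sources @ rng.normal(size=(2, 10)) + 0.01 * rng.normal(size=(2000, 10))

fitted_model = sksfa.SFA(n_components=2).fit(data_array)
W, b = fitted_model.affine_parameters()

# The affine map should reproduce transform() up to numerical precision
z = data_array @ W.T + b
print(np.max(np.abs(z - fitted_model.transform(data_array))))

# Back-project the two slow components into the 10-dimensional input space
reconstruction = (z - b) @ W
print(reconstruction.shape)  # (2000, 10)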