mitscha / ssc_mps_py

Python implementation of SSC-OMP and SSC-MP

extracting the subspaces #1

Open rana-alshaikh opened 5 years ago

rana-alshaikh commented 5 years ago

First, I want to thank you for your effort. I previously used the Matlab code for the SSC-OMP algorithm to find thematic clusters, but since I am now working in Python, it was very helpful to find a Python implementation. My question is: how do I find the low-dimensional subspaces? Both the Matlab code and your implementation return the labels (groups), but from the paper I understood that we could obtain the subspaces as well. Let's assume that I have a 100×300 matrix (samples × features), I set the number of subspaces to 3, and each subspace has dimension 10. How do I obtain a 10-dimensional representation for each point in each group? I tried in Matlab, but I could only extract the groups and the affinity matrix.

I appreciate any advice, and thanks again.

mitscha commented 5 years ago

To get an orthonormal basis of each subspace, you can simply go through the clusters and, for each cluster, select the data points (rows in your matrix) that belong to it. Then compute the SVD of the resulting matrix and select the 10 right singular vectors corresponding to the largest singular values. These are an estimate of an orthonormal basis of the subspace for the current cluster. You can then project the points onto that basis to get their 10-dimensional coordinates.
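The steps above can be sketched roughly as follows with NumPy. This is only an illustration, not code from this repository: the function name `subspace_bases`, its arguments, and the synthetic data are all made up for the example, and `labels` stands for whatever cluster assignment the clustering step returned. It assumes rows are samples and that every cluster has at least `dim` points.

```python
import numpy as np

def subspace_bases(X, labels, dim):
    """Hypothetical helper: per-cluster orthonormal basis and projected points.

    X      : (n_samples, n_features) data matrix (NOT the affinity matrix)
    labels : (n_samples,) cluster label of each row of X
    dim    : assumed dimension of every subspace
    """
    X = np.asarray(X, dtype=float)
    labels = np.asarray(labels)
    bases, coords = {}, {}
    for k in np.unique(labels):
        X_k = X[labels == k]                      # rows belonging to cluster k
        # Singular values come back sorted in descending order, so the first
        # `dim` rows of Vt are the leading right singular vectors.
        _, _, Vt = np.linalg.svd(X_k, full_matrices=False)
        B = Vt[:dim].T                            # (n_features, dim), orthonormal columns
        bases[k] = B
        coords[k] = X_k @ B                       # (n_k, dim) coordinates in that basis
    return bases, coords

# Synthetic stand-in data: 3 random 10-dimensional subspaces of R^300,
# 40 points each (sizes chosen arbitrarily for the demo).
rng = np.random.default_rng(0)
n_features, dim, n_per_cluster = 300, 10, 40
X_parts, label_parts = [], []
for k in range(3):
    true_basis = np.linalg.qr(rng.standard_normal((n_features, dim)))[0]
    X_parts.append(rng.standard_normal((n_per_cluster, dim)) @ true_basis.T)
    label_parts.append(np.full(n_per_cluster, k))
X = np.vstack(X_parts)
labels = np.concatenate(label_parts)

bases, coords = subspace_bases(X, labels, dim)
```

Here `coords[k]` holds the 10-dimensional representation of the points in cluster `k`, and `coords[k] @ bases[k].T` maps them back to the original 300-dimensional space (exactly for this noise-free demo, approximately for real data). The data is deliberately not mean-centered, since the subspaces in this setting are linear and pass through the origin.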

rana-alshaikh commented 5 years ago

Thanks for your quick response. Just to make sure that I understand you correctly: do I need to split the original matrix or the affinity matrix?

mitscha commented 5 years ago

You'll have to collect the rows from the original matrix (data matrix).