Closed mathurinm closed 2 years ago
Great, thanks for the first results.
Here I tried to compute the clusters after each update and use block CD on the clusters. For the clusters to be able to change, I alternate between block CD on the clusters and PGD.
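To make the cluster step concrete, here is a minimal sketch of how clusters could be read off a coefficient vector. It assumes a cluster is a set of features whose coefficients share the same absolute value up to a tolerance; `get_clusters` and `tol` are illustrative names, not taken from the PR.

```python
import numpy as np


def get_clusters(w, tol=1e-10):
    """Group feature indices whose coefficients have (nearly) equal magnitude.

    Hypothetical helper: returns the clusters ordered by decreasing magnitude,
    together with the magnitude of each cluster. Zero coefficients end up in
    a cluster of their own with magnitude 0.
    """
    abs_w = np.abs(w)
    order = np.argsort(-abs_w, kind="stable")  # largest magnitude first
    clusters, values = [], []
    for j in order:
        if clusters and abs(abs_w[j] - values[-1]) <= tol:
            clusters[-1].append(int(j))  # same magnitude: extend current cluster
        else:
            clusters.append([int(j)])  # new magnitude: open a new cluster
            values.append(abs_w[j])
    return [np.array(c) for c in clusters], np.array(values)


w = np.array([0.5, -0.5, 0.0, 0.2, 0.0])
clusters, values = get_clusters(w)
for c, v in zip(clusters, values):
    print(c, v)
```

Since block CD alone keeps the cluster structure fixed, a helper like this would be called again after each PGD step so that clusters can merge or split.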
Nice! I am going to play around with this a bit and try out some ideas too.
I see this when n_samples = 100 and n_features = 200.
By the way, in the example you gave previously, with n_features=40, there are just 38 clusters (just one cluster with more than one feature) and no zeros.
I tried using rho = 0.7 and alphas = alpha_max * alphas_seq * 0.25, with otherwise the same settings, and get:
There are 28 zeros here and 10 clusters, two with more than one non-zero feature in them.
It shows that we still need to tune the strategy. Thanks for the feedback.
I have added an oracle strategy following the idea discussed with @JonasWallin: find the clusters in the solution, collapse X with the appropriate signs, and run CD on the reduced problem. It serves both as a sanity check that the approach works when the clusters are known, and as what I consider a lower bound on the runtime, i.e. the best possible improvement.
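The collapsing step can be sketched as follows. This is only an illustration under stated assumptions: the clusters of non-zero coefficients are given as lists of column indices, and `collapse_design` and `expand_coefs` are hypothetical names, not functions from the PR.

```python
import numpy as np


def collapse_design(X, w_star, clusters):
    """Sum the columns of each cluster, weighted by the sign of w_star.

    Returns a reduced design with one column per cluster.
    """
    cols = [X[:, c] @ np.sign(w_star[c]) for c in clusters]
    return np.column_stack(cols)


def expand_coefs(z, w_star, clusters):
    """Map one magnitude per cluster back to a full coefficient vector."""
    w = np.zeros_like(w_star, dtype=float)
    for z_k, c in zip(z, clusters):
        w[c] = z_k * np.sign(w_star[c])
    return w


rng = np.random.default_rng(0)
X = rng.standard_normal((5, 4))
w_star = np.array([0.5, -0.5, 0.0, 0.2])
clusters = [[0, 1], [3]]  # oracle knowledge: features 0 and 1 share a magnitude

X_red = collapse_design(X, w_star, clusters)
z = np.array([0.5, 0.2])  # one magnitude per cluster

# The reduced and full parametrizations give the same predictions.
print(np.allclose(X_red @ z, X @ w_star))
```

CD then only has to update one coefficient per cluster on `X_red`, which is why knowing the clusters in advance gives an optimistic reference point.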
@jolars maybe we can clean up this PR and merge it?
There may still be bugs in the code. This script does a prox_grad step, then n_cd CD epochs using a fixed order of alphas/w. It's very rough for now, and it seems this could provide speed-ups in an initial phase, but then it fails to converge (as expected, since this CD strategy clearly does not have the grouping effect).
I quickly checked that with a constant sequence of lambdas we get Lasso behavior, where increasing n_cd gives better results.
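For reference, here is a rough sketch of such a hybrid scheme, not the script from the PR. It assumes the objective `||y - Xw||^2 / (2n) + sum_i alphas[i] * |w|_(i)` with non-increasing `alphas`, and all function names are illustrative. After each proximal gradient step, every coordinate keeps the alpha matching its current rank, and the CD epochs are plain soft-thresholding updates with those frozen penalties.

```python
import numpy as np


def prox_slope(v, alphas):
    """Prox of the sorted-L1 norm (alphas non-negative and non-increasing)."""
    order = np.argsort(-np.abs(v))
    z = np.abs(v)[order] - alphas
    blocks = []  # [sum, count] per block; block means must be non-increasing
    for val in z:
        blocks.append([val, 1])
        # pool adjacent blocks while the monotonicity constraint is violated
        while len(blocks) > 1 and (
            blocks[-2][0] / blocks[-2][1] <= blocks[-1][0] / blocks[-1][1]
        ):
            s, c = blocks.pop()
            blocks[-1][0] += s
            blocks[-1][1] += c
    fitted = np.concatenate([np.full(c, max(s / c, 0.0)) for s, c in blocks])
    out = np.empty_like(fitted)
    out[order] = fitted  # undo the sorting
    return np.sign(v) * out


def slope_objective(X, y, w, alphas):
    penalty = np.sum(alphas * np.sort(np.abs(w))[::-1])
    return np.sum((y - X @ w) ** 2) / (2 * X.shape[0]) + penalty


def hybrid_pgd_cd(X, y, alphas, n_iter=20, n_cd=3):
    """One prox-grad step, then n_cd CD epochs with a frozen alpha order."""
    n, p = X.shape
    w = np.zeros(p)
    L = np.linalg.norm(X, ord=2) ** 2 / n  # Lipschitz constant of the gradient
    col_sq = np.sum(X ** 2, axis=0) / n
    history = []
    for _ in range(n_iter):
        grad = X.T @ (X @ w - y) / n
        w = prox_slope(w - grad / L, alphas)
        # freeze the pairing between coordinates and alphas for the CD epochs
        alpha_of = np.empty(p)
        alpha_of[np.argsort(-np.abs(w))] = alphas
        r = y - X @ w
        for _ in range(n_cd):
            for j in range(p):
                if col_sq[j] == 0:
                    continue
                old = w[j]
                rho = X[:, j] @ r / n + col_sq[j] * old
                w[j] = np.sign(rho) * max(abs(rho) - alpha_of[j], 0.0) / col_sq[j]
                r -= X[:, j] * (w[j] - old)
        history.append(slope_objective(X, y, w, alphas))
    return w, history
```

With a constant `alphas` vector the penalty reduces to the Lasso penalty, so both the proximal gradient step and the CD epochs are descent steps for the same objective. With a decreasing sequence the CD epochs optimize a weighted Lasso surrogate instead, which is consistent with the lack of convergence described above.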