jolars / slopecd

Test hybrid and oracle strategies #3

Closed mathurinm closed 2 years ago

mathurinm commented 2 years ago

There may still be bugs in the code. This script does a prox_grad step, then n_cd CD epochs using a fixed order of alphas/w. It's very rough for now. It seems this could provide speed-ups in an initial phase, but then it fails to converge (as expected, since this CD strategy clearly does not have the grouping effect).
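For readers following along, here is a minimal sketch of one such hybrid iteration. It is not the script from this PR. It assumes the objective is `1/(2n) * ||y - Xw||^2` plus the sorted-L1 penalty, and it reads "fixed order of alphas/w" as: after the prox_grad step, each coefficient keeps the penalty matching its current rank in |w| for all CD epochs. The names `prox_slope`, `soft_threshold` and `hybrid_iteration` are made up for illustration.

```python
import numpy as np


def prox_slope(v, lambdas):
    """Prox of the sorted-L1 norm; lambdas must be >= 0 and non-increasing."""
    abs_v = np.abs(v)
    order = np.argsort(abs_v)[::-1]      # indices by decreasing |v|
    z = abs_v[order] - lambdas
    blocks = []                          # each block is [sum, count]
    for val in z:
        blocks.append([val, 1])
        # Pool adjacent blocks until the block averages are non-increasing.
        while (len(blocks) > 1 and
               blocks[-2][0] / blocks[-2][1] <= blocks[-1][0] / blocks[-1][1]):
            s, c = blocks.pop()
            blocks[-1][0] += s
            blocks[-1][1] += c
    out = np.concatenate([np.full(c, max(s / c, 0.0)) for s, c in blocks])
    res = np.empty_like(abs_v)
    res[order] = out
    return np.sign(v) * res


def soft_threshold(x, t):
    return np.sign(x) * max(abs(x) - t, 0.0)


def hybrid_iteration(X, y, w, lambdas, n_cd=5):
    """One prox_grad step followed by n_cd CD epochs with frozen penalties."""
    n, p = X.shape
    L = np.linalg.norm(X, ord=2) ** 2 / n            # Lipschitz constant
    grad = -X.T @ (y - X @ w) / n
    w = prox_slope(w - grad / L, lambdas / L)        # new array, input untouched

    # Freeze the penalty assignment: rank 0 is the largest |w_j|.
    ranks = np.argsort(np.argsort(-np.abs(w)))
    pen = lambdas[ranks]

    r = y - X @ w
    lc = (X ** 2).sum(axis=0) / n                    # per-coordinate Lipschitz
    for _ in range(n_cd):
        for j in range(p):
            if lc[j] == 0.0:
                continue
            old = w[j]
            w[j] = soft_threshold(old + X[:, j] @ r / (n * lc[j]), pen[j] / lc[j])
            if w[j] != old:
                r -= (w[j] - old) * X[:, j]
    return w
```

With the penalties frozen, the CD epochs minimize a weighted Lasso objective that only lower-bounds the sorted-L1 objective, so a decrease of the true objective is not guaranteed once the ordering of |w| changes.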

image

I quickly checked that with a constant sequence of lambdas we get Lasso behavior, where increasing n_cd gives better results.
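The reason a constant sequence should behave like the Lasso is that the sorted-L1 penalty then reduces to a plain L1 penalty. A small numerical check, with a hypothetical helper `sorted_l1` and made-up values:

```python
import numpy as np


def sorted_l1(w, lambdas):
    """Sorted-L1 norm: the largest |w_j| is paired with the largest lambda."""
    return float(np.sum(lambdas * np.sort(np.abs(w))[::-1]))


w = np.array([0.3, -2.0, 0.0, 1.1])
lam = 0.5

constant_penalty = sorted_l1(w, np.full(w.size, lam))
lasso_penalty = lam * np.abs(w).sum()
print(np.isclose(constant_penalty, lasso_penalty))
```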

Klopfe commented 2 years ago

Great, thanks for the first results.

Screenshot 2022-02-09 at 10 12 15

Here I tried to compute the clusters after each update and use block CD on the clusters. For the clusters to be able to change, I alternate between block CD on the clusters and PGD.
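As a rough illustration of the cluster computation, here is a sketch that groups coefficients sharing the same absolute value. The helper name `get_clusters` and the tolerance `tol` are assumptions for this example, not taken from the code in this PR.

```python
import numpy as np


def get_clusters(w, tol=1e-10):
    """Group non-zero coefficients by equal |w_j| (up to tol).

    Returns the clusters as lists of indices, ordered by decreasing
    magnitude, and the number of zero coefficients.
    """
    abs_w = np.abs(w)
    nonzero = np.flatnonzero(abs_w > tol)
    order = nonzero[np.argsort(abs_w[nonzero])[::-1]]
    clusters = []
    for j in order:
        # Start a new cluster whenever the magnitude changes.
        if clusters and abs(abs_w[clusters[-1][0]] - abs_w[j]) <= tol:
            clusters[-1].append(int(j))
        else:
            clusters.append([int(j)])
    return clusters, len(w) - len(nonzero)


w = np.array([0.5, -0.5, 0.0, 1.2, 0.0, 0.5])
clusters, n_zeros = get_clusters(w)
print(len(clusters), n_zeros)
```

A block CD step would then update all coefficients of one cluster together, and the PGD step in between is what lets clusters split or merge.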

jolars commented 2 years ago

Nice! I am going to play around with this a bit and try out some ideas too.

jolars commented 2 years ago

I see this when n_samples = 100 and n_features = 200.

image

By the way, in the example you gave previously, with n_features=40, there are just 38 clusters (just one cluster with more than one feature) and no zeros.

I tried using rho = 0.7 and alphas = alpha_max * alphas_seq * 0.25, with otherwise the same settings, and get:

image

There are 28 zeros here and 10 clusters, two with more than one non-zero feature in them.

Klopfe commented 2 years ago

It shows that we still need to tune the strategy. Thanks for the feedback.

mathurinm commented 2 years ago

I have added an oracle strategy following the idea discussed with @JonasWallin: find the clusters in the solution, collapse X with the appropriate signs, and run CD on the reduced problem. It served both as a sanity check that this works given knowledge of the clusters, and as what I consider a lower bound: the best possible improvement.

image
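To make the collapsing step concrete, here is a minimal sketch, not the code from this PR. It assumes the clusters are groups of coefficients with equal absolute value, that each collapsed column is the signed sum of the cluster's columns, and that each cluster gets the sum of the lambdas at its positions in the sorted order. The name `collapse` and the tolerance are made up for illustration.

```python
import numpy as np


def collapse(X, w, lambdas, tol=1e-10):
    """Collapse X onto the clusters of w.

    Returns the reduced design, one magnitude per cluster, and the
    summed penalty attached to each cluster.
    """
    abs_w = np.abs(w)
    order = np.argsort(abs_w)[::-1]          # indices by decreasing |w|
    p = len(w)
    cols, mags, pens = [], [], []
    start = 0
    while start < p and abs_w[order[start]] > tol:
        end = start + 1
        while end < p and abs(abs_w[order[end]] - abs_w[order[start]]) <= tol:
            end += 1
        idx = order[start:end]
        cols.append(X[:, idx] @ np.sign(w[idx]))   # signed sum of columns
        mags.append(abs_w[idx[0]])
        pens.append(lambdas[start:end].sum())
        start = end
    X_red = np.column_stack(cols) if cols else np.empty((X.shape[0], 0))
    return X_red, np.array(mags), np.array(pens)


rng = np.random.default_rng(0)
X = rng.standard_normal((5, 6))
w = np.array([2.0, -2.0, 0.0, 1.0, 0.0, -2.0])
lambdas = np.linspace(1.0, 0.1, 6)

X_red, c, pen = collapse(X, w, lambdas)
print(X_red.shape)                       # one column per non-zero cluster
print(np.allclose(X_red @ c, X @ w))     # same fit with fewer variables
```

As long as the clusters and their ordering stay as in the solution, the reduced problem in `c` is a weighted Lasso with weights `pen`, which is what makes plain CD applicable there.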

@jolars maybe we can clean up this PR and merge it?