jolars / slopecd

4 stars 2 forks source link

WIP fix: fix bug in thresholder #28

Closed jolars closed 2 years ago

jolars commented 2 years ago

Something is not working correctly with the thresholder function. It doesn't converge for some examples, such as this one.

image

cd_primal_old is the old thresholder function, which works correctly at least for this example.

These are some change to help weed out what the problem is.

jolars commented 2 years ago

I see this:

Epoch: 12, loss: 0.7776163967877969, gap: 7.14e-03
between clusters
between clusters
between clusters, outside loop
between clusters
between clusters, outside loop
j:  4
cluster_indices:  [0 1 2 3 9 8 7 6 5 4]
c:  [0.90675302 0.47672382 0.0469308  0.0207867  0.00389932]
c[j]:  0.003899315241586314
x: 0.03978030871681798
beta_tilde:  0.03978030871681798 , beta_tilde_old: 0.0038103252608220756
ind_old:  4
ind_new:  4 , ind_new_old:  4

Looks like the index of the new cluster returned from the thresholder is wrong. beta_tilde also disagrees between the new and old version.

Klopfe commented 2 years ago

I see this:

Epoch: 12, loss: 0.7776163967877969, gap: 7.14e-03
between clusters
between clusters
between clusters, outside loop
between clusters
between clusters, outside loop
j:  4
cluster_indices:  [0 1 2 3 9 8 7 6 5 4]
c:  [0.90675302 0.47672382 0.0469308  0.0207867  0.00389932]
c[j]:  0.003899315241586314
x: 0.03978030871681798
beta_tilde:  0.03978030871681798 , beta_tilde_old: 0.0038103252608220756
ind_old:  4
ind_new:  4 , ind_new_old:  4

Looks like the index of the new cluster returned from the thresholder is wrong. beta_tilde also disagrees between the new and old version.

I'll look into it. Thanks for spotting it.

jolars commented 2 years ago

Alright, I think I have this sorted out now.

image

This had me stumped for so long that I finally rewrote the function from scratch. It is a little more verbose than our previous version but should be as fast (or even a bit faster since we avoid forming a vector of indexes which we did previously and also because we avoid checking the zero cluster each time, i.e. summing over all lambdas for the zero cluster, which I realize can actually be expensive if the zero cluster is large).

Let me know if you think this looks fine and I'll clean it up and replace our version.

Klopfe commented 2 years ago

Looks great to me! Let's keep the newest version and clean up the files for linter tests. Do you want me to do that @jolars? Thanks a lot @jolars!