Open · ksehic opened this issue 3 years ago
Hi @QB3,

I was running Sparse-HO on rcv1_train with gradient descent and hit a very strange bug. The first iteration is fine, but then the gradients blow up and the run fails with "ValueError: 0 or negative weights are not supported." It is alpha index 71 that triggers it, and the ValueError is raised inside the celer_path function.

This is the code that you can use to rerun it:
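A minimal sketch of the setup (not the original script: the module paths, the fetch_libsvm dataset key, the Monitor helper and the grad_search signature are my assumptions about the sparse-ho API at the time):

    import numpy as np
    from libsvmdata import fetch_libsvm

    from sparse_ho import ImplicitForward
    from sparse_ho.models import Lasso
    from sparse_ho.criterion import HeldOutMSE
    from sparse_ho.optimizers import GradientDescent
    from sparse_ho.ho import grad_search
    from sparse_ho.utils import Monitor

    # dataset name taken from the logs; the exact fetch_libsvm key may differ
    X, y = fetch_libsvm("rcv1_train")
    n_samples = X.shape[0]
    idx_train = np.arange(0, n_samples // 2)
    idx_val = np.arange(n_samples // 2, n_samples)

    # usual Lasso starting point: a fraction of alpha_max
    alpha_max = np.max(np.abs(X[idx_train].T @ y[idx_train])) / len(idx_train)
    alpha0 = alpha_max / 10
    tol = 1e-5

    model = Lasso()
    criterion = HeldOutMSE(idx_train, idx_val)
    algo = ImplicitForward()
    optimizer = GradientDescent(n_outer=5, p_grad_norm=1.0, verbose=True, tol=tol)
    monitor = Monitor()
    grad_search(algo, criterion, model, optimizer, X, y, alpha0, monitor)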
I was playing with p_grad_norm: for a very small value such as 0.01 the ValueError does not occur, but the gradient norm still increases instead of converging to zero.

Dataset: rcv1_train
Iteration 1/5 || Value outer criterion: 2.16e-01 || norm grad 1.30e-01
Iteration 2/5 || Value outer criterion: 2.16e-01 || norm grad 1.42e-01
Iteration 3/5 || Value outer criterion: 2.16e-01 || norm grad 2.25e-01
Iteration 4/5 || Value outer criterion: 2.16e-01 || norm grad 3.93e-01
Iteration 5/5 || Value outer criterion: 2.16e-01 || norm grad 6.60e-01

I tried the same case with your other gradient-based optimizers, Adam and line search.

optimizer = Adam(n_outer=5, lr=0.11, verbose=True, tol=tol)

We see the same strange behavior with Adam: the gradient norm increases with each iteration and eventually explodes.

Dataset: rcv1_train
Iteration 1/5 || Value outer criterion: 2.16e-01 || norm grad 1.30e-01
Iteration 2/5 || Value outer criterion: 2.12e-01 || norm grad 1.87e-01
Iteration 3/5 || Value outer criterion: 2.08e-01 || norm grad 3.62e+28
Iteration 4/5 || Value outer criterion: 2.08e-01 || norm grad 1.72e+56

optimizer = LineSearch(n_outer=5, verbose=True, tol=tol)

Line search seems stable, but the outer criterion jumps around from one iteration to the next.

Dataset: rcv1_train
Iteration 1/5 || Value outer criterion: 2.16e-01 || norm grad 1.30e-01
Iteration 2/5 || Value outer criterion: 2.18e-01 || norm grad 1.22e-01
Iteration 3/5 || Value outer criterion: 2.17e-01 || norm grad 1.25e-01
Iteration 4/5 || Value outer criterion: 2.18e-01 || norm grad 1.36e-01
Iteration 5/5 || Value outer criterion: 2.22e-01 || norm grad 1.44e-01
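For reference, the three optimizer configurations side by side (the Adam and LineSearch lines are quoted above; the GradientDescent line and the import path are my assumptions):

    from sparse_ho.optimizers import GradientDescent, Adam, LineSearch

    tol = 1e-5
    # gradient descent with a very small p_grad_norm: no ValueError,
    # but the gradient norm still grows
    optimizer = GradientDescent(n_outer=5, p_grad_norm=0.01, verbose=True, tol=tol)
    # Adam: gradient norm blows up (3.62e+28 by iteration 3)
    optimizer = Adam(n_outer=5, lr=0.11, verbose=True, tol=tol)
    # line search: no blow-up, but the outer criterion jumps between iterations
    optimizer = LineSearch(n_outer=5, verbose=True, tol=tol)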