The problem occurs when max_iter is very large, e.g. 50000:
1) The outer FISTA max_iter is 50000.
2) The outer FISTA sets the max_iter of the inner FISTA to the same value, "max_iter" = 50000.
3) According to the theory, the eps of the inner loop should decrease with the iteration i of the outer loop (eps = 1.0 / (float(i) ** (4.0 + consts.FLOAT_EPSILON))).
The inner loop, however, always hits max_iter = 50000, which makes it VERY slow (hours) for a 50 x 50 x 1 x 300 (samples) dataset.
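To give a feel for how quickly the inner tolerance shrinks, here is a minimal sketch of the schedule from point 3. It assumes that consts.FLOAT_EPSILON is the float64 machine epsilon (I use sys.float_info.epsilon as a stand-in), and inner_eps is a hypothetical helper name, not a function in the code base.

```python
import sys

# Assumed stand-in for consts.FLOAT_EPSILON (float64 machine epsilon).
FLOAT_EPSILON = sys.float_info.epsilon


def inner_eps(i):
    """Inner-loop tolerance requested at outer iteration i (1-based)."""
    return 1.0 / (float(i) ** (4.0 + FLOAT_EPSILON))


if __name__ == "__main__":
    # Print the requested tolerance at a few outer iterations.
    for i in (1, 10, 100, 1000, 10000):
        print(i, inner_eps(i))
```

Under that assumption the requested tolerance is already about 1e-4 at i = 10 and about 1e-12 at i = 1000, which may explain why the inner loop keeps running until max_iter.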
Tommy, Fouad, what should we do?
1) Limit the number of iterations in the inner loop? This breaks the theory.
2) Limit eps in the inner loop, which also breaks the theory. If we choose solution 2, I suggest limiting eps to a relevant numerical precision: consts.FLOAT_EPSILON,
so NesterovFunction.prox(..., eps) would start with something like:
eps = max(consts.FLOAT_EPSILON, eps)
Do you have any other ideas? We will have to justify that choice in the paper. I think solution 2,
eps = max(consts.FLOAT_EPSILON, eps), is easy to defend, since it is useless to dig further than consts.FLOAT_EPSILON.
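As a rough check of what solution 2 would change, the sketch below applies the proposed floor to the tolerance schedule and finds the first outer iteration at which the floor actually takes effect. It assumes consts.FLOAT_EPSILON is the float64 machine epsilon and that the exponent in the schedule is 4.0 + consts.FLOAT_EPSILON; clamp_eps, inner_eps and first_clamped_iteration are hypothetical names used only for illustration.

```python
import sys

# Assumed stand-in for consts.FLOAT_EPSILON (float64 machine epsilon).
FLOAT_EPSILON = sys.float_info.epsilon


def inner_eps(i):
    """Tolerance requested by the outer loop at iteration i (1-based)."""
    return 1.0 / (float(i) ** (4.0 + FLOAT_EPSILON))


def clamp_eps(eps, floor=FLOAT_EPSILON):
    """Solution 2: never ask the inner loop for more than `floor` precision."""
    return max(floor, eps)


def first_clamped_iteration(max_iter=50000):
    """First outer iteration whose requested eps falls below the floor."""
    for i in range(1, max_iter + 1):
        if inner_eps(i) < FLOAT_EPSILON:
            return i
    return None


if __name__ == "__main__":
    print("floor first active at outer iteration", first_clamped_iteration())
    print("eps used at i = 20000:", clamp_eps(inner_eps(20000)))
```

With these assumptions the floor only becomes active a little after 8000 outer iterations, so it would bound the cost of the late iterations but probably not the earlier ones where the inner loop already reaches max_iter.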