Open aiwen324 opened 4 years ago
The f is used when determining the step size in the line search. The heuristic we're using here is that we want each gradient descent step to decrease f, so we use line search to make sure that we're actually doing this, while also avoiding unnecessarily small steps that will make us take a long time to reach a local minimum.
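To make the role of f concrete, here is a minimal sketch of one gradient descent step with a backtracking line search. The thread does not say which line search is used, so the Armijo sufficient-decrease test, the function name `backtracking_step`, and the parameters `t0`, `beta`, and `c` are all assumptions made for illustration.

```python
def backtracking_step(f, grad_f, z, t0=1.0, beta=0.5, c=1e-4):
    """One gradient step on z (a list of floats) with backtracking.

    Hypothetical helper: starts from step size t0 and shrinks it by
    beta until f decreases by at least c * t * ||grad||^2 (Armijo).
    Returns (new_z, step_size).
    """
    g = grad_f(z)
    g_sq = sum(gi * gi for gi in g)
    fz = f(z)
    t = t0
    while t > 1e-12:
        candidate = [zi - t * gi for zi, gi in zip(z, g)]
        # Accept the largest tried step that still decreases f enough.
        if f(candidate) <= fz - c * t * g_sq:
            return candidate, t
        t *= beta
    # No acceptable step found; stay at the current point.
    return list(z), 0.0


# Example objective (an assumption): an elongated quadratic bowl.
def f(z):
    return z[0] ** 2 + 10.0 * z[1] ** 2


def grad_f(z):
    return [2.0 * z[0], 20.0 * z[1]]


z = [3.0, -2.0]
new_z, t = backtracking_step(f, grad_f, z)
```

Starting from a large `t0` and shrinking only as needed is what avoids the unnecessarily small steps, while the acceptance test is what guarantees that f goes down.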
Do we have to find the smallest value? Say we achieve the minimum at the 50th iteration: should we then assign the z-value from the 50th iteration instead of the z from the last iteration?
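The idea in the question, keeping the z with the lowest f seen so far rather than the last z, could look roughly like the sketch below. It uses a fixed step size instead of a line search, and `descend_keep_best` and its arguments are hypothetical names, not code from this project.

```python
def descend_keep_best(f, grad_f, z0, step, num_iters):
    """Fixed-step gradient descent that remembers the best iterate.

    Hypothetical helper. Returns (best_z, best_f, last_z).
    """
    z = list(z0)
    best_z, best_f = list(z), f(z)
    for _ in range(num_iters):
        g = grad_f(z)
        z = [zi - step * gi for zi, gi in zip(z, g)]
        fz = f(z)
        # Only overwrite the stored point when f actually improved.
        if fz < best_f:
            best_z, best_f = list(z), fz
    return best_z, best_f, z


# Example (an assumption): a step size that is too large for f(z) = z^2,
# so the iterates overshoot and f grows from one iteration to the next.
def f(z):
    return z[0] ** 2


def grad_f(z):
    return [2.0 * z[0]]


best_z, best_f, last_z = descend_keep_best(f, grad_f, [1.0], step=1.1, num_iters=5)
```

If every accepted step is required to decrease f, as with a line search that enforces a decrease, the last iterate is already the best one and this bookkeeping changes nothing. It only matters when some steps are allowed to increase f.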