Exercise #2: convergence criterion and scaling factor

uwhpsc-2016 / homework1

Homework #1

1 stars 1 forks source link

Exercise #2: convergence criterion and scaling factor #26

Open chengdang opened 8 years ago

chengdang commented 8 years ago

I have two questions regarding Exercise #2: (1) I guess I'm puzzled by the convergence criterion given in Exercise #2, step 4: .... While abs(x_k+1 - x_k) > epsilon ..... Shouldn this be: abs ( f(x_k+1) - f(x_k) ) > epsilon? I assume it is derived from the mathematical request of df/dx = 0, in practice, abs(df) <= epsilon

(2) The scaling factor, should be the 'step' size that we chose to modify x_k in each step. Ideally it needs to be small enough, but why should it be smaller than 1, not 0.5 or 2, in this specific case? per request in test 4.

Thank you!

alyfarahat commented 8 years ago

As given in the homework statement, I believe that the algorithm checks for the convergence of the map x(k+1) = x(k) - sigma f'(x(k)) to a fixed-point. The fixed-point is detected through the condition given in the problem statement. Checking that f(x_{k+1}) and f(x_k) are close will work only if xk and x{k+1} are close enough; which is actually what we are checking for.

cswiercz commented 8 years ago

To add to @alyfarahat , if the function f is continuous (which we can assume for this problem) then | xkp1 - xk | < delta implies that | f(xkp1) - f(xk) | < epsilon for an appropriately chosen epsilon. Further, it delta is sufficiently small then epsilon is small. (Calculus.)

We test for the former instead of the latter because of the following: what if it is really expensive to simply evaluate the functionf? We would want to keep the number of function calls small.

chengdang commented 8 years ago

Thank you Chris and alyfarahat. In this case, it seems we don't need to include function f as an input for gradient descent? For my second question, why did we pick 1 as the upper limit of step size sigma? unless it is just a random number we pick to practice ValueError. Thank you!

quantheory commented 8 years ago

For sigma, you typically want to pick a "small" value, but what that means depends on the problem; I believe that you're correct that 1 is a somewhat arbitrary limit for the purposes of this problem. Note for instance that if x and f(x) have physical units, sigma is not dimensionless, so an appropriate value of sigma must depend on the relative magnitude of f, f', and x given the choice of units for a particular problem.