RussTedrake / manipulation

Course notes for MIT manipulation class
BSD 3-Clause "New" or "Revised" License

Fix Exercise 11.1 #231

Open hjsuh94 opened 1 year ago

hjsuh94 commented 1 year ago

The stochastic approximation has

x <- x - eta * [ l(x + w) - l(x) ] w

but for w ~ N(0, sigma^2 I), the unbiased estimate of the gradient of the smoothed objective should be

x <- x - eta * [ l(x + w) - l(x) ] w / sigma^2

so we seem to be off by a factor of sigma^2.
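The missing sigma^2 is easy to check numerically. Here is a quick Monte Carlo sketch (my own, not from the exercise) using a quadratic l(x) = x^2, whose gradient is 2x; the values of x, sigma, and the sample count are arbitrary choices for illustration:

```python
import numpy as np

rng = np.random.default_rng(0)

def l(x):
    # simple quadratic test objective; its gradient is 2x
    return x**2

x = 1.5       # evaluation point (arbitrary)
sigma = 0.5   # noise scale (arbitrary)
N = 2_000_000 # Monte Carlo samples
w = rng.normal(0.0, sigma, size=N)

# estimator as currently written in the notes: E[(l(x+w) - l(x)) w]
raw = np.mean((l(x + w) - l(x)) * w)
# with the sigma^2 correction
corrected = raw / sigma**2

print(raw)        # ≈ sigma^2 * 2x = 0.75, not the gradient
print(corrected)  # ≈ 2x = 3.0, the true gradient
```

For this quadratic, E[(l(x+w) - l(x)) w] = E[(2xw + w^2) w] = 2x sigma^2 exactly, which makes the scale discrepancy explicit.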

This exercise is also a bit misleading, since it gives the impression that we escaped local minima because we used a zeroth-order method, when in fact we could have achieved the same effect with a first-order method that injects the same stochasticity.
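The noise-injection point can also be checked numerically: for smooth l, E_w[dl/dx (x + w)] equals the gradient of the smoothed objective E_w[l(x + w)], which is exactly what the (corrected) zeroth-order estimator targets. A sketch on a nonconvex test function of my own choosing, with arbitrary x and sigma:

```python
import numpy as np

rng = np.random.default_rng(1)

def l(x):
    # nonconvex objective with many shallow local minima
    return x**2 + 0.5 * np.cos(10 * x)

def dl(x):
    # analytic gradient of l
    return 2 * x - 5 * np.sin(10 * x)

x, sigma, N = 1.0, 0.5, 2_000_000
w = rng.normal(0.0, sigma, size=N)

# first-order estimator with injected noise: E_w[dl(x + w)]
first_order = np.mean(dl(x + w))
# zeroth-order estimator with the sigma^2 correction
zeroth_order = np.mean((l(x + w) - l(x)) * w) / sigma**2

# both estimate the gradient of the same smoothed objective,
# in which the cos ripples are averaged away
print(first_order, zeroth_order)
```

Both estimators agree (up to Monte Carlo error) on the smoothed gradient, so the smoothing, not the zeroth-order-ness, is what washes out the local minima; the two estimators differ only in variance.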

I'd love to reimplement this problem using the various gradient estimators we've learned about over the past year.