colah closed this 6 years ago
Not super happy with this, but it's a start.
It's a really nice starting point IMHO :) I think we need a more concrete explanation with references to back it up. I'll iterate on it on Monday.
I changed the text quite a bit. I tried to be a bit more pedantic and added a bit more explanation, which should help the more casual reader disentangle what we propose.
Here are a few points: 1) "it's the same optimization problem" -> I'd say that the problem is different, but the objective we optimize is the same.
2) "can have such dramatic effects" -> I think the reader is not ready for this at this point, since we haven't introduced any results yet.
3) I think that the linear/non-linear example in Improved optimization can be misleading, since non-linear preconditioners exist. I tried to keep it general -> preconditioning is used to optimize a better-conditioned problem -> we can precondition simply by changing to a different parameterization.
4) I avoided making forward references to the sections. I'd rather give an overview of the reasons here and link back to it from the sections.
5) Point 2 is not clear to me. What do we mean by the computational complexity of the image?
6) Added https://arxiv.org/abs/1412.0233 to improve point 3, which is now rephrased in terms of the optimization landscape.
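To make points 1 and 3 a bit more concrete, here is a minimal sketch of what I have in mind, not something taken from the draft. It assumes a toy quadratic objective and a hypothetical linear rescaling `to_x` (the names `SCALE`, `to_x` and `descend` are made up for illustration); a linear map is used only because it is the simplest case, not because preconditioners have to be linear. The objective is the same in both runs, but gradient descent in the new parameters `y` solves a better-conditioned problem.

```python
def objective(x):
    # Ill-conditioned quadratic: curvature 1 along x[0], 100 along x[1].
    return 0.5 * (x[0] ** 2 + 100.0 * x[1] ** 2)

def grad_objective(x):
    # Gradient of the objective with respect to x.
    return [x[0], 100.0 * x[1]]

# Hypothetical reparameterization x = to_x(y): rescale the second coordinate.
SCALE = [1.0, 0.1]

def to_x(y):
    return [s * yi for s, yi in zip(SCALE, y)]

def grad_wrt_y(y):
    # Chain rule: d objective(to_x(y)) / dy_i = SCALE[i] * d objective / dx_i.
    gx = grad_objective(to_x(y))
    return [s * g for s, g in zip(SCALE, gx)]

def descend(grad, start, lr, steps):
    # Plain gradient descent with a fixed step size.
    p = list(start)
    for _ in range(steps):
        g = grad(p)
        p = [pi - lr * gi for pi, gi in zip(p, g)]
    return p

x_start = [1.0, 1.0]
y_start = [1.0, 10.0]  # chosen so that to_x(y_start) is the same point as x_start

steps = 50
# In x, the step size must stay below 2/100 to avoid divergence along x[1].
x_final = descend(grad_objective, x_start, lr=0.019, steps=steps)
# In y, both curvatures are 1, so a much larger step size is stable.
y_final = descend(grad_wrt_y, y_start, lr=1.0, steps=steps)

loss_x = objective(x_final)
loss_y = objective(to_x(y_final))
print("loss after descent in x:", loss_x)
print("loss after descent in y:", loss_y)
```

Both runs are scored with the same `objective`; only the variables we take gradients with respect to differ, which is the sense in which the problem changes while the objective does not.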
TODO: once we have converged on the text and definitions, it would be good to refer back to these points in the descriptions of the subsections (e.g. aligned interpolation is an example of a parameterization that changes the basins of attraction).