I would think of the index function as mapping the projected data to a scalar value: while we optimise over different projections A, we also need the data X to compute an index value. The notation could make this dependence on both X and A explicit.
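One possible way to write this is sketched below. The symbols f, n, p and d are my own assumptions, not necessarily the manuscript's notation: X is the n x p data matrix, A is a p x d orthonormal projection basis, and f is the index function evaluated on the projected data XA.

```latex
\[
  f : \mathbb{R}^{n \times d} \to \mathbb{R}, \qquad
  \hat{A} \;=\; \arg\max_{A \in \mathbb{R}^{p \times d}} f(XA)
  \quad \text{subject to} \quad A^{\top} A = I_d .
\]
```

Written this way, it is clear that the optimisation variable is A, while X enters only through the projection XA.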
The introduction should make the connection to DFO, use some of the language introduced in Sec 2, and make clear what is special about projection pursuit (i.e. this is optimisation that is often done for the purpose of visualisation).
You could be clearer when introducing the tour in general, and then present the guided tour as a particular example, explaining how it is connected with projection pursuit.
Instead of "outputted", maybe "accepted" is a better term for the selected target bases?
Eq (2): I would write this with a Kronecker delta
More than three algorithms have been used with projection pursuit; you are describing only the ones available in the tourr package, and the text should say so.
When introducing the algorithms, I think you can shorten the main text where it repeats information given in the algorithm blocks, and instead focus on the concepts that distinguish them.
Why do you introduce "c" in Algorithm 1 but not use it?
You should mention that simulated annealing can jump out of local maxima, because it occasionally accepts a candidate with a lower index value.
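To illustrate the mechanism I have in mind, here is a minimal sketch of an acceptance rule for a maximisation problem. It uses the textbook Metropolis criterion, which is an assumption on my part and not necessarily the rule used in the manuscript or in tourr; the function and argument names are hypothetical.

```python
import math
import random


def acceptance_probability(current_index, candidate_index, temperature):
    """Probability of accepting a candidate basis (illustrative Metropolis rule).

    Assumes we are maximising the index value and that temperature > 0.
    """
    if candidate_index >= current_index:
        # An equal or better candidate is always accepted.
        return 1.0
    # A worse candidate is accepted with a probability that shrinks as the
    # drop in index value grows or as the temperature decreases.
    return math.exp((candidate_index - current_index) / temperature)


def accept(current_index, candidate_index, temperature, rng=random):
    """Draw a uniform number in [0, 1) and compare it with the acceptance probability."""
    return rng.random() < acceptance_probability(
        current_index, candidate_index, temperature
    )
```

Because a worse candidate still has a non-zero chance of being accepted, the search can leave a local maximum; as the temperature is lowered, this chance shrinks and the search behaves increasingly like a pure ascent.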
The final paragraph (page 8) could refer to the illustration in Fig 1, and maybe repeat the stopping criterion