Making transforms more efficient

Trying to coalesce a few things we've been discussing regarding making transforms more efficient.

1. Partial `basis.evaluate`

When calling basis.evaluate you should be able to specify which coordinates to actually evaluate. Default would be all of them, but evaluating only radial eg would make the next step easier. It would also modularize things better (right now we directly call zernike_radial several places where we really mean basis.evaluate but only in the r coordinate), making it easier to swap in different basis types (finite element, chebyshev, etc)

2. Partial `transform.transform`

Similarly, transform should allow you to transform only one or two coordinates at a time. For example, instead of lmn -> rtz you could do lmn -> rmn to only evaluate the radial part. This would mean breaking the transform matrices into individual parts, which we sort of already do because for FFTs we only use matrices for the radial/poloidal part. This should also make it easier to do poloidal FFTs (#641).

3. Allow you to promise "same nodes" when updating `transform.grid`

ie, transform.update_grid(new_grid, promise_same="r") you would be saying this grid is the same as the old one in the radial direction, so that you only need to recompute the poloidal and toroidal part of the transform matrices. So for example, in map_coordinates assuming rho is both an input and output coordinate:

transforms = get_transforms(names..., grid=initial_guess_grid)
rho = coords[:,0]

def rootfun(x):
    nodes = np.stack([rho, x]) # x is only theta/zeta
    new_grid = Grid(nodes)
    transforms.update_grid(new_grid, promise_same="r")
    data = _compute_fun(names, params, transforms, ...)

this is more along the lines of "partial evaluation" rather than "partial summation" but would likely see similar improvements (#1154)

4. Move logic from `transform._check_inputs_fft` to `Grid` and `Basis`

For most grid classes we don't really need to do these checks since we know by construction that certain things are/are not satisfied. This would simplify the transforms a bit and speed up construction a bit if we don't need to do a bunch of checks. (It would also be nice if we can get rid of some of the indexing operations there in favor of just reshaping, but im not sure how that plays with symmetry where certain modes don't exist)

5. Extend notion of "meshgrid" to partial meshgrids/bases

eg, ConcentricGrid has the same rho/theta at each zeta, so its sort of a tensor product grid (with some accompanying simplifications). Similarly, a FourierZer nikeBasis is a partial tensor product.

Yes I think using partial summation techniques would be significant. Some thoughts

The given example in 3 may require #1207 first, otherwise the nodes would change in the Newton iteration while the transform won't get updated.
For 1. 2. and 3., the partial evaluation steps would be a constant factor improvement. The complexity of evaluating a function over a 3D grid is still $\mathcal{O}(N^6)$. Probably still useful
On a tensor-product grid, you can proceed from partial evaluation to partial summation mentioned in #1154 to reduce to $\mathcal{O}(N^4)$.
Some optimization metrics are 2D objectives over flux surfaces that implicitly rely on the continuity of the physics to minimize the objective throughout the volume. For these the best node set is probably a tensor product, so using partial summation will reduce the computation cost from $\mathcal{O}(LN^4) \to \mathcal{O}(LN^3)$ where $L$ is number of flux surfaces and $N$ is maximum of the number of poloidal and toroidal surfaces.
Lower priority, but some multigrid optimization only need one grid/ the transforms for that grid if some of the objectives only need to target a subset of the flux surfaces of the full grid. Right now we recreate transforms for each objective instead of using the subset of the transform on the full grid.

PlasmaControl / DESC