On a related note, I'm skeptical that we don't need to use the coil current constraints for finite-beta equilibria. Using the final W7-X solution in the `coil_stage_two_optimization.ipynb` tutorial, for example:
```python
import numpy as np
from scipy.constants import mu_0
from desc.grid import LinearGrid  # eq and optimized_coilset2 come from the tutorial

grid = LinearGrid(rho=np.array([1.0]), M=eq.M_grid, N=eq.N_grid, NFP=eq.NFP, sym=eq.sym)
linking_current = 2 * np.pi * eq.compute("G", grid=grid)["G"][0] / mu_0  # Ampere's law: I = 2*pi*G/mu_0
print(linking_current)  # -8.2e7 A
print(sum(optimized_coilset2.coils[1].current))  # -3.0e5 A
```
This coil set has relatively low field errors, and the net coil current is still two orders of magnitude off from the required linking current. Maybe if the $B\cdot\hat{n}$ errors were perfectly zero it would work, but it appears that in practice the linking current constraint is not implicitly satisfied.
not really sure what you mean by

> `factorize_linear_constraints` normalizes the optimization variables `x_reduced` to order unity,
`factorize_linear_constraints` uses the "scaled" versions of the objectives to compute the linear system of equations:

```python
A = constraint.jac_scaled(x0)
b = -constraint.compute_scaled_error(x0)
```
For example, say we have two coils, each with a current of 1e6 Amps, and we want to constrain the sum of their currents:
```python
from desc.coils import CoilSet, FourierPlanarCoil
from desc.objectives import CoilLength, FixSumCoilCurrent, ObjectiveFunction
from desc.objectives.utils import factorize_linear_constraints

coils = CoilSet.linspaced_angular(FourierPlanarCoil(1e6), n=2)
objective = ObjectiveFunction(CoilLength(coils))
constraint = ObjectiveFunction(FixSumCoilCurrent(coils))
objective.build()
constraint.build()
xp, A, b, Z, unfixed_idx, project, recover = factorize_linear_constraints(objective, constraint)
```
This yields the following:

```
xp = [1e6, 1e6]
A = [1e-6, 1e-6]
b = [2]
Z = [-0.707; 0.707]
```
Then the initial optimization variables in this case will be `x_reduced = Z.T @ (x - xp) = [0, 0]`, and they are "normalized" in the sense that they are small numbers instead of order 1e6. This is problematic since the optimizer will then take step sizes relative to the scale of `x_reduced`, but that yields changes in the currents on the order of Amps when we want changes on the order of MA.

The "real" parameters are recovered as `x = xp + Z @ x_reduced`. The null space matrix `Z` is always order unity by construction, so the only way for the optimizer to take large step sizes relative to the true scale of the variables is if the optimization variables `x_reduced` have the same order of magnitude as the real parameters `x`. In this example, that means we either need `x_reduced` to be order 1e6, or we can normalize the units of the real parameters `x`.
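To see the scale problem concretely, here is a minimal standalone sketch using the numbers above (plain numpy, not the DESC API):

```python
import numpy as np

xp = np.array([1e6, 1e6])          # particular solution, order 1e6 Amps
Z = np.array([[-0.707], [0.707]])  # null space basis, order unity
x_reduced = Z.T @ (xp - xp)        # initial reduced variables = [0.]

# an optimizer sizes its steps relative to x_reduced, so a "large" step is O(1)
step = np.array([1.0])
x_new = xp + Z @ (x_reduced + step)
print(x_new - xp)  # [-0.707  0.707] -> currents change by less than 1 Amp
```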
I think this might be as simple as redefining things as

```
x = xp + Z @ D @ x_reduced
x_reduced = D^-1 @ Z.T @ (x - xp)
```

or

```
x = xp + D @ Z @ x_reduced
x_reduced = Z.T @ D^-1 @ (x - xp)
```

where `D` is some diagonal scaling matrix. In the first case `D` corresponds to the reduced variables; in the second case, to the full variables. You can sort of convert between them like we do in the perturbations: https://github.com/PlasmaControl/DESC/blob/5b245a477df1caaacf6eb06212f4d7f710ea56e2/desc/perturbations.py#L285-L287
I may have found an additional wrinkle to this problem, or else I'm not understanding something.
As the simplest possible example, say we have a `CoilSet` with 2 coils of type `FourierRZCoil` with resolution n=0. So each coil only has 3 optimizable parameters: `R_n`, `Z_n`, and `current` (all scalars in this example). The full state vector for the coil set is then 6 parameters in the following order:

```
x = [R1_n, Z1_n, current1, R2_n, Z2_n, current2]
```
The linear constraint matrix corresponding to the `FixSumCoilCurrent` objective is simply a single row such that `A @ x` represents `current1 + current2`:

```python
A = np.array([[0, 0, 1., 0, 0, 1.]])
```
Then the null space matrix that determines the optimization variables is the following matrix of shape (6, 5):

```python
Ainv, Z = svd_inv_null(A)
print(Z)
```
```
[[ 0.         -0.70710678  0.          0.         -0.70710678]
 [ 1.          0.          0.          0.          0.        ]
 [ 0.          0.5         0.          0.         -0.5       ]
 [ 0.          0.          1.          0.          0.        ]
 [ 0.          0.          0.          1.          0.        ]
 [ 0.         -0.5         0.          0.          0.5       ]]
```
It is apparent that the coil currents `current1` and `current2` are functions of the optimization variables `x_reduced[1]` and `x_reduced[4]`. This is odd, because with 2 degrees of freedom and 1 linear constraint, they should be functions of a single variable. Even more strange, the parameter `R1_n` is also a function of these same two optimization variables, when it should have no relation to the coil currents.
Interestingly, if we were to redefine the order of the full state vector such that the coil current is listed first, we get the following result:

```python
# x = [current1, R1_n, Z1_n, current2, R2_n, Z2_n]
A = np.array([[1., 0, 0, 1., 0, 0]])
Ainv, Z = svd_inv_null(A)
print(Z)
```
```
[[ 0.          0.         -0.70710678  0.          0.        ]
 [ 1.          0.          0.          0.          0.        ]
 [ 0.          1.          0.          0.          0.        ]
 [ 0.          0.          0.70710678  0.          0.        ]
 [ 0.          0.          0.          1.          0.        ]
 [ 0.          0.          0.          0.          1.        ]]
```
This makes much more sense: the two coil currents are functions of a single optimization variable, and all of the curve parameters are independent variables.
Why does the order of our state vector matter, and how can we consistently get the behavior we want? Is this an artifact of the SVD algorithm? Is there a solution besides changing the hard-coded order of the optimizable parameters? Please help.
In general the SVD is unique up to a permutation of the columns of $U$ and $V$. If we enforce that the singular values are ordered in decreasing magnitude, this gives a unique decomposition as long as there are no repeated singular values. However, in this case we expect there to be duplicates (i.e., there are 5 singular values at 0, because the null space is 5D). In that case it's not unique: we're effectively free to rotate the subspace spanned by the repeated singular vectors any way we like, so if we permute the columns of $A$, the resulting $Z$ will generally not just be a simple permutation of the original $Z$ (though there is some orthogonal matrix that connects the two).
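This is easy to verify numerically; the `null_space` helper below is a standalone sketch of the SVD construction (not DESC's `svd_inv_null`):

```python
import numpy as np

def null_space(A, tol=1e-12):
    # columns of V corresponding to (effectively) zero singular values
    _, s, Vt = np.linalg.svd(A, full_matrices=True)
    return Vt[np.sum(s > tol):].T

A1 = np.array([[0, 0, 1., 0, 0, 1.]])  # x = [R1_n, Z1_n, current1, R2_n, Z2_n, current2]
A2 = np.array([[1., 0, 0, 1., 0, 0]])  # x = [current1, R1_n, Z1_n, current2, R2_n, Z2_n]
Z1, Z2 = null_space(A1), null_space(A2)

P = np.eye(6)[[2, 0, 1, 5, 3, 4]]  # permutation relating the orderings: x2 = P @ x1
R = Z1.T @ P.T @ Z2                # orthogonal matrix connecting the two bases
print(np.allclose(P.T @ Z2, Z1 @ R))    # True: same null space, rotated basis
print(np.allclose(R.T @ R, np.eye(5)))  # True: R is orthogonal
```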
That said, can't we just apply a scaling to the full state vector so that the order of the reduced vector doesn't matter (assuming we scale everything to O(1))?
OK, that probably explains it. But I still don't think this can be completely solved by scaling the variables, because in my example above the original ordering of the state vector results in an unphysical optimization vector `x_reduced` even though the null space matrix is mathematically correct. I mean "unphysical" in the sense that the coil currents were not independent of the geometric curve parameters.
Also, I can't figure out how to weight the full state vector instead of the reduced variables, because `A @ D @ Z = 0` is not generally true, while `A @ Z @ D = 0` will always be satisfied. So in your previous post, I needed to go with the first option rather than the second, with `D` corresponding to `x_reduced`. But if we can rescale the full state vector to O(1), then yes, I think that would help solve this problem.
Redefine `x` as `D @ x`, and then compute `Z` as the null space of `A @ D` instead of `A`.
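A minimal sketch of that suggestion, with hypothetical O(1e6) current scales on the diagonal of `D`:

```python
import numpy as np

# x = [R1_n, Z1_n, current1, R2_n, Z2_n, current2]; currents are order 1e6
D = np.diag([1., 1., 1e6, 1., 1., 1e6])
A = np.array([[0, 0, 1., 0, 0, 1.]])

# null space of the scaled constraint matrix A @ D
_, s, Vt = np.linalg.svd(A @ D, full_matrices=True)
Z = Vt[np.sum(s > 1e-12):].T

# optimize the scaled variables, then recover x = D @ (xp_scaled + Z @ x_reduced);
# the constraint stays satisfied because (A @ D) @ Z = 0 by construction
print(np.allclose((A @ D) @ Z, 0))  # True
```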
> Why does the order of our state vector matter, and how can we consistently get the behavior we want?
Maybe this is different from your original scaling problem, but can we add logic to do what you want, like this:
```python
import numpy as np
from desc.utils import svd_inv_null

A = np.array([[0, 0, 1., 0, 0, 1.]])

# This is for the general loop; for this particular case we could just use A for v0.
# Find a row of A which has two non-zero elements
# (we need to change this to find two ones in a row).
v0 = A[np.where(np.count_nonzero(A, axis=1) == 2)[0][0]].copy()  # copy to avoid mutating A
idx_second1 = np.where(v0 == 1)[0][1]
# set the second 1 to -1 to satisfy A @ v0 = 0
v0[idx_second1] = -1
# normalize v0
v0 = v0 / np.linalg.norm(v0)
# delete the column of A which had the second 1
A = np.delete(A, idx_second1, axis=1)
Ainv, Z = svd_inv_null(A)
# add a zero row to Z to make it compatible with the original A
# (appending at the end works here because the deleted column was the last one)
Z = np.vstack((Z, np.zeros((1, Z.shape[1]))))
Z = np.hstack((Z, v0[:, np.newaxis]))
print(Z)
```
So, in `factorize_linear_constraints` we take the fixed parameters out manually, and this code is trying to find one basis vector of the null space manually. I think we can even generalize it to rows with multiple ones (see the sketch below). I was also looking at the `factorize_linear_constraints` function to make the Poincare BC branch more stable. If you guys want, I can work on something like this. Though isn't this basically what `svd_inv_null` is doing in the background?
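For what it's worth, here is one way that generalization might look; `sparse_row_null_basis` is a hypothetical helper that assumes the row entries are all ones, as in `FixSumCoilCurrent`:

```python
import numpy as np

def sparse_row_null_basis(row):
    # orthonormal null-space basis for a single row of ones, embedded so the
    # coupled parameters only mix among themselves (no rotation into others)
    idx = np.flatnonzero(row)
    k = len(idx)
    Q, _ = np.linalg.qr(np.ones((k, 1)), mode="complete")
    Z = np.zeros((row.size, k - 1))
    Z[idx] = Q[:, 1:]  # k x (k-1) block orthogonal to the all-ones direction
    return Z

row = np.array([0, 0, 1., 0, 0, 1.])
Z = sparse_row_null_basis(row)
print(np.allclose(row @ Z, 0))  # True
```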
@dpanici has mentioned this issue offline, but I want to document it:
The linear constraint `FixSumCoilCurrent` does what it is intended to do, but when used in an optimization it does not allow the coil currents to change much relative to their nominal values on the order of 1e6 Amps. I have noticed that using the nonlinear constraint `ToroidalFlux` allows the coil currents to change much more and gives better results in practice.

When the linear constraint is included, `factorize_linear_constraints` normalizes the optimization variables `x_reduced` to order unity, while the particular solution `xp` is order 1e6 so that the full state vector `x_full = xp + Z @ x_reduced` has the proper magnitude in Amperes. The null space matrix `Z` is also order unity, so even large changes in the optimization space result in very small changes to the true coil currents.

A hacky solution would be to redefine the units of the coil currents to be in MA instead of Amps. Alternatively, it would be nice if the rows of `Z` were scaled proportionally to `xp`, such that relative changes in the optimization variables `x_reduced` yield similar relative changes in the full state vector `x_full`.
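A minimal sketch of that row-scaling idea, assuming weights proportional to `|xp|` (and noting the feasibility caveat raised in the comments above):

```python
import numpy as np

xp = np.array([1e6, 1e6])
Z = np.array([[-np.sqrt(0.5)], [np.sqrt(0.5)]])

W = np.diag(np.maximum(np.abs(xp), 1.0))  # row weights ~ |xp|
Zw = W @ Z                                # rows of Z scaled proportionally to xp

# caveat: A @ Zw = 0 is not guaranteed for general A (the A @ D @ Z issue above)
x_full = xp + Zw @ np.array([1.0])
print((x_full - xp) / xp)  # relative change of order unity per unit step
```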