Closed oskooi closed 3 months ago
Recall that our level set requires a filter step (or that you transform your grid into a signed distance function) in order to work properly. Then you must internally set β=∞.
I may have missed it, but it looks like you are simply populating the grid with a binary array.
See https://github.com/NanoComp/meep/discussions/2067 for a SDF.
Unfortunately, work on this issue is blocked by the bug in #2226 (comment) involving an incorrect mode (an oblique planewave) as computed by the Newton-method solver via MPB in fields::get_eigenmode
.
@oskooi check out this comment too when you are back to looking at this. It describes essentially what you want to do (as a basic example).
We are finally able to return to this issue with the use of #2285 and #2767. I have modified the example from #2054 (comment) of computing the adjoint gradient of a 1D grating to specify the grating structure as a level set using the signed-distance function from #2067 (map_to_signed_distance_function
in the script below). We are also using subpixel smoothing of the MaterialGrid
via do_averaging=True
and beta=np.inf
.
@smartalecH: to validate the adjoint gradient of the level set using the brute-force finite difference, should we just perturb one of the geometric parameters of the grating such as the height or duty cycle? This would be a more useful validation than simply applying a random perturbation to the entire design region as we have been doing in all of our previous examples involving a density-based representation of the design region.
from enum import Enum
from typing import Tuple
from autograd.extend import primitive, defvjp
from autograd import numpy as anp
from autograd import tensor_jacobian_product
import matplotlib.pyplot as plt
import meep as mp
import meep.adjoint as mpa
RESOLUTION_UM = 100
WAVELENGTH_UM = 0.5
GRATING_PERIOD_UM = 2.5
PADDING_UM = 3.0
Polarization = Enum("Polarization", "S P")
def mapping(
weights: anp.ndarray,
eta: float,
beta: float,
filter_radius_um: float,
design_region_size: mp.Vector3,
design_region_resolution: float
) -> anp.ndarray:
"""A differentiable mapping function for the design weights."""
# Note: weights should be a 2d array. If it is 1d, it will be reshaped
# to 2d by conic_filter. There is no need to reshape it here.
filtered_field = mpa.conic_filter(
weights,
filter_radius_um,
design_region_size.x,
design_region_size.y,
design_region_resolution,
)
if beta == 0:
return filtered_field.flatten()
else:
projected_field = mpa.smoothed_projection(
filtered_field,
beta,
eta,
design_region_resolution,
)
return projected_field.flatten()
def design_region_to_meshgrid(
design_region_size: mp.Vector3,
nx: int,
ny: int
) -> Tuple[anp.ndarray, anp.ndarray]:
"""Returns the meshgrid coordinate arrays for the design region."""
xcoord = anp.linspace(
-0.5*design_region_size.x,
+0.5*design_region_size.x,
nx
)
ycoord = anp.linspace(
-0.5*design_region_size.y,
+0.5*design_region_size.y,
ny
)
xv, yv = anp.meshgrid(xcoord, ycoord, indexing='ij')
return xv, yv
@primitive
def grating_levelset(
grating_height_um: float,
grating_duty_cycle: float,
design_region_size: mp.Vector3,
nx: int,
ny: int
) -> anp.ndarray:
"""Returns the weights for the grating as a 1D array."""
xv, yv = design_region_to_meshgrid(design_region_size, nx, ny)
weights = anp.where(
anp.abs(yv) <= 0.5 * grating_duty_cycle * GRATING_PERIOD_UM,
anp.where(
xv <= xv[0][0] + grating_height_um,
1,
0
),
0
)
return weights.flatten()
def grating_levelset_vjp(
ans,
grating_height_um: float,
grating_duty_cycle: float,
design_region_size: mp.Vector3,
nx: int,
ny: int
) -> anp.ndarray:
"""Returns the vector-Jacobian product."""
xv, yv = design_region_to_meshgrid(design_region_size, nx, ny)
# pixel dimensions
delta_x = design_region_size.x / (nx - 1)
delta_y = design_region_size.y / (ny - 1)
# Gradient of the level set with respect to the grating height.
# The gradient is obtained using a finite difference. The gradient is
# therefore a zero matrix with non-zero entries only at the locations
# of the height boundary. Since the height boundary is normal to the
# x axis, these non-zero matrix elements have a value of 1 / Δx.
weights = anp.where(
anp.abs(yv) <= 0.5 * grating_duty_cycle * GRATING_PERIOD_UM,
anp.where(
xv <= xv[0][0] + grating_height_um + 0.5 * delta_x,
anp.where(
xv >= xv[0][0] + grating_height_um - 0.5 * delta_x,
1 / delta_x,
0
),
0
),
0
)
jacobian = weights.flatten()
return lambda g: anp.tensordot(g, jacobian, axes=1)
def grating_1d(
pol: Polarization,
grating_height_um: float,
design_region_size: mp.Vector3,
nx: int,
ny: int
) -> mpa.OptimizationProblem:
"""Sets up the adjoint optimization of a 1D grating."""
frequency = 1 / WAVELENGTH_UM
substrate_um = 3.0
pml_um = 1.0
pml_layers = [mp.PML(direction=mp.X, thickness=pml_um)]
n_glass = 1.5
glass = mp.Medium(index=n_glass)
size_x_um = pml_um + substrate_um + grating_height_um + PADDING_UM + pml_um
size_y_um = GRATING_PERIOD_UM
cell_size = mp.Vector3(size_x_um, size_y_um, 0)
k_point = mp.Vector3()
if pol.name == "S":
eig_parity = mp.ODD_Z
src_cmpt = mp.Ez
else:
eig_parity = mp.EVEN_Z
src_cmpt = mp.Hz
src_pt = mp.Vector3(-0.5 * size_x_um + pml_um, 0, 0)
sources = [
mp.Source(
mp.GaussianSource(frequency, fwidth=0.1 * frequency),
component=src_cmpt,
center=src_pt,
size=mp.Vector3(0, size_y_um, 0),
)
]
matgrid = mp.MaterialGrid(
mp.Vector3(nx, ny),
mp.air,
glass,
weights=anp.ones((nx, ny)),
do_averaging=False,
)
matgrid_region = mpa.DesignRegion(
matgrid,
volume=mp.Volume(
center=mp.Vector3(
(-0.5 * size_x_um + pml_um + substrate_um +
0.5 * design_region_size.x),
0,
0
),
size=design_region_size,
),
)
geometry = [
mp.Block(
material=glass,
size=mp.Vector3(pml_um + substrate_um, mp.inf, mp.inf),
center=mp.Vector3(
-0.5 * size_x_um + 0.5 * (pml_um + substrate_um), 0, 0
),
),
mp.Block(
material=matgrid,
size=matgrid_region.size,
center=matgrid_region.center,
),
]
sim = mp.Simulation(
resolution=RESOLUTION_UM,
cell_size=cell_size,
boundary_layers=pml_layers,
k_point=k_point,
sources=sources,
geometry=geometry,
)
tran_pt = mp.Vector3(0.5 * size_x_um - pml_um, 0, 0)
order_y = 1
kdiff = mp.Vector3(
(frequency**2 - (order_y / GRATING_PERIOD_UM)**2)**0.5,
order_y / GRATING_PERIOD_UM,
0
)
print(f"kdiff = ({kdiff.x:.5f}, {kdiff.y:.5f}, {kdiff.z:.5f})")
obj_list = [
mpa.EigenmodeCoefficient(
sim,
mp.Volume(
center=tran_pt,
size=mp.Vector3(0, size_y_um, 0),
),
mode=1,
kpoint_func=lambda *not_used: kdiff,
eig_parity=eig_parity,
eig_vol=mp.Volume(
center=tran_pt,
size=mp.Vector3(0, 1 / RESOLUTION_UM, 0)
),
),
]
def J(tran_mon):
return anp.abs(tran_mon)**2
opt = mpa.OptimizationProblem(
simulation=sim,
objective_functions=J,
objective_arguments=obj_list,
design_regions=[matgrid_region],
frequencies=[frequency],
)
return opt
if __name__ == "__main__":
grating_height_um = 0.5
grating_duty_cycle = 0.5
height_perturbation_um = 1 / RESOLUTION_UM
design_region_size = mp.Vector3(
grating_height_um + PADDING_UM,
GRATING_PERIOD_UM,
0
)
design_region_resolution_um = int(2 * RESOLUTION_UM)
nx = int(design_region_size.x * design_region_resolution_um) + 1
ny = int(design_region_size.y * design_region_resolution_um) + 1
# Properties of the convolution filter and projection operator.
beta = 0.2
eta_intrinsic = 0.5
eta_erosion = 0.75
min_feature_size_um = 0.5
filter_radius_um = mpa.get_conic_radius_from_eta_e(
min_feature_size_um,
eta_erosion
)
opt = grating_1d(
Polarization.P,
grating_height_um,
design_region_size,
nx,
ny,
)
unperturbed_design_weights = grating_levelset(
grating_height_um,
grating_duty_cycle,
design_region_size,
nx,
ny,
)
mapped_design_weights = mapping(
unperturbed_design_weights,
eta_intrinsic,
beta,
filter_radius_um,
design_region_size,
design_region_resolution_um,
)
obj_value_unperturbed, grad_unperturbed = opt(
[mapped_design_weights],
need_gradient=True,
)
# Backpropagate the adjoint gradient through two functions in the correct order.
grad_backprop = tensor_jacobian_product(mapping, 0)(
mapped_design_weights,
eta_intrinsic,
beta,
filter_radius_um,
design_region_size,
design_region_resolution_um,
grad_unperturbed,
)
defvjp(grating_levelset, grating_levelset_vjp)
grad_backprop = tensor_jacobian_product(grating_levelset, 0)(
grating_height_um,
grating_duty_cycle,
design_region_size,
nx,
ny,
grad_backprop,
)
fig, ax = plt.subplots()
opt.plot2D(init_opt=False, ax=ax)
if mp.am_master():
fig.savefig(
'grating_1d_plot2D.png', dpi=150, bbox_inches='tight'
)
perturbed_design_weights = grating_levelset(
grating_height_um + height_perturbation_um,
grating_duty_cycle,
design_region_size,
nx,
ny,
)
mapped_design_weights = mapping(
perturbed_design_weights,
eta_intrinsic,
beta,
filter_radius_um,
design_region_size,
design_region_resolution_um,
)
obj_value_perturbed, _ = opt(
[mapped_design_weights],
need_gradient=False
)
adj_dirderiv = height_perturbation_um * grad_backprop
fnd_dirderiv = obj_value_perturbed[0] - obj_value_unperturbed[0]
rel_err = abs(adj_dirderiv - fnd_dirderiv) / fnd_dirderiv
print(
f"dirderiv:, {fnd_dirderiv} (finite difference), "
f"{adj_dirderiv} (adjoint), {rel_err:.6f} (error)"
)
We are also using subpixel smoothing of the MaterialGrid via do_averaging=True and beta=np.inf.
Frankly, I would use the smoothed projection feature we recently implemented in python (rather than what we tried to implement in c++). We know there are bugs in 3D, I imagine there are bugs in 2D too.
should we just perturb one of the geometric parameters of the grating such as the height or duty cycle?
Yes that should work fine. You may have to play with how big the step needs to be.
Frankly, I would use the smoothed projection feature we recently implemented in python
Why is it necessary to apply a smoothing projection to the design weights when they are already smoothed using the signed-distance function?
Separately, I have updated the script in my previous comment to compute the adjoint gradient of the diffraction efficiency with respect to the height of the grating. However, the adjoint gradient back propagated through the signed-distance function using Autograd's tensor_jacobian_product
is zero which is incorrect:
directional-deriv:, [-0.01232668] (finite difference), 0.0 (adjoint)
I suspect this is because the signed-distance function implemented in the function map_to_signed_distance_function
uses SciPy's skfmm.distance
function which is not differentiable using Autograd. Do we therefore need to implement our own signed-distance function using Autograd to make this work?
Why is it necessary to apply a smoothing projection to the design weights when they are already smoothed using the signed-distance function?
So keep in mind that the underlying framework using the material grid assumes a density based approach to TO. In order to turn it into a level set, you need to do subpixel smoothing, and our modified projection function lets us do that easily without leaving the realm of density-based TO.
If you already have a level-set function that does smoothing for you, and you can backprop through it, then you don't need any extra smoothing (and you should turn off both the smoothing and set beta=0 in the material grid).
The challenge is that we don't always have level set functions available to us. For trivial geometries, like squares or circles it's easy. But if you are trying to evolve something more difficult, (like starting with a simple waveguide crossing and doing shape optimization) then using the smoothed projection function makes this easy.
which is not differentiable using Autograd.
I would just write a custom level set for this. Rectangles are easy and you can do it in native autograd. Lots of resources online.
If you already have a level-set function that does smoothing for you, and you can backprop through it, then you don't need any extra smoothing (and you should turn off both the smoothing and set beta=0 in the material grid).
I have modified the script to disable the internal subpixel smoothing in MaterialGrid
and instead apply a filter and projection via conic_filter
and the new smoothing_projection
functions to the level set directly:
1. unperturbed grating
# Obtain the grating as a level set.
unperturbed_design_weights = grating_weights(
grating_height_um,
grating_duty_cycle,
design_region_size,
nx,
ny,
)
# Apply a filter and projection to the level set.
mapped_unperturbed_design_weights = mapping(
unperturbed_design_weights,
eta_intrinsic,
npa.inf,
filter_radius_um,
design_region_size,
design_region_resolution_um,
)
obj_value_unperturbed, grad_unperturbed = opt(
[mapped_unperturbed_design_weights],
need_gradient=True,
)
2. grating with perturbed height
# Obtain the grating as a level set.
height_perturbation_um = 1e-2
perturbed_design_weights = grating_weights(
grating_height_um + height_perturbation_um,
grating_duty_cycle,
design_region_size,
nx,
ny,
)
# Apply a filter and projection to the level set.
mapped_perturbed_design_weights = mapping(
perturbed_design_weights,
eta_intrinsic,
npa.inf,
filter_radius_um,
design_region_size,
design_region_resolution_um,
)
obj_value_perturbed, _ = opt(
[mapped_perturbed_design_weights],
need_gradient=False
)
Now it seems we just need to backpropagate twice: the first time through the mapping
function and then through the grating_weights
function:
def mapping(
weights: npa.ndarray,
eta: float,
beta: float,
filter_radius_um: float,
design_region_size: mp.Vector3,
design_region_resolution: float
) -> np.ndarray:
"""A differentiable mapping function for the design weights."""
filtered_field = mpa.conic_filter(
weights,
filter_radius_um,
design_region_size.x,
design_region_size.y,
design_region_resolution,
)
if beta == 0:
return filtered_field.flatten()
else:
projected_field = mpa.smoothed_projection(
filtered_field,
beta,
eta,
design_region_resolution,
)
return projected_field.flatten()
Is this the correct approach?
You want to skip the conic filter, which moves your level set.
You want to skip the conic filter, which moves your level set.
There is a comment in the docstrings of the smoothed_projection
function which states that the input to this function must have been already filtered:
@smartalecH: what will happen if we remove the conic_filter
step in the mapping
function and thus only apply smoothed_projection
to the level set?
@oskooi the only assumption for this new projection function is that you have a level set and that it's smoothly varying.
You don't appear to have that. You just have binary shapes derived from fundamental parameters.
So in your case, you do need to filter. But you would also need to map this ad hoc levelset to your new shape using eg a marching squares algorithm.
It would be much easier to simply use a signed distance function of a rectangle. Lots of resources online like this one. Offset the zero levelset to be 0.5 (our standard eta value) and use the new smoothed projection function to do the proper smoothing. You should be able to code it up in autograd.
(To be clear, the main application of this smoothed projection is for topology optimization, i.e. when the density/level-set function itself comprises the degrees of freedom.)
As a test to verify that the gradient calculation has been set up correctly, I would like to demonstrate that we can compute a non-zero gradient of the diffraction efficiency with respect to the grating height. (The gradient is incorrect until we have implemented the signed-distance function using Autograd and used it to replace the conic_filter
in the mapping
function.)
For now, applying a mapping
consisting of a conic_filter
followed by a smoothed_projection
to a levelset (full script is above) and then trying to back propagate the gradient through the functions mapping
and then grating_levelset
always produces a zero gradient along with this warning from autograd:
/python3.10/site-packages/autograd/tracer.py:14: UserWarning: Output seems independent of input.
warnings.warn("Output seems independent of input.")
grad_backprop (grating_levelset):, 0.0, ()
directional-deriv:, [-0.00056425] (finite difference), 0.0 (adjoint)
It is unclear why the gradient of the levelset function is always zero. @smartalecH, any thoughts?
Looks like the problem is that Autograd does not support gradients of numpy.meshgrid
(HIPS/autograd#457) which is currently used by the levelset construction function:
def grating_levelset(
grating_height_um: float,
grating_duty_cycle: float,
design_region_size: mp.Vector3,
nx: int,
ny: int
) -> npa.ndarray:
"""Returns the weights for the grating as a 1D array."""
xcoord = npa.linspace(
-0.5*design_region_size.x,
+0.5*design_region_size.x,
nx
)
ycoord = npa.linspace(
-0.5*design_region_size.y,
+0.5*design_region_size.y,
ny
)
xv, yv = npa.meshgrid(xcoord, ycoord, indexing='ij')
weights = npa.where(
npa.abs(yv) <= 0.5 * grating_duty_cycle * GRATING_PERIOD_UM,
npa.where(
xv <= xcoord[0] + grating_height_um,
1,
0
),
0
)
return weights.flatten()
Also, it does not seem to be an issue but Autograd does support gradients of np.abs
which is not normally differentiable.
The fix is that we therefore just need to rewrite grating_levelset
and replace np.meshgrid
.
Here is an easier approach:
This is feasible because your number of parameters p is very small, and F(f(p)) is very fast (no PDE solve), so finite differences are cheap. Just make sure that the image is high enough resolution that finite differences are sufficiently accurate.
(Your f(p) function above, i.e. grating_levelset
, has a derivative zero almost everywhere, because an infinitesimal perturbation doesn't change the image. To fix that you would need to do subpixel smoothing within that function. However, with my suggested finite-difference approach, you can just use finite differences that change the image by at least one pixel, and the problem goes away.)
write a custom vJp routine for F(f(p)) that just uses finite differences to obtain the Jacobian matrix.
This seems to be working now after I added a custom VJP function for the levelset construction function (shown below). We can now demonstrate good agreement in the directional derivative computed using the adjoint gradient and finite differences:
dirderiv:, 0.00161151 (finite difference), 0.00191024 (adjoint), 0.18537272 (error)
These results were obtained using a resolution of 200 pixels/μm and a runtime of nearly 1.5 hours using a single Xeon 4.2 GHz. Unfortunately, using finite differences to approximate the Jacobian necessarily requires large resolutions to obtain accurate results. This feature has already been noted in the comments above:
Just make sure that the image is high enough resolution that finite differences are sufficiently accurate.
As a check, I cranked up the resolution to 400 pixels/μm and the relative error decreased. However, this run took nearly 12 hours with 4 Xeon cores.
dirderiv:, 0.00110571 (finite difference), 0.00118250 (adjoint), 0.06944858 (error)
Details of the calculation involving composition of the objective function and computing the Jacobian matrix using finite differences are summarized in the slide below. The current calculation does not use subpixel smoothing (which will be added next by extending this approach to the signed-distance function in order to replace the conic filter in the mapping
function).
def grating_levelset_vjp(
ans,
grating_height_um: float,
grating_duty_cycle: float,
design_region_size: mp.Vector3,
nx: int,
ny: int
) -> np.ndarray:
"""Returns the vector-Jacobian product."""
xv, yv = design_region_to_meshgrid(design_region_size, nx, ny)
# pixel dimensions
delta_x = design_region_size.x / (nx - 1)
delta_y = design_region_size.y / (ny - 1)
# Gradient of the level set with respect to the grating height.
# The gradient is obtained using a finite difference. The gradient is
# therefore a zero matrix with non-zero entries only at the locations
# of the height boundary. Since the height boundary is normal to the
# x axis, these non-zero matrix elements have a value of 1 / Δx.
weights = np.where(
np.abs(yv) <= 0.5 * grating_duty_cycle * GRATING_PERIOD_UM,
np.where(
xv <= xv[0][0] + grating_height_um + 0.5 * delta_x,
np.where(
xv >= xv[0][0] + grating_height_um - 0.5 * delta_x,
1 / delta_x,
0
),
0
),
0
)
jacobian = weights.flatten()
return lambda g: np.tensordot(g, jacobian, axes=1)
Using the same approach involving a custom VJP described in my previous comment applied to the signed-distance function via the module scikit-fmm
and computing the adjoint gradient of a level-set representation of the 1D grating using the subpixel-smoothing feature of the MaterialGrid
class (rather than the alternative approach proposed by @smartalecH of replacing the tanh_projection
operation of the mapping
function with smoothed_projection
) seems to produce the expected agreement with the brute-force result obtained using finite differences.
Here are the results comparing the directional derivatives obtained using the two methods (finite difference vs. adjoint gradient) for the $\mathcal{S}$ and $\mathcal{P}$ polarization at a resolution of 200 pixels/μm.
S polarization
directional-deriv:, -0.00146690 (finite difference), -0.00128812 (adjoint), 0.121877 (error)
P polarization
directional-deriv:, 0.00021602 (finite difference), 0.00136521 (adjoint), 5.319892 (error)
There is good agreement for the $S$ polarization but that is the "easy" case for a 2D simulation because the electric field ($E_z$) is always parallel to the material interfaces. The relative error for the $\mathcal{P}$-polarization case, however, is more than an order of magnitude larger than the $\mathcal{S}$-polarization case. The accuracy of the $\mathcal{P}$ polarization but not $\mathcal{S}$ is improved by increasing the grid resolution to 400 pixels/μm:
S polarization
directional-deriv:, -0.00058979 (finite difference), -0.00043476 (adjoint), 0.262865 (error)
P polarization
directional-deriv:, 0.00056589 (finite difference), 0.00098577 (adjoint), 0.741962 (error)
The script used to generate the results is available in this gist. It contains two key parts:
MaterialGrid
object and $\beta = \infty$. matgrid = mp.MaterialGrid(
mp.Vector3(nx, ny),
mp.air,
glass,
weights=anp.ones((nx, ny)),
do_averaging=True,
beta=anp.inf
)
def signed_distance(weights: anp.ndarray) -> anp.ndarray:
"""Maps the 2D weights using a signed-distance function."""
# Create signed distance function.
sd = skfmm.distance(weights - 0.5)
# Linear interpolation of zero-levelset onto 0.5-levelset.
xp = [anp.min(sd.flatten()), 0, anp.max(sd.flatten())]
yp = [0, 0.5001, 1]
return anp.interp(sd.flatten(), xp, yp)
@primitive
def levelset_and_smoothing(
grating_height_um: float,
grating_duty_cycle: float,
design_region_size: mp.Vector3,
nx: int,
ny: int
) -> anp.ndarray:
"""Generates the grating as a levelset with signed-distance smoothing."""
xv, yv = design_region_to_meshgrid(design_region_size, nx, ny)
weights = anp.where(
anp.abs(yv) <= 0.5 * grating_duty_cycle * GRATING_PERIOD_UM,
anp.where(
xv <= xv[0][0] + grating_height_um,
1,
0
),
0
)
return signed_distance(weights)
def levelset_and_smoothing_vjp(
ans: anp.ndarray,
grating_height_um: float,
grating_duty_cycle: float,
design_region_size: mp.Vector3,
nx: int,
ny: int
) -> anp.ndarray:
"""Computes the vector-Jacobian product."""
xv, yv = design_region_to_meshgrid(design_region_size, nx, ny)
# pixel dimensions
delta_x = design_region_size.x / (nx - 1)
weights = anp.where(
anp.abs(yv) <= 0.5 * grating_duty_cycle * GRATING_PERIOD_UM,
anp.where(
xv <= xv[0][0] + grating_height_um + delta_x,
1,
0
),
0
)
jacobian = (signed_distance(weights) - ans) / delta_x
return lambda g: anp.tensordot(g, jacobian, axes=1)
This figure shows the smoothed grating obtained by applying the signed-distance function to its level-set representation.
Related to #2011.
This is an initial attempt to use the new subpixel smoothing feature of the adjoint solver to compute the gradient of a structure parameterized by its geometry (a level set) as an alternative to the conventional density-based (
MaterialGrid
) representation. The example involves computing the gradient of the diffraction efficiency of the m=+1 transmitted order of a 1D binary grating with respect to its height and fill factor. The results are to be validated by the usual method of the brute-force gradient computed using a finite difference. Once everything is working correctly, this demonstration will be converted into a tutorial for the user manual and a unit test.The main feature of this calculation is a differentiable wrapper function (
grating_matgrid
in the script below) that takes the grating height and fill factor as arguments and returns the weights of aMaterialGrid
for the binary grating. These weights are then passed to the adjoint solver via aDesignRegion
object in the usual way and the gradient which is computed is then backpropagated through the wrapper function.An image of the geometry created using this approach confirms that the structure is set up correctly. However, it seems there is a bug in the current setup because the final gradient is zero. Also, there is an important warning from autograd:
This message suggests there is a likely bug in the backpropagation step.
output of running the script below