NanoComp / meep

free finite-difference time-domain (FDTD) software for electromagnetic simulations
GNU General Public License v2.0
1.21k stars 618 forks source link

topopt tutorial on 1d multilayer film optimization #2820

Closed stevengj closed 1 month ago

stevengj commented 5 months ago

As discussed with @oskooi, it might be nice to have a tutorial with "1d" topology optimization, i.e. designing a multilayer film. This can be done even in 2d and 3d just by making the material grid a 1x1xN grid (which can then be "extruded" over a higher-dimensional block object).

  1. For example, designing a multilayer film to minimize transmission should result in essentially a quarter-wave stack (possibly with some modification for the first and last layers). Just minimize transmission with the abovementioned 1d material grid, ramping up β slowly as usual to let it decide on the number of layers.
  2. We should also be able to fix the number of layers. If you just set β=infinity (with the new subpixel-smoothing algorithm) then the optimization will tend not to change the topology (because the derivative is not correct for changes in topology) unless it happens to take a very large step. You can probably prevent it from doing this by simply counting the number of layers (from the projected material grid) on each step and adding a large penalty to the objective function if the number of layers changes, which will cause the optimization algorithm to backtrack.

(You could also use a level-set representation if you have a differentiable mapping function that maps a set of layer thickness to a density ρ, ala #2235.)

On a separate (but related) note, we should probably update the fill_factor calculation in the subpixel smoothing code here to always compute the fill factor for a 3d spherical smoothing kernel — i.e. take the perspective that even "2d" and "1d" structures are actually 3d structures that are invariant in 1 or 2 directions, respectively.

erikshipton commented 5 months ago

Isn't this kind of like the Needle Optimization for thin films? I apologize if thats obvious and been pointed out elsewhere but Needle optimization for thin films kind of feels like topological optimization in 1D.

oskooi commented 4 months ago

Here is a sketch of the problem to be set up:

1D Multilayer Film Optimization

There are $N$ layers of alternating materials of index $n_A$ and $n_B$.

The initial problem involves a single wavelength, angle, and polarization of the incident planewave.

stevengj commented 4 months ago

Would be good to also include an version of this example that uses traditional topology optimization (with a 1d material grid), to show the pros and cons (pro: TO can choose the number of layers; con: may need explicit lengthscale constraint).

Might be simplest to make the materials at the two ends both $n_A$ or both $n_B$ (e.g. set $n_A = 1$). This effectively allows TO to also optimize the total thickness.

oskooi commented 3 months ago

Here is an initial attempt to set up the shape optimization of a multilayer stack. The objective function in this example is the transmission through a stack of 11 layers of alternating materials $n_1$, $n_2$, $n_1$, $n_2$, ..., $n_1$. This is a 1D simulation. Similar to the schematic above, a planewave is incident at normal incidence with $\mathcal{S}$ polarization. The design parameters are the 11 layer thicknesses: $t_1$, $t_2$, $t_3$, $t4$, ..., $t{11}$.

In a recent tutorial, we demonstrated how to compute the derivative of an objective function (diffraction efficiency) with respect to a single shape parameter (grating height). In this example, we extend that approach involving a custom vJP to compute the gradient of the objective function (transmission through the stack) with respect to multiple parameters (layer thicknesses). This involves a 1D MaterialGrid which is used to generate a smoothed levelset-representation of the multilayer stack. An actual profile of the stack geometry for the initial design is shown below. The 11 layers are clearly noticeable.

multilayer_stack_levelset

The adjoint gradient computed by Meep for this design looks reasonable.

gradient_wrt_smoothed_design_weights

However, the directional derivative computed using the adjoint gradient and the finite differences are off by a factor of nearly two. One problem could be that the transmission values may be too small: ~2e-6. As an alternative, we can try using the reflection as the objective function rather than the transmission.

obj_val_unperturbed[0] = 2.2321483940402576e-06
obj_val_perturbed[0] = 2.7330293852711764e-06
adj_directional_deriv:, 8.656551213588154e-07
fnd_directional_deriv:, 5.008809912309189e-07
dir-deriv:, 0.00000050 (finite difference), 0.00000087 (adjoint), 0.728265 (error)
"""Shape optimization of a multilayer stack.                                                                      

The 1D stack consists of alternating materials of index N_LAYER_1 and N_LAYER_2                                   
in the arrangement: N_LAYER_1, N_LAYER_2, N_LAYER_1, N_LAYER_2, ..., N_LAYER_1.                                   

The design parameters are the N layer thicknesses: [t_1, t_2, ..., t_N].                                          
"""

from typing import Callable

from autograd.extend import primitive, defvjp
from autograd import numpy as npa
from autograd import tensor_jacobian_product
from mpl_toolkits.axes_grid1 import make_axes_locatable
import matplotlib.pyplot as plt
import meep as mp
import meep.adjoint as mpa
import numpy as np

RESOLUTION_UM = 200
WAVELENGTH_UM = 1.0
AIR_UM = 1.0
PML_UM = 1.0
NUM_LAYERS = 11
MIN_LAYER_UM = 0.1
MAX_LAYER_UM = 0.9
N_LAYER_1 = 3.5
N_LAYER_2 = 1.0 # TODO (oskooi): support arbitrary use case (not just air).                                       
LAYER_PERTURBATION_UM = 1.0 / RESOLUTION_UM
DESIGN_REGION_RESOLUTION_UM = 10 * RESOLUTION_UM

DEBUG_OUTPUT = True

design_region_size = mp.Vector3(0, 0, NUM_LAYERS * MAX_LAYER_UM)
nz_design_grid = int(design_region_size.z * DESIGN_REGION_RESOLUTION_UM) + 1
nz_sim_grid = int(design_region_size.z * RESOLUTION_UM) + 1
num_unit_cells = int(0.5 * (NUM_LAYERS - 1))

def design_region_to_grid(nz: int) -> np.ndarray:
    """Returns the 1D coordinates of the grid for the design region.                                              

    Args:                                                                                                         
      nz: number of grid points.                                                                                  

    Returns:                                                                                                      
      The coordinates of the grid points.                                                                         
    """

    z_grid = np.linspace(
    -0.5 * design_region_size.z,
    +0.5 * design_region_size.z,
    nz
    )

    return z_grid

@primitive
def levelset_and_smoothing(layer_thickness_um: np.ndarray) -> np.ndarray:
    """Returns the density weights for a multilayer stack as a levelset.                                          

    Args:                                                                                                         
      layer_thickness_um: thickness of each layer in the stack.                                                   

    Returns:                                                                                                      
      The density weights as a flattened (1D) array.                                                              
    """

    air_padding_um = 0.5 * (design_region_size.z - np.sum(layer_thickness_um))

    weights = np.zeros(nz_design_grid)
    z_grid = design_region_to_grid(nz_design_grid)

    # Air padding at left edge.                                                                                   
    z_start = 0
    z_end = int(air_padding_um * DESIGN_REGION_RESOLUTION_UM)
    weights[z_start:z_end] = 0

    z_start = z_end
    for j in range(NUM_LAYERS):
        z_end = z_start + int(layer_thickness_um[j] * DESIGN_REGION_RESOLUTION_UM)
        weights[z_start:z_end] = 1 if (j % 2 == 0) else 0
        z_start = z_end

    # Air padding at right edge.                                                                                  
    weights[z_start:] = 0

    # Smooth the design weights by downsampling from the design grid                                              
    # to the simulation grid using bilinear interpolation.                                                        
    z_sim_grid = design_region_to_grid(nz_sim_grid)
    smoothed_weights = np.interp(z_sim_grid, z_grid, weights)

    if DEBUG_OUTPUT:
        fig, ax = plt.subplots()
        ax.plot(z_sim_grid, smoothed_weights, 'b-')
        ax.plot(z_grid, weights, 'r-')
        ax.set_xlabel("z ($\mu$m)")
        ax.set_ylabel("design weights (smoothed)")
        if mp.am_master():
            fig.savefig(
                "multilayer_stack_levelset.png",
                dpi=150,
                bbox_inches="tight"
            )

    return smoothed_weights.flatten()

def levelset_and_smoothing_vjp(
        ans: np.ndarray,
        layer_thickness_um: np.ndarray
) -> Callable[[np.ndarray], np.ndarray]:
    """Returns a function for computing the vector-Jacobian product."""

    air_padding_um = 0.5 * (design_region_size.z - np.sum(layer_thickness_um))

    jacobian = np.zeros((nz_sim_grid, NUM_LAYERS))

    z_grid = design_region_to_grid(nz_design_grid)
    z_sim_grid = design_region_to_grid(nz_sim_grid)

    for i in range(NUM_LAYERS):
        weights = np.zeros(nz_design_grid)

        # Air padding at left edge.                                                                               
        z_start = 0
        z_end = int(air_padding_um * DESIGN_REGION_RESOLUTION_UM)
        weights[z_start:z_end] = 0

        z_start = z_end
        for j in range(NUM_LAYERS):
            layer_um = layer_thickness_um[j]
            if j == i:
                layer_um += LAYER_PERTURBATION_UM
            z_end = z_start + int(layer_um * DESIGN_REGION_RESOLUTION_UM)
            weights[z_start:z_end] = 1 if (j % 2 == 0) else 0
            z_start = z_end

        # Air padding at right edge.                                                                              
        weights[z_start:] = 0

        # Smooth the design weights by downsampling from the design grid                                          
        # to the simulation grid using bilinear interpolation.                                                    
        smoothed_weights = np.interp(z_sim_grid, z_grid, weights)

        jacobian[:, i] = (smoothed_weights - ans) / LAYER_PERTURBATION_UM

    if DEBUG_OUTPUT:
        fig, ax = plt.subplots()
        im = ax.imshow(
            np.transpose(jacobian),
            cmap="inferno",
            interpolation="none",
            aspect="equal",
        )
        ax.set_title(r"$\partial \rho_{smooth-levelset} / \partial t$")
        divider = make_axes_locatable(ax)
        cax = divider.append_axes("right", size="5%", pad=0.05)
        fig.colorbar(im, cax=cax)
        if mp.am_master():
            fig.savefig(
                "multilayer_stack_gradient.png",
                dpi=150,
                bbox_inches="tight"
            )

    return lambda g: np.tensordot(g, jacobian, axes=1)

def input_flux() -> float:
    """Returns the flux generated by the source."""

    frequency = 1 / WAVELENGTH_UM
    pml_layers = [mp.PML(direction=mp.Z, thickness=PML_UM)]

    size_z_um = PML_UM + AIR_UM + design_region_size.z + AIR_UM + PML_UM
    cell_size = mp.Vector3(0, 0, size_z_um)

    src_cmpt = mp.Ex
    src_pt = mp.Vector3(0, 0, -0.5 * size_z_um + PML_UM)
    sources = [
        mp.Source(
            mp.GaussianSource(frequency, fwidth=0.1 * frequency),
            component=src_cmpt,
            center=src_pt,
        )
    ]

    sim = mp.Simulation(
        resolution=RESOLUTION_UM,
        cell_size=cell_size,
        dimensions=1,
        boundary_layers=pml_layers,
        sources=sources
    )

    tran_pt = mp.Vector3(0, 0, 0.5 * size_z_um - PML_UM)

    flux_mon = sim.add_flux(
        frequency,
        0,
        1,
        mp.FluxRegion(center=tran_pt, size=mp.Vector3())
    )

    sim.run(
        until_after_sources=mp.stop_when_fields_decayed(
            25.0, src_cmpt, tran_pt, 1e-6
        )
    )

    flux = mp.get_fluxes(flux_mon)[0]

    return flux

def multilayer_stack(norm_flux: float) -> mpa.OptimizationProblem:
    """Sets up the adjoint optimization of a multilayer stack.                                                    

    Args:                                                                                                         
      norm_flux: input flux to use for normalizing the objective function.                                        

    Returns:                                                                                                      
      A meep.adjoint.Optimization object for the simulation.                                                      
    """

    frequency = 1 / WAVELENGTH_UM
    pml_layers = [mp.PML(direction=mp.Z, thickness=PML_UM)]

    size_z_um = PML_UM + AIR_UM + design_region_size.z + AIR_UM + PML_UM
    cell_size = mp.Vector3(0, 0, size_z_um)

    src_cmpt = mp.Ex
    src_pt = mp.Vector3(0, 0, -0.5 * size_z_um + PML_UM)
    sources = [
        mp.Source(
            mp.GaussianSource(frequency, fwidth=0.1 * frequency),
            component=src_cmpt,
            center=src_pt,
        )
    ]

    mat_1 = mp.Medium(index=N_LAYER_1)
    mat_2 = mp.Medium(index=N_LAYER_2)

    matgrid = mp.MaterialGrid(
        mp.Vector3(0, 0, nz_sim_grid),
        mat_1,
        mat_2,
        weights=np.ones(nz_sim_grid),
        do_averaging=False
    )

    matgrid_region = mpa.DesignRegion(
        matgrid,
        volume=mp.Volume(
            center=mp.Vector3(),
            size=design_region_size
        ),
    )

    geometry = [
        mp.Block(
            material=matgrid,
            size=matgrid_region.size,
            center=matgrid_region.center
        )
    ]

    sim = mp.Simulation(
        resolution=RESOLUTION_UM,
        cell_size=cell_size,
        dimensions=1,
        boundary_layers=pml_layers,
        sources=sources,
        geometry=geometry
    )

    tran_pt = mp.Vector3(0, 0, 0.5 * size_z_um - PML_UM)

    obj_args = [
        mpa.FourierFields(
            sim,
            mp.Volume(center=tran_pt),
            mp.Ex
        ),
        mpa.FourierFields(
            sim,
            mp.Volume(center=tran_pt),
            mp.Hy
        )
    ]

    def obj_func(dft_ex, dft_hy):
        return npa.real(npa.conj(dft_ex) * dft_hy) / norm_flux

    opt = mpa.OptimizationProblem(
        simulation=sim,
        objective_functions=obj_func,
        objective_arguments=obj_args,
        design_regions=[matgrid_region],
        frequencies=[frequency]
    )

    return opt

if __name__ == "__main__":
    layer_thickness_um = np.array(
        [0.3, 0.1, 0.2, 0.1, 0.6, 0.4, 0.2, 0.1, 0.4, 0.3, 0.2]
    )

    smoothed_design_weights = levelset_and_smoothing(layer_thickness_um)

    norm_flux = input_flux()

    opt = multilayer_stack(norm_flux)

    obj_val_unperturbed, grad_unperturbed = opt(
        [smoothed_design_weights], need_gradient=True
    )
    print(f"obj_val_unperturbed[0] = {obj_val_unperturbed[0]}")

    defvjp(levelset_and_smoothing, levelset_and_smoothing_vjp)
    grad_backprop = tensor_jacobian_product(levelset_and_smoothing, 0)(
        layer_thickness_um,
        grad_unperturbed
    )

    if DEBUG_OUTPUT:
        fig, ax = plt.subplots()
        ax.plot(grad_unperturbed)
        ax.set_title(r"$\partial F / \partial \rho_{smooth-levelset}$")
        if mp.am_master():
            fig.savefig(
                "gradient_wrt_smoothed_design_weights.png", dpi=150, bbox_inches="tight"
            )

    perturbation_um = 1e-3 * np.ones(NUM_LAYERS)
    perturbed_design_weights = levelset_and_smoothing(
        layer_thickness_um + perturbation_um
    )
    perturbed_design_weights = perturbed_design_weights.flatten()

    obj_val_perturbed, _ = opt(
        [perturbed_design_weights],
        need_gradient=False
    )
    print(f"obj_val_perturbed[0] = {obj_val_perturbed[0]}")

    adj_directional_deriv = (perturbation_um[None, :] @ grad_backprop)[0]
    fnd_directional_deriv = obj_val_perturbed[0] - obj_val_unperturbed[0]
    print(f"adj_directional_deriv:, {adj_directional_deriv}")
    print(f"fnd_directional_deriv:, {fnd_directional_deriv}")
    rel_err = abs(
        (fnd_directional_deriv - adj_directional_deriv) /
        fnd_directional_deriv
    )
    print(
        f"dir-deriv:, {fnd_directional_deriv:.8f} (finite difference), "
        f"{adj_directional_deriv:.8f} (adjoint), {rel_err:.6f} (error)"
    )
stevengj commented 3 months ago

In implementing the custom vJp, when you change a single layer thickness parameter $t_k$, it perturbs the locations of all/many of the layer interfaces, so it is correct for many interfaces to show up in the vJp.

Right now, the optimization problem is probably a bit too trivial — if you have enough layers of lossless material, then even random layers will have nearly 100% reflection and nearly 0% transmission. Since you are nearly at the theoretical upper limits, the gradients will be nearly zero (and hard to check) and the optimizer probably won't do much.

It would be good to make the problem more interesting. One simple thing would be to add some loss to both of the design materials (in the design region only), so that a random structure will have significant absorption. Then, maximizing reflection will try to perturb the layers in a nontrivial way (probably approximating a quarter-wave stack) in order to reflect the light in as short a distance as possible to minimize reflection. (This will also make the gradients nonzero and hence easier to check.)

oskooi commented 2 months ago

Right now, the optimization problem is probably a bit too trivial — if you have enough layers of lossless material, then even random layers will have nearly 100% reflection and nearly 0% transmission. Since you are nearly at the theoretical upper limits, the gradients will be nearly zero (and hard to check) and the optimizer probably won't do much.

Two modifications to the original setup were necessary to make the gradients nonzero: (1) reducing the number of layers (NUM_LAYERS) from 11 to 5 and (2) reducing the index contrast (N_LAYER_1) from 3.5 to 1.3. I also used a layer thickness of 0.2 μm for all layers rather than random values. With these changes, the transmittance is ~0.11 rather than 5e-7.

The adjoint gradient matches the finite-difference result using this setup but only when the adjoint gradient is multiplied by a factor of 0.5. Perhaps this is a bug in the adjoint gradient calculation in 1D?

adj_directional_deriv = 0.5 * (perturbation_um[None, :] @ grad_backprop)[0]

Anyhow, with this change, the relative error of the adjoint gradient converges to zero with first-order accuracy as expected.

levelset_gradient_backpropagation_1D

levelset_adjoint_gradient_error_vs_resolution_1D

multilayer_stack_weights

gradient_wrt_smoothed_design_weights