NanoComp / meep

free finite-difference time-domain (FDTD) software for electromagnetic simulations
GNU General Public License v2.0
1.21k stars 618 forks source link

Tutorial for shape optimization of a multilayer stack #2862

Closed oskooi closed 1 month ago

oskooi commented 3 months ago

Closes #2820.

This PR adds a new tutorial for the adjoint solver which demonstrates shape optimization of a multilayer stack in 1D. The design problem involves finding the layer thicknesses (for a fixed number of layers) which minimize the transmittance of a planewave at normal incidence in air with $\lambda = 1.0$ μm through a stack of alternating materials of $n_1=1.0$ and $n_2=1.3$. The layers are arranged as $n_2$, $n_1$, $n_2$, $n_1$, ..., $n_2$. The analytic solution is a quarter-wavelength stack with thicknesses $\lambda / (4 n_1) = 0.25$ and $\lambda / (4 n_2) = 0.1923$.

The optimization uses the BFGS algorithm of NLopt. An initial value of 0.2 μm is used for the layer thicknesses. Results are shown below for five different cases in which the number of layers are 5, 7, 9, 11, and 13. The quarter-wavelength stack configuration is found by the optimization only for layer numbers 5 and 7. Layer numbers 9, 11, and 13 seem to get trapped in a local minima. In all cases, the optimization terminates due to the threshold tolerance of the objective function (return code 3 of NLopt). The thickness of all layers is constrained to be in the range of 0.1 μm to 0.5 μm. Since these are 1D simulations, the runtimes are generally less than a couple of minutes using a single core (Intel Xeon i7-7700K CPU @ 4.20GHz).

A plot of the transmittance vs. iteration number for the five cases of the layer numbers shows that the transmittance decreases with the number of layers, as expected. The optimizer seems to move the design in the wrong direction (increasing transmittance) during the early iterations before eventually finding an optimal design which minmizes the transmittance.

We may want to consider creating a unit test based on this tutorial since currently there is no unit test for the 1D adjoint solver.

1. NUM_LAYERS = 5 (~quarter-wavelength stack!)

objective value: 0.14224969749140354 (objective value)
layer thicknesses: [0.18767195 0.2477044  0.19506915 0.24954383 0.19187953] (layer thicknesses)

2. NUM_LAYERS = 7 (quarter-wavelength stack!)

objective value: 0.09740488016708304
layer thicknesses: [0.19177897 0.25412021 0.19592259 0.24779526 0.18806041 0.24928526 0.19152152]

3. NUM_LAYERS = 9

objective value: 0.06522007836855086
layer thicknesses: [0.20360086 0.21794387 0.21536126 0.22216203 0.21716955 0.22252077 0.21613793 0.21944627 0.20599552]

4. NUM_LAYERS = 11

objective value: 0.0415530392321388
layer thicknesses: [0.20432596 0.21336318 0.21564032 0.21854779 0.21839713 0.21969996 0.2186816  0.21923049 0.21665256 0.21505099 0.20660389]

5. NUM_LAYERS = 13

objective value: 0.025748647829910033
layer thicknesses: [0.20378588 0.21309369 0.21486671 0.21798861 0.21754488 0.21913944 0.21813383 0.21946564 0.21819879 0.2189978  0.21607473 0.21496185 0.20625602]

opt_transmittance_vs_numlayers

codecov-commenter commented 3 months ago

:warning: Please install the 'codecov app svg image' to ensure uploads and comments are reliably processed by Codecov.

Codecov Report

All modified and coverable lines are covered by tests :white_check_mark:

Project coverage is 73.78%. Comparing base (f29a8c7) to head (eca296d). Report is 26 commits behind head on master.

:exclamation: Your organization needs to install the Codecov GitHub app to enable full functionality.

Additional details and impacted files ```diff @@ Coverage Diff @@ ## master #2862 +/- ## ========================================== - Coverage 73.81% 73.78% -0.04% ========================================== Files 18 18 Lines 5423 5428 +5 ========================================== + Hits 4003 4005 +2 - Misses 1420 1423 +3 ```
oskooi commented 3 months ago

It seems that indeed the optimizer is getting stuck in local optima. For the case with 9 layers shown above, I replaced the initial design which originally used a fixed layer thickness of 0.2 μm with random thicknesses. Running with initial random designs did eventually produce optimal designs which were close to the quarter-wavelength stack. Results are shown below for five different runs. Note the large spread in the transmittance values of the optimal designs.

modification

rng = np.random.RandomState()
layer_thickness_um = (MIN_LAYER_UM + rng.rand(NUM_LAYERS) *
                      (MAX_LAYER_UM - MIN_LAYER_UM))

optimal_layer_thickness_um = solver.optimize(layer_thickness_um)

results

RUN 1
optimal_obj_val:, 0.07488352304439169
optimal_layer_thickness_um:, [0.19735612, 0.28425176 0.5, 0.30282801, 0.19073322, 0.299095640.5, 0.29265727 0.19515679]

RUN 2 (has features of quarter-wavelength stack)
optimal_obj_val:, 0.06882352172464842
optimal_layer_thickness_um:, [0.5, 0.29139132, 0.19141724, 0.24581283, 0.19472315, 0.25271642, 0.1912731,  0.25638825 0.19021278]

RUN 3
optimal_obj_val:, 0.07496775161676676
optimal_layer_thickness_um:, [0.1896674, 0.24165775, 0.19786316, 0.3092328,  0.5, 0.2881851, 0.20043806, 0.28663803, 0.5       ]

RUN 4
optimal_obj_val:, 0.08787575553075631
optimal_layer_thickness_um:, 0.50000, 0.34762, 0.50000, 0.34346, 0.50000, 0.29508, 0.19783, 0.29944, 0.50000

RUN 5 (has features of quarter-wavelength stack)
optimal_obj_val:, 0.07478894666528571
optimal_layer_thickness_um:, 0.50000, 0.33898, 0.50000, 0.30020, 0.19164, 0.24854, 0.19684, 0.24210, 0.19353
stevengj commented 2 months ago

For lots of layers, you can easily get a non-quarter-wave structure with very low transmission, and once you get to that point the gradients will be almost zero so probably it will converge very slowly towards a QW solution (it's very slow to try to reduce transmission from 1e-3 to 1e-6, for example).

@oskooi suggested just taking the log of the transmission. This seems worth trying, but I'm skeptical — it seems like a free lunch? It wouldn't help if there is noise in the gradient that is limiting us at low transmission levels, for example.

Another alternative is simply to minimize ∫|E|² in the mirrors (or similar), which will push the optimization towards a field that decays as quickly as possible. Intuitively, this uses more information than the transmission optimization, since it uses E at every point in the mirror rather than just at the end.

PS. Note that L-BFGS, being a higher-order scheme, will be more vulnerable to errors in the gradient (e.g. discretization errors) compared to first-order methods like CCSA.

oskooi commented 2 months ago

I tried the two suggested objective functions: (1) log of the transmission and (2) ∫|E|² including taking its log. Unfortunately, neither approach seemed to produce better results than the original objective function of just the transmission.

Based on these results, I think the choice of objective function may not be as important as simply running lots of trials with random initial designs in order to try to find the global optima of the quarter-wavelength stack. The design space seems to be highly non convex regardless of the objective function. We may want to consider switching to a different problem.

*1. Objective Function: $\log(E^{} \times H)$**

    tran_pt = mp.Vector3(0, 0, 0.5 * size_z_um - PML_UM)

    obj_args = [
        mpa.FourierFields(sim, mp.Volume(center=tran_pt), mp.Ex),
        mpa.FourierFields(sim, mp.Volume(center=tran_pt), mp.Hy),
    ]

    def obj_func(dft_ex, dft_hy):
        """Objective function for the optimization.

        Args:
          dft_ex, dft_hy: the discrete Fourier-transformed Ex and Hy fields.

        Returns:
          The transmittance through the stack.
        """

        return npa.log(npa.real(npa.conj(dft_ex) * dft_hy) / norm_flux)

2. Objective Function: ∫|E|²

    obj_args = [
        mpa.FourierFields(
            sim,
            matgrid_region.volume,
            mp.Ex
        )
    ]

    def obj_func(dft_ex):
        """Objective function for the optimization.

        Args:
          dft_ex: the discrete Fourier-transformed Ex field.

        Returns:
          The field intensity summed over the design region.
        """

        return npa.log(npa.sum(npa.absolute(dft_ex) **2))
stevengj commented 2 months ago

When you minimize ∫|E|², did you get a more rapid spatial decay of the fields? I don't think the transmission should get better, but maybe it is more localized. Fundamentally, once ∫|E|² gets very localized, changing the exponential tail from $10^{-3}$ to $10^{-6}$ seems like a very ill-conditioned problem (the objective function is very flat in the direction you want, but there are other directions in parameter space that rapidly make things worse). So you should get a function that looks quite localized by minimizing ∫|E|², but its exponential tails (i.e. the transmission) won't go down many orders of magnitude. I suspect that it's not a local minimum, just that improving it will require (a) a huge number of optimization iterations (thanks to the ill-conditioning) and (b) a more accurate gradient (ditto).

So, if you're not happy just showing that minimizing ∫|E|² produces some rapid spatial decay, I would look at something that is going to be better conditioned. (And, at a single ω, we already know that quarter-wave stacks are a good, perhaps optimal, solution for this.). For example, you can try to design a broad-band AR coating with only 3–4 layers. In this case if you can get an optimum of > 90% at all wavelengths you would be pretty happy, and there is no worry about going from 99.9% to 99.999%. (Either worst-case or average-case broadband would be interesting, perhaps both.)

stevengj commented 2 months ago

(Note that the fact that optimization takes some steps that make things worse initially is a common phenomenon in many algorithms — initially, it takes steps that are too large and then has to backtrack. After a few iterations, the algorithm has more information about a "good" step size, and in the case of L-BFGS has more information about the 2nd derivative.)

oskooi commented 2 months ago

When you minimize ∫|E|², did you get a more rapid spatial decay of the fields?

I did verify that minimizing ∫|E|² rather than the transmittance does produce more rapid spatial decay of the fields, as expected. The plot below shows a comparison of the field intensity within the multilayer stack (the design region) for the two cases of the objective function: (1) ∫|E|² vs. (2) transmittance. The design produced using (1) had a smaller transmittance than (2). This is perhaps a good indication that (1) may be a more suitable objective function than (2) for this problem.

(I was probably unlucky in the local optima I chose for (1) in the results I posted in my previous comment because that design did not have a lower transmittance than the design obtained using (2).)

Case 1: objective function = ∫|E|²

tran:, 0.365143 (obj. func. = fields in the stack)

Case 2: objective function = transmittance

tran:, 0.564647 (obj. func. = transmittance)

field_decay_in_multilayer_stack

oskooi commented 2 months ago

For example, you can try to design a broad-band AR coating with only 3–4 layers. In this case if you can get an optimum of > 90% at all wavelengths you would be pretty happy, and there is no worry about going from 99.9% to 99.999%. (Either worst-case or average-case broadband would be interesting, perhaps both.)

I changed the original optimization problem of minimizing ∫|E|² for a single wavelength to multiple wavelengths using minimax (i.e., worst-case optimization). The updated problem involves two wavelengths (0.95 and 1.05 μm) and nine layers in the stack. Because the epigraph formulation involves the use of nonlinear constraints which are not supported by the BFGS algorithm in NLopt, I switched to using the MMA algorithm instead. The changes to the script multilayer_opt.py have been made in the latest commit (e82e8e9).

A plot of the history of the objective function vs. iteration number of the optimizer for the two wavelengths is shown below. Also included in this plot is the epigraph variable. The initial layer thicknesses are chosen randomly. The optimizer seems to converge to a local optima within ~20 iterations. In this case, the optimizer ran for the maximum number of iterations (50).

As expected, the optimal layer thicknesses are quite different from the quarter-wave stack of alternating thicknesses of 0.19 and 0.25 μm.

optimal_obj_val:, 13.719162487586791
optimal_layer_thickness_um:, [0.1664, 0.2452, 0.1399, 0.2719, 0.1786, 0.3060, 0.4952, 0.1688, 0.2306]
return_code:, 5

multilayer_opt_obj_func_history

Additional results showing the field decay in the stack as well as the transmittance for this design will be added soon.

oskooi commented 2 months ago

A plot of the $|E_x|^2$ fields in the stack (design region) for the optimal design from the previous comment is shown below along with the script used to generate this plot. The plot shows that while the DFT fields for each of the two wavelengths are decaying through the stack, the transmittance values are quite different: T($\lambda$ = 0.95 μm) = 0.3140 vs. T($\lambda$ = 1.05 μm) = 0.7353. This result is consistent with the plot of the objective function vs. iteration number which shows that, because of the minimax setup, the stack is being optimized for $\lambda$ = 0.95 μm and not $\lambda$ = 1.05 μm. Also, the magnitude of $|E_x|^2$ at the right edge of the stack is clearly larger for $\lambda$ = 1.05 μm than $\lambda$ = 0.95 μm. This is (again) consistent with the transmittance being larger for $\lambda$ = 1.05 than for $\lambda$ = 0.95 μm.

Perhaps we just happened to choose a poor local optima for this demonstration? Or maybe this is a feature of this particular optimization problem?

field_decay_in_multilayer_stack_broadband

rom typing import List, Tuple

import matplotlib.pyplot as plt
import meep as mp
import numpy as np

RESOLUTION_UM = 800
AIR_UM = 1.0
PML_UM = 1.0
NUM_LAYERS = 9
MIN_LAYER_UM = 0.1
MAX_LAYER_UM = 0.5
N_LAYER_1 = 1.0
N_LAYER_2 = 1.3
LAYER_PERTURBATION_UM = 1.0 / RESOLUTION_UM
DESIGN_WAVELENGTHS_UM = (0.95, 1.05)
DESIGN_REGION_RESOLUTION_UM = 10 * RESOLUTION_UM
DESIGN_REGION_UM = mp.Vector3(0, 0, NUM_LAYERS * MAX_LAYER_UM)
NZ_DESIGN_GRID = int(DESIGN_REGION_UM.z * DESIGN_REGION_RESOLUTION_UM) + 1
NZ_SIM_GRID = int(DESIGN_REGION_UM.z * RESOLUTION_UM) + 1

num_wavelengths = len(DESIGN_WAVELENGTHS_UM)
frequencies = [1 / wavelength_um for wavelength_um in DESIGN_WAVELENGTHS_UM]

frequency_center = np.mean(frequencies)

# Set source bandwidth to be larger than the range of design wavelengths.                                         
frequency_width = 1.2 * (np.max(frequencies) - np.min(frequencies))

def design_region_to_grid(nz: int) -> np.ndarray:
    """Returns the coordinates of the grid for the design region.                                                 

    Args:                                                                                                         
      nz: number of grid points.                                                                                  

    Returns:                                                                                                      
      The 1D coordinates of the grid points.                                                                      
    """

    z_grid = np.linspace(
        -0.5 * DESIGN_REGION_UM.z,
        +0.5 * DESIGN_REGION_UM.z,
        nz
    )

    return z_grid

def levelset_and_smoothing(layer_thickness_um: np.ndarray) -> np.ndarray:
    """Returns the density weights for a multilayer stack as a levelset.                                          

    Args:                                                                                                         
      layer_thickness_um: thickness of each layer in the stack.                                                   

    Returns:                                                                                                      
      The density weights as a flattened (1D) array.                                                              
    """

    air_padding_um = 0.5 * (DESIGN_REGION_UM.z - np.sum(layer_thickness_um))

    weights = np.zeros(NZ_DESIGN_GRID)

    # Air padding at left edge                                                                                    
    z_start = 0
    z_end = int(air_padding_um * DESIGN_REGION_RESOLUTION_UM)
    weights[z_start:z_end] = 0

    z_start = z_end
    for j in range(NUM_LAYERS):
        z_end = z_start + int(layer_thickness_um[j] * DESIGN_REGION_RESOLUTION_UM)
        weights[z_start:z_end] = 1 if (j % 2 == 0) else 0
        z_start = z_end

    # Air padding at right edge                                                                                   
    z_end = z_start + int(air_padding_um * DESIGN_REGION_RESOLUTION_UM)
    weights[z_start:z_end] = 0

    # Smooth the design weights by downsampling from the design grid                                              
    # to the simulation grid using bilinear interpolation.                                                        
    z_sim_grid = design_region_to_grid(NZ_SIM_GRID)
    z_design_grid = design_region_to_grid(NZ_DESIGN_GRID)
    smoothed_weights = np.interp(z_sim_grid, z_design_grid, weights)

    return smoothed_weights.flatten()

def input_flux() -> np.ndarray:
    """Returns the flux generated by the source in vacuum."""

    pml_layers = [mp.PML(thickness=PML_UM)]

    size_z_um = PML_UM + AIR_UM + DESIGN_REGION_UM.z + AIR_UM + PML_UM
    cell_size = mp.Vector3(0, 0, size_z_um)

    src_cmpt = mp.Ex
    src_pt = mp.Vector3(0, 0, -0.5 * size_z_um + PML_UM)
    sources = [
        mp.Source(
            mp.GaussianSource(frequency_center, fwidth=frequency_width),
            component=src_cmpt,
            center=src_pt,
        )
    ]

    sim = mp.Simulation(
        resolution=RESOLUTION_UM,
        cell_size=cell_size,
        dimensions=1,
        boundary_layers=pml_layers,
        sources=sources
    )

    tran_pt = mp.Vector3(0, 0, 0.5 * size_z_um - PML_UM)

    flux_mon = sim.add_flux(
        frequencies,
        mp.FluxRegion(center=tran_pt)
    )

    sim.run(
        until_after_sources=mp.stop_when_fields_decayed(
            25.0, src_cmpt, tran_pt, 1e-6
        )
    )

    source_flux = mp.get_fluxes(flux_mon)

    return source_flux

def multilayer_stack(
        input_flux: np.ndarray,
        layer_thickness_um: np.ndarray
) -> Tuple[np.ndarray, np.ndarray]:
    """Returns the DFT fields of a multilayer stack."""

    pml_layers = [mp.PML(thickness=PML_UM)]

    size_z_um = PML_UM + AIR_UM + DESIGN_REGION_UM.z + AIR_UM + PML_UM
    cell_size = mp.Vector3(0, 0, size_z_um)

    src_cmpt = mp.Ex
    src_pt = mp.Vector3(0, 0, -0.5 * size_z_um + PML_UM)
    sources = [
        mp.Source(
            mp.GaussianSource(frequency_center, fwidth=frequency_width),
            component=src_cmpt,
            center=src_pt,
        )
    ]

    mat_1 = mp.Medium(index=N_LAYER_1)
    mat_2 = mp.Medium(index=N_LAYER_2)

    weights = levelset_and_smoothing(layer_thickness_um)
    matgrid = mp.MaterialGrid(
        mp.Vector3(0, 0, NZ_SIM_GRID),
        mat_1,
        mat_2,
        weights
    )

    geometry = [
        mp.Block(
            material=matgrid,
            center=mp.Vector3(),
            size=DESIGN_REGION_UM
        )
    ]

    sim = mp.Simulation(
        resolution=RESOLUTION_UM,
        cell_size=cell_size,
        dimensions=1,
        boundary_layers=pml_layers,
        sources=sources,
        geometry=geometry
    )

    dft_field_mon = sim.add_dft_fields(
        [mp.Ex],
        frequencies,
        center=mp.Vector3(),
        size=DESIGN_REGION_UM,
        yee_grid=True
    )

    tran_pt = mp.Vector3(0, 0, 0.5 * size_z_um - PML_UM)

    flux_mon = sim.add_flux(
        frequencies,
        mp.FluxRegion(center=tran_pt)
    )

    sim.run(
        until_after_sources=mp.stop_when_fields_decayed(
            25.0, src_cmpt, tran_pt, 1e-8
        )
    )

    ex_dft = np.zeros(
        (num_wavelengths, NZ_SIM_GRID),
        dtype=np.complex128
    )
    for j in range(num_wavelengths):
        ex_dft[j, :] = sim.get_dft_array(dft_field_mon, mp.Ex, j)

    tran_flux = mp.get_fluxes(flux_mon)
    transmittance = np.asarray(tran_flux) / np.asarray(norm_flux)

    return transmittance, ex_dft

if __name__ == "__main__":
    norm_flux = input_flux()

    layer_thickness_um = [
        0.1664, 0.2452, 0.1399, 0.2719, 0.1786, 0.3060, 0.4952, 0.1688, 0.2306
    ]
    tran, ex_dft = multilayer_stack(norm_flux, layer_thickness_um)

    markers = ['bo-', 'ro-']
    z_grid = design_region_to_grid(NZ_SIM_GRID)
    fig, ax = plt.subplots()
    for j in range(num_wavelengths):
        ax.plot(
            z_grid,
            np.absolute(ex_dft[j, :])**2,
            markers[j],
            label=(
                f"$\lambda$ = {DESIGN_WAVELENGTHS_UM[j]} μm, "
                f"T = {tran[j]:.4f}"
            )
        )
    ax.set_xlabel("z coordinate (μm)")
    ax.set_ylabel("$|E_x|^2$ in multilayer stack")
    ax.legend()
    if mp.am_master():
        fig.savefig(
            "broadband_field_decay_in_multilayer_stack.png",
            dpi=150,
            bbox_inches="tight"
        )
stevengj commented 2 months ago

For comparison, a simple hand design would be to use a quarter-wave stack at the center frequency given by the harmonic mean of the wavelengths λ₀ = 1 / mean(1 ./ [0.95, 1.05]) == 0.9975 ≈ 1, and hope that the band gap is wide enough to cover both wavelengths. It would be nice to know how well the optimization does compared to this simple design.

e.g. plot the figure of merit normalized by the q-wave values.

I would use the CCSA-quadratic (#2400) algorithm in NLopt rather than MMA. Hopefully you are just getting stuck due to a problem in the optimizer. If it is an actual local minimum you are stuck at, then you might need to try lots of random starting points (maybe reduce the number of parameters) or try an even fancier global algorithm.

oskooi commented 2 months ago

e.g. plot the figure of merit normalized by the q-wave values.

I would use the CCSA-quadratic (#2400) algorithm in NLopt rather than MMA.

I made a number of changes based on the suggestions which together seem to produce decent results:

  1. Changed the optimization algorithm from MMA to CCSA.
  2. Changed the objective function from ∫|E|² to log(∫|E|²).
  3. Used tighter bounds for the layer thicknesses based on the values for the quarter-wavelength stack.
  4. Ran the optimization problem 10 times and chose the design with the largest field decay (smallest ∫|E|²).

The two existing figures have been updated with the new results. The plot of the field decay in the stack has been updated to normalize the fields by those from the reference design of the quarter-wavelength stack. A new schematic showing the design problem has also been added.

stevengj commented 1 month ago

It would be good to double-check the quarter-wave transmission — I'm surprised that the transmission is so low, even though the index contrast is low, though I have no reason to doubt Meep.

Might be nice to just try minimizing transmission too. (A quarter-wave stack might not be completely optimal even at a single frequency because there might be a better termination for the first and last layers? I don't remember.)

oskooi commented 1 month ago

It would be good to double-check the quarter-wave transmission — I'm surprised that the transmission is so low, even though the index contrast is low, though I have no reason to doubt Meep.

I confirmed that the relatively large transmittance of ~0.3 of the quarter-wavelength stack in the example is due to the index contrast of 1.3. For comparison, the transmittance is 1e-5 for a quarter-wavelength stack with nine layers for an index contrast of 3.5. This information has been added to the tutorial in #2881.