NanoComp / photonics-opt-testbed

testbed problems for photonics optimization
MIT License

Add minimum linewidth constraint to waveguide mode converter topology optimization using Meep #27

Closed by oskooi 2 years ago

oskooi commented 2 years ago

Adds a minimum linewidth constraint to the Meep topology optimization of the waveguide mode converter based on A.M. Hammond et al., Optics Express, Vol. 29, pp. 23916-23938 (2021). Thanks to @smartalecH for help in setting this up.

Here is an example for a minimum linewidth of 50 nm. The feature seems to be working, although more work is needed to improve the performance of the final design.

[figures: optimized design (mode_converter_optimize_minlen50_1), reflectance/transmittance spectra (optimized_mode_converter_minlen50_1_refl_tran), objective evaluation history (optimal_design_minlen50_1_eval_hist)]

oskooi commented 2 years ago

CSV file of design to check against ruler.

optimal_design_minlen50_1.csv

mawc2019 commented 2 years ago

CSV file of design to check against ruler.

optimal_design_minlen50_1.csv

The minimum length scale estimated by ruler is 0.0625 μm (62.5 nm).

oskooi commented 2 years ago

Thanks again for your additional suggestions, @smartalecH. I did the following: (1) removed the constraints on the epigraph variable t and (2) following your second recommendation, executed a single forward run before starting each optimization epoch and manually set t to the maximum value of R+1-T over all six wavelengths. I switched to a minimum length of 50 nm (from 90 nm) to see whether the length scale constraint is working properly and also to (hopefully) improve the final design performance. (A design with a 50 nm minimum length is actually measured to be 62.5 nm according to @mawc2019's ruler. This is a large discrepancy which would be good to address.)

Here are two designs generated using these modifications. Unfortunately, the final performance has gotten worse. The convergence history shows that during the start of the final epoch when the minimum linewidth constraint is applied, the performance gets significantly worse and does not change throughout the epoch. I wonder whether I need to keep adjusting the hyperparameter a1 to obtain a better result. We can also try changing the objective function to $\min\max(R,1-T)$.

    # insert epigraph variable and _no_ bounds as first element of 1d arrays                                              
    x = np.insert(x, 0, 1.2)   # initial guess for the worst error (ignored)                                            
    lb = np.insert(lb, 0, -np.inf)
    ub = np.insert(ub, 0, np.inf)

    evaluation_history = []
    cur_iter = [0]

    betas = [8, 16, 32]
    max_eval = 50
    tol = np.array([1e-6] * opt.nf)
    for beta in betas:
        solver = nlopt.opt(algorithm, n + 1)
        solver.set_lower_bounds(lb)
        solver.set_upper_bounds(ub)
        solver.set_min_objective(f)
        solver.set_maxeval(max_eval)
        solver.add_inequality_mconstraint(
            lambda r, x, g: c(r, x, g, eta_i, beta),
            tol,
        )
        # apply the minimum linewidth constraint                                                                        
        # only in the final "epoch"                                                                                     
        if beta == betas[-1]:
            solver.add_inequality_mconstraint(
                lambda r, x, g: glc(r, x, g, beta),
                [1e-8] * opt.nf,
            )

        # run a single forward run before each epoch                                                                    
        # and manually set the epigraph variable to                                                                     
        # the largest value of the objective function                                                                   
        # over the six wavelengths                                                                                      
        t0 = opt(
            [mapping(x[1:], eta_i, beta)],
            need_gradient=False,
        )
        x[0] = np.amax(t0[0])

        x[:] = solver.optimize(x)

1. a1 = 1e-4

[figures: optimized design (mode_converter_optimize_minlen50_4), reflectance/transmittance spectra (optimized_mode_converter_minlen50_4_refl_tran), objective evaluation history (optimal_design_minlen50_4_eval_hist)]

2. a1 = 1e-5

[figures: optimized design (mode_converter_optimize_minlen50_5), reflectance/transmittance spectra (optimized_mode_converter_minlen50_5_refl_tran), objective evaluation history (optimal_design_minlen50_5_eval_hist)]

smartalecH commented 2 years ago

We're getting closer...

So something now seems wrong with that first epoch. It shouldn't take 40+ iterations for it to start making progress on the FOM... Are you sure you are pulling the right value for the dummy parameter (t)?

An easy way to check what's going on is to print the value of t and the broadband FOM after each iteration. Is t really the max here? If so, why is the optimizer taking so long to work on the constraints? Is it instead trying to wiggle t around?
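
Something along these lines would do it (just a sketch; it assumes the usual epigraph objective where the first element of x is the dummy variable t, as in the script above). The per-wavelength R+1-T values can be printed the same way inside c, where they are actually computed:

    def f(x, grad):
        """Epigraph objective: simply minimize the dummy variable t = x[0]."""
        t = x[0]
        if grad.size > 0:
            grad[0] = 1
            grad[1:] = 0
        print(f"iteration {cur_iter[0]}: t = {t:.5f}")
        return t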

But the second epoch is looking better.

The convergence history shows that during the start of the final epoch when the minimum linewidth constraint is applied, the performance gets significantly worse and does not change throughout the epoch. I wonder whether I need to keep adjusting the hyperparameter a1 to obtain a better result

This is classic behavior when the hyperparameters aren't well chosen. The optimizer is struggling to satisfy the GLC (i.e., the constraint is too stiff).

Be sure to print the two values of the GLC constraints after each iteration. If they are still positive, then the optimizer is stuck trying to bring those down. It's useful to plot the gradient of the GLC constraints too.
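
For example, a wrapper like this (a sketch assuming the glc(r, x, g, beta) signature used in the script above) logs the constraint values and the per-constraint gradient norms each time NLopt calls it:

    def glc_logged(result, x, gradient, beta):
        """Evaluate the linewidth constraints, then log values and gradient norms."""
        glc(result, x, gradient, beta)
        print(f"GLC constraints: {result}")
        if gradient.size > 0:
            # one row of the Jacobian per constraint function
            print(f"GLC gradient norms: {np.linalg.norm(gradient, axis=1)}")

    # used in place of glc:
    #   solver.add_inequality_mconstraint(
    #       lambda r, x, g: glc_logged(r, x, g, beta),
    #       [1e-8] * opt.nf,
    #   )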

When you do play with a1, do so by powers of 10. As $a_1\to0$, the constraint becomes independent of the dummy parameter (i.e., the optimizer won't even consider $t$ when trying to bring the GLC constraint functions below 0). As $a_1\to1$, the constraint starts to share the same weighting as the FOMs themselves.

(A design with a 50 nm minimum length is measured to actually be 62.5 nm according to @mawc2019's ruler. This is a large discrepancy which would be good to address.)

What is this error relative to the resolution? As things go subpixel, the ruler is only so accurate...

mawc2019 commented 2 years ago

A design with a 50 nm minimum length is measured to actually be 62.5 nm according to @mawc2019's ruler. This is a large discrepancy which would be good to address.

The results of the method are affected by implementation details. I have other versions based on cv2 and scipy.ndimage, which are packages that can perform morphological transformations. The result obtained with these versions is 56.25 nm. However, cv2 does not support 3D calculations, and scipy.ndimage is costly in 3D even when the 3D array is not large. Because no genuinely 3D design patterns are involved in this testbed project, maybe we should switch ruler.py to the version based on cv2 or scipy.ndimage?
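
For reference, the core idea of the scipy.ndimage version is morphological opening with disks of increasing diameter. A simplified sketch (not the actual ruler.py; it only measures solid features of a binarized 2D design, and tol_pixels absorbs the rounding of pixelated boundaries) looks like this:

    import numpy as np
    from scipy import ndimage

    def disk(radius_px):
        """Disk-shaped structuring element with the given pixel radius."""
        y, x = np.ogrid[-radius_px:radius_px + 1, -radius_px:radius_px + 1]
        return x * x + y * y <= radius_px * radius_px

    def min_solid_lengthscale(design, pixel_size, tol_pixels=0):
        """The smallest disk whose morphological opening erases part of the
        design sets (approximately) the minimum solid feature size."""
        solid = design > 0.5
        for radius_px in range(1, max(solid.shape) // 2):
            opened = ndimage.binary_opening(solid, structure=disk(radius_px))
            if np.count_nonzero(opened != solid) > tol_pixels:
                return (2 * radius_px + 1) * pixel_size  # disk diameter
        return max(solid.shape) * pixel_size

The void length scale is obtained the same way with binary_closing (or by opening the complement), and the reported minimum length scale is the smaller of the two.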

What is this error relative to the resolution? As things go subpixel, the ruler is only so accurate...

The resolution is 100, which means the pixel size is 10 nm. So the error, 62.5 - 50 = 12.5 nm, is slightly above the size of one pixel. I think such an error is not unexpected.

stevengj commented 2 years ago

run a single forward run before each epoch and manually set the epigraph variable to the largest value of the objective function over the six wavelengths

Note that it is always better for CCSA to start with a feasible starting point (one which satisfies the constraints). (Technically, it is not guaranteed to converge otherwise, although we do attempt to minimize the constraints to reach the feasible region.) In the case of an epigraph formulation, this means that you typically want to start each epoch with a dummy variable that equals the min/max of the functions you are maximizing/minimizing.
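
To restate the setup above in math: each epoch solves

$$\min_{t,\,x}\; t \quad \text{s.t.} \quad F(x,\lambda_k) - t \le 0, \quad k = 1,\ldots,6,$$

with $F = R + (1-T)$, so a feasible start simply means initializing $t_0 = \max_k F(x_0,\lambda_k)$, which is exactly what the single forward run before each epoch computes.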

oskooi commented 2 years ago

So something now seems wrong with that first epoch. It shouldn't take 40+ iterations for it to start making progress on the FOM... Are you sure you are pulling the right value for the dummy parameter (t)?

The range of the objective function $F=R+(1-T)$ is $[0, 2.0]$ (i.e., min: $R=0$, $T=1$; max: $R=1$, $T=0$). Based on this, it would seem that the optimizer should drive the epigraph variable $t$ to lie in the range $[\Delta, 2.0+\Delta]$, where $\Delta$ is a small number (e.g., 0.1), in order to satisfy the constraint $F(\lambda) - t < 0$ at each of the six wavelengths.

However, what I observe in my runs is that when the bounds on $t$ are removed (lb = np.insert(lb, 0, -np.inf), ub = np.insert(ub, 0, np.inf) in the script): (1) $t$ becomes negative during certain iterations of the first epoch and (2) in the final epoch when glc is activated, $t > 5.0$ and remains practically unchanged regardless of the value of the glc hyperparameter a1 (varied from 1e-3 to 1e-6). A negative $t$ is a clear violation of the constraint. Note that we are forcing the initial $t$ of each of the three epochs to satisfy the constraint; in this case, these values are 1.20, 0.33, and 0.11, which are decreasing.

There does not seem to be a bug in the setup of the minimax optimization using the epigraph formulation. Yet the results from (1) and (2) are unexpected.

mawc2019 commented 2 years ago

So something now seems wrong with that first epoch. It shouldn't take 40+ iterations for it to start making progress on the FOM... Are you sure you are pulling the right value for the dummy parameter (t)?

This behavior does not seem to be related to the dummy variable, but rather to small and incorrect adjoint gradients. As mentioned here, the adjoint gradient is small and incorrect at the uniform structure, which is x = 0.5*np.ones(Nx*Ny) in that test example. If the optimization starts from a random structure, the flat curve does not appear in the initial stage, as shown below.

[figure: convergence history starting from a random initial structure]
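
(For concreteness, the random start here is just something like the following, instead of the uniform 0.5 array; the seed value is arbitrary:)

    rng = np.random.default_rng(0)      # any fixed seed, for reproducibility
    x = rng.random(Nx * Ny)             # random initial weights in [0, 1]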

smartalecH commented 2 years ago

As mentioned in https://github.com/NanoComp/photonics-opt-testbed/issues/22#issuecomment-1267321469, the adjoint gradient is small and incorrect at the uniform structure, which is x = 0.5*np.ones(Nx*Ny)

Like we've discussed before, the current subpixel averaging code for material grids grows increasingly ill-conditioned as the norm of the spatial gradient of the material grid approaches zero (which is the case here, when u=0.5 everywhere).

We should turn off subpixel smoothing for the material grid here. You aren't taking advantage of its features anyway.
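
Something along these lines, assuming the design region is built from an mp.MaterialGrid (the parameter names are from the Meep Python interface; Nx, Ny and the two media are placeholders here):

    import meep as mp

    # turn off subpixel smoothing/averaging of the material grid; any
    # projection is instead applied in mapping() before the weights are
    # passed to the design region
    design_variables = mp.MaterialGrid(
        mp.Vector3(Nx, Ny),
        mp.air,
        mp.Medium(index=3.45),
        do_averaging=False,
        beta=0,
    )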

mawc2019 commented 2 years ago

As mentioned here, I turned off the smoothing, but the adjoint gradient was still incorrect.

oskooi commented 2 years ago

Here is a plot for a typical run of the objective function $F_{\max}(\lambda)$ and epigraph variable $t$ as a function of the iteration number. Each epoch consists of 50 iterations. Something is clearly wrong.

Epoch 1 and 2

[figure: optimal_design_minlen50_11_eval_hist]

Epoch 3: apply minimum linewidth constraint

[figure: optimal_design_minlen50_11_eval_hist_b]