Problems in Waveguide Mode Converter

mawc2019 commented 2 years ago

When the current master branch of Meep is used to run mode_converter_meep.py, Signal: Segmentation fault (11) occurs. Additionally, there is Signal code: Invalid permissions (2) if the code is run with one process, and there is Signal code: Address not mapped (1) if the code is run with two processes.

When v1.23.0 is used, such errors do not appear. Based on this version of Meep, the worst-case reflection and transmission for the two given design patterns are as follows.

# Worst-case reflection (dB), Worst-case transmission (dB)
-61.76, -0.52
-69.12, -0.40

mawc2019 commented 2 years ago

Unfortunately, when the EigenmodeCoefficient adjoint solver in v1.23.0 is applied to this case, the finite-difference and adjoint gradients are inconsistent, even if decay_by=1e-12 is imposed. If eig_parity = mp.ODD_Z is changed to eig_parity = mp.NO_PARITY, the finite-difference and adjoint gradients become consistent. However, changing eig_parity from mp.ODD_Z to mp.NO_PARITY alters the transmission significantly, which makes the code fail to produce the expected results. Any suggestions, @oskooi ?

The example code is as follows.

import meep as mp
import meep.adjoint as mpa
import numpy as np
from autograd import numpy as npa
from matplotlib import pyplot as plt

resolution = 20          # pixels/μm
w = 0.4          # waveguide width
l = 3.0          # waveguide length (on each side of design region)
dpad = 0.6       # padding length above/below design region
dpml = 1.0       # PML thickness
dx, dy = 1.6, 1.6         # length and width of design region

sx, sy = dpml+l+dx+l+dpml, dpml+dpad+dy+dpad+dpml
cell_size = mp.Vector3(sx,sy,0)
pml_layers = [mp.PML(thickness=dpml)]

# pulsed source center frequency and bandwidth
wvl_min, wvl_max = 1.26, 1.30
frq_min, frq_max = 1/wvl_max, 1/wvl_min
fcen, df = 0.5*(frq_min+frq_max), frq_max-frq_min

eig_parity = mp.ODD_Z

src_pt = mp.Vector3(-0.5*sx+dpml,0)
sources = [mp.EigenModeSource(src=mp.GaussianSource(fcen,fwidth=df),
                              size=mp.Vector3(0,sy),center=src_pt,eig_band=1,eig_parity=eig_parity)]

SiO2,Si = mp.Medium(index=1.5),mp.Medium(index=3.5)

Nx,Ny = int(resolution*dx),int(resolution*dy)
design_variables = mp.MaterialGrid(mp.Vector3(Nx, Ny), SiO2, Si, grid_type="U_MEAN")
design_region = mpa.DesignRegion(design_variables, volume=mp.Volume(center=mp.Vector3(), size=mp.Vector3(dx,dy)))
geometry = [mp.Block(size=mp.Vector3(sx,w),center=mp.Vector3(),material=Si),
            mp.Block(size=design_region.size,center=design_region.center,material=design_variables)]

sim = mp.Simulation(resolution=resolution,default_material=SiO2,cell_size=cell_size,sources=sources,k_point=mp.Vector3(),
                    geometry=geometry,boundary_layers=pml_layers)

tran_pt = mp.Vector3(0.5*sx-dpml-0.5*l)
emc_right = mpa.EigenmodeCoefficient(sim,mp.Volume(center=tran_pt,size=mp.Vector3(0,sy,0)),2,eig_parity=eig_parity)
ob_list = [emc_right]

def J(emc): return npa.abs(emc)**2

opt = mpa.OptimizationProblem(simulation=sim,objective_functions=J,objective_arguments=ob_list,
                              design_regions=[design_region],fcen=fcen,df = 0,nf = 1,decay_by=1e-12)

x = 0.5*np.ones(Nx*Ny)
opt.update_design([x])
f0, dJ_dn = opt()

db,choose = 1e-4,3
np.random.seed(20220505)

g_discrete, idx = opt.calculate_fd_gradient(num_gradients=choose,db=db)
g_discrete = np.array(g_discrete).flatten()
g_adjoint = dJ_dn.flatten()

print("Randomly selected indices: ",idx)
print("Finite-difference gradient: ",g_discrete)
print("Adjoint gradient: ",g_adjoint[idx])

plt.plot(g_discrete,g_adjoint[idx],"ro")
plt.plot(g_discrete,g_discrete,"g-",label="y=x")
plt.xlabel("Finite-difference gradient")
plt.ylabel("Adjoint gradient")
plt.legend();

oskooi commented 2 years ago

I am also seeing the segmentation fault error in EigenModeSource of python/tests/test_diffracted_planewave.py in NanoComp/meep#2226. I suspect the cause of this error is not related to Meep since nothing in fields::get_eigenmode has changed recently. (This unit test passes on my local machine with Meep built from the master branch.)

oskooi commented 2 years ago

@mawc2019, would try modifying one line in mode_converter_meep.py and see whether the error goes away:

change

https://github.com/NanoComp/photonics-opt-testbed/blob/b44524f657d29bf00dfc50f9c3f2ba3b32c7143a/waveguide_mode_converter/mode_converter_meep.py#L45

to

sources = [mp.EigenModeSource(mp.GaussianSource(fcen,fwidth=df),

src is not a keyword argument in the EigenModeSource class object constructor and so needs to be removed.

mawc2019 commented 2 years ago

src is not a keyword argument in the EigenModeSource class object constructor and so needs to be removed.

src is a keyword in EigenModeSource as shown here. Besides, in many built-in tests, the keyword src is used in the same way.

would try modifying one line in mode_converter_meep.py and see whether the error goes away

The error remains. Instead, if I put import numpy as np after import meep as mp, i.e., using

import meep as mp
import numpy as np

rather than the opposite, the error goes away. I remember I saw the same situation previously, but neglected it after finding import meep should precede import numpy. That being said, this sequence issue alone does not cause error, but may cause error in the presence of EigenModeSource.

However, even if the segmentation fault is temporarily avoided, the issue in the adjoint solver mentioned above remains in the current master branch.

oskooi commented 2 years ago

Does the segmentation fault still occur if you run the script using a single process (mpirun -np 1 python3 mode_converge_meep.py) or the serial version of Meep?

mawc2019 commented 2 years ago

The segmentation fault occurs when I run the script using python mode_converge_meep.py, mpirun -np 1 python mode_converge_meep.py, or more processes on the parallel version of Meep. It also occurs on the serial version of Meep that was compiled without --with-mpi.

stevengj commented 2 years ago

If eig_parity = mp.ODD_Z is changed to eig_parity = mp.NO_PARITY, the finite-difference and adjoint gradients become consistent.

This just changes the mode numbering. The first and second odd-z modes are (I think?) the first and third modes overall, so if you change to mp.NO_PARITY then you need to change the mode number.

By not changing the mode number and switching to mp.NO_PARITY, you are probably changing the polarization of the mode, so maybe the gradients are working for one polarization but not the other?

smartalecH commented 2 years ago

Try turning material grid smoothing off

design_variables = mp.MaterialGrid(mp.Vector3(Nx, Ny), SiO2, Si, grid_type="U_MEAN",do_averaging=False)

mawc2019 commented 2 years ago

Using the example above, the test results with or without do_averaging=False are largely the same.

With do_averaging=False:

Finite-difference gradient:  [-2.02378476e-08 -6.85727820e-05 -3.48323247e-04]
Adjoint gradient:  [-2.12350247e-08  1.00050055e-08  8.24053750e-08]

Without do_averaging=False:

Finite-difference gradient:  [ 1.89503592e-07 -6.86242160e-05 -3.46535295e-04]
Adjoint gradient:  [-2.12350069e-08  1.00049975e-08  8.24053022e-08]

mawc2019 commented 2 years ago

When a random structure instead of a uniform structure in the design region is used, the gradients are much larger, and the finite-difference and adjoint gradients become consistent in this test. So we may need to start the optimization from a random initial guess.

However, unlike the above test that deals with a single frequency, our waveguide mode converter deals with multiple frequencies. In this multi-frequency case, the EigenmodeCoefficient adjoint solver still gives incorrect adjoint gradients at some frequencies even if the structure is random.

mawc2019 commented 2 years ago

The issue of incorrect adjoint gradients can be temporarily avoided by using a low resolution for FDTD simulation. For example, I tried to use resolution=50 for FDTD simulation and design_resolution=100 for the material grid. However, after optimization runs a few hundred steps, meep: simulation fields are NaN or Inf often appears, which stops the optimization. This error message is accompanied by RuntimeWarning: invalid value encountered in true_divide in tanh_projection, and all design parameters become nan after this operation. However, when I tried to repeat the error by running tanh_projection separately (with the inputs that had caused trouble), this error did not appear. In addition, the denominator of tanh_projection does not seem possible to be zero. Any suggestions, @oskooi ?

oskooi commented 2 years ago

See #25 for a working demonstration involving worst-case (minimax) optimization.

NanoComp / photonics-opt-testbed

Problems in Waveguide Mode Converter #22