I noticed in your setup that you are gradually ramping up β with values 8, 16, 32, and 64. Out of curiosity, what happens to the final result if you just perform a single epoch using β = 64?
In my setup I use β = [8.0, 16.0, 32.0, ∞]
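Schematically, the continuation loop looks something like this (a minimal sketch; `run_epoch` and the grid size are placeholders, not the notebook's actual API):

```julia
# Minimal sketch of the β-continuation schedule. `project` is the standard
# smoothed-Heaviside (tanh) projection; at β = ∞ it reduces to a hard threshold.
function project(ρ, β; η = 0.5)
    isinf(β) && return Float64.(ρ .> η)          # hard threshold in the β = ∞ limit
    (tanh(β * η) .+ tanh.(β .* (ρ .- η))) ./
        (tanh(β * η) + tanh(β * (1 - η)))        # smoothed Heaviside otherwise
end

run_epoch(ρ, β) = ρ          # placeholder: one full optimization epoch at fixed β

ρ = fill(0.5, 100, 100)      # uniform 0.5 initial design
for β in (8.0, 16.0, 32.0, Inf)
    global ρ = run_epoch(ρ, β)
end
```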
I considered a few cases. When I use only a single epoch at β=∞, the geometry always seems to converge to a structure with an objective value of ~30 that looks like:
When I consider two epochs at β = [8.0, ∞], the device is similar to the four-epoch result, but the performance is moderately degraded (g = 130):
The initial design uses 0.5 everywhere in the design grid. What happens if you use several different random initial designs and run for a single epoch using β=∞?
It would be useful to determine whether we just happen to be converging to a poor local optimum with the constant initial design, or whether something about the subpixel smoothing itself (or the way it is set up, i.e. the choice of smoothing radius) is affecting the convergence.
When I initialize the geometry with a topology similar to the target device, it converges to a shape similar to what I posted above for the shape opt. If instead I begin with a completely random material distribution, it struggles far more and converges to a random-looking design with very poor performance:
At this point the gradients are tiny even though it has barely learned anything, so it stops. But I think this makes sense: the interior of the device always has zero gradient, and the material distribution is random, so changing only the boundary and expecting it to develop something useful seems tricky.
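To make the zero-interior-gradient point concrete: with the standard tanh projection (I'm assuming the same convention as meep here),

$$\hat\rho = \frac{\tanh(\beta\eta) + \tanh\big(\beta(\bar\rho - \eta)\big)}{\tanh(\beta\eta) + \tanh\big(\beta(1 - \eta)\big)}, \qquad \frac{\partial \hat\rho}{\partial \bar\rho} = \frac{\beta\,\operatorname{sech}^2\big(\beta(\bar\rho - \eta)\big)}{\tanh(\beta\eta) + \tanh\big(\beta(1 - \eta)\big)},$$

so as $\beta \to \infty$ the derivative vanishes wherever $\bar\rho \neq \eta$: all of the gradient concentrates on the $\bar\rho = \eta$ level set, i.e. the material boundary, which is exactly the interfacial term that subpixel smoothing keeps finite.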
If I try random topologies rather than completely random distributions, by filtering and projecting the initial design vector before starting the optimization (roughly along the lines of the sketch below), the results are similar:
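By "filtering and projecting" I mean something like the following (a rough sketch with an assumed conic filter; the function, radius, and grid size are illustrative, not the notebook's exact code):

```julia
using Random

# Smooth per-pixel noise with a conic (hat) filter of radius R pixels, then
# threshold, so the starting design has finite-size features rather than
# single-pixel noise.
function conic_filter(ρ::AbstractMatrix, R::Int)
    Nx, Ny = size(ρ)
    out = similar(ρ, Float64)
    for i in 1:Nx, j in 1:Ny
        num = 0.0
        den = 0.0
        for di in -R:R, dj in -R:R
            ii, jj = i + di, j + dj
            (1 <= ii <= Nx && 1 <= jj <= Ny) || continue
            d = hypot(di, dj)
            d <= R || continue
            w = 1 - d / R                # conic weight, zero at the filter edge
            num += w * ρ[ii, jj]
            den += w
        end
        out[i, j] = num / den
    end
    return out
end

Random.seed!(1234)
ρ0 = rand(100, 100)                      # completely random distribution
ρ0_topo = conic_filter(ρ0, 5) .> 0.5     # random topology: filter, then project
```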
@oskooi note that if β=∞, you can't optimize a design with 0.5 everywhere. The optimizer won't do anything. You need to add some noise.
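With a uniform 0.5 design, the β = ∞ threshold produces no interfaces at all, so there is no boundary for the smoothed gradient to act on. Something as simple as this is enough (sketch; the grid size and noise amplitude are arbitrary):

```julia
Nx, Ny = 100, 100                           # arbitrary design-grid size
ρ0 = fill(0.5, Nx, Ny)                      # uniform design: no interfaces at β = ∞
ρ0 .+= 0.01 .* (rand(Nx, Ny) .- 0.5)        # small noise breaks the symmetry
```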
Referencing our discussion in meep, which will require an update here too.
Upon updating the code to reflect the fix, I tried the shape opt again:
It now runs significantly longer than before and achieves better performance, but it's still converging to these strange solutions (especially looking at the parameters in the lower right). It's also updating incredibly slowly, which leads me to believe the smoothing radius and quadrature order might need further tuning relative to the grid for the shape opt to converge to the correct solution...
This notebook contains the subpixel smoothing code written in Julia and illustrated with the Focusing 2D example.
The Focusing 2D example is run with the degrees of freedom placed on a 2D rectilinear grid. Smoothing and projection take place on this grid using the same functions found in meep. The grid is then interpolated onto the FE space with a simple bilinear interpolation.
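For reference, the interpolation step looks something like the following (a minimal sketch, not the notebook's exact code; it assumes grid values at the nodes of a uniform grid covering [0, Lx] × [0, Ly]):

```julia
# Bilinearly interpolate the design grid ρ at an arbitrary point (x, y),
# e.g. an FE quadrature point inside the design region.
function bilinear(ρ::AbstractMatrix, x, y, Lx, Ly)
    Nx, Ny = size(ρ)
    # fractional grid coordinates, clamped to the grid interior
    gx = clamp(1 + (Nx - 1) * x / Lx, 1.0, Float64(Nx))
    gy = clamp(1 + (Ny - 1) * y / Ly, 1.0, Float64(Ny))
    i = min(floor(Int, gx), Nx - 1)
    j = min(floor(Int, gy), Ny - 1)
    tx, ty = gx - i, gy - j
    return (1 - tx) * (1 - ty) * ρ[i, j]   + tx * (1 - ty) * ρ[i+1, j] +
           (1 - tx) * ty       * ρ[i, j+1] + tx * ty       * ρ[i+1, j+1]
end

# e.g. evaluate the design at one quadrature point:
ρ = rand(64, 64)
val = bilinear(ρ, 0.3, 0.7, 1.0, 1.0)
```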
The smoothing radius used is 10 nm. A large quadrature order (10) is used in the design region to resolve the subpixel features while keeping the rectilinear pixels sufficiently small.
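If the FE side is set up with Gridap (an assumption here; adjust to the actual stack), the quadrature order is just the degree passed to the measure:

```julia
using Gridap

model = CartesianDiscreteModel((0, 1, 0, 1), (20, 20))  # toy stand-in for the design region
Ω = Triangulation(model)
dΩ = Measure(Ω, 10)   # degree-10 quadrature so integration resolves intra-cell smoothed features
```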
The original notebook achieves an objective value of about 173 using order 2 quadrature and 155 using order 4. This notebook achieves an objective value of 168 with the rectilinear grid and subpixel smoothing formulation. The optimized device is also very similar.
(cc @smartalecH)