todo: load and dump structure for dispersive materials

oskooi commented 6 years ago

There is a bug in the load and dump structure routines for the case of dispersive materials (the non-dispersive case is fine). This is demonstrated in the following script which is a 2d simulation with a block of aluminum (taken from the materials library) and a point dipole source at the center of the cell. The Ez field is output at a random point in the computational cell at every 5 time units during the time stepping.

The simulation consists of two separate runs: (1) the structure is defined using a geometry object and the structure is saved to an HDF5 file via dump_structure at the end of the time stepping, and (2) the structure is defined using load_structure. The fields from both runs should be nearly identical but they are not as shown in the output which follows the script.

Script

import meep as mp

# default unit length is 1 um                                                                                                                        
um_scale = 1.0

# conversion factor for eV to 1/um [=1/hc]                                                                                                           
eV_um_scale = um_scale/1.23984193

metal_range = mp.FreqRange(min=um_scale/12.4, max=um_scale/0.2)

Al_plasma_frq = 14.98*eV_um_scale
Al_f0 = 0.523
Al_frq0 = 1e-10
Al_gam0 = 0.047*eV_um_scale
Al_sig0 = Al_f0*Al_plasma_frq**2/Al_frq0**2
Al_f1 = 0.227
Al_frq1 = 0.162*eV_um_scale      # 7.654 um                                                                                                          
Al_gam1 = 0.333*eV_um_scale
Al_sig1 = Al_f1*Al_plasma_frq**2/Al_frq1**2
Al_f2 = 0.050
Al_frq2 = 1.544*eV_um_scale      # 0.803 um                                                                                                          
Al_gam2 = 0.312*eV_um_scale
Al_sig2 = Al_f2*Al_plasma_frq**2/Al_frq2**2
Al_f3 = 0.166
Al_frq3 = 1.808*eV_um_scale      # 0.686 um                                                                                                          
Al_gam3 = 1.351*eV_um_scale
Al_sig3 = Al_f3*Al_plasma_frq**2/Al_frq3**2
Al_f4 = 0.030
Al_frq4 = 3.473*eV_um_scale      # 0.357 um                                                                                                          
Al_gam4 = 3.382*eV_um_scale
Al_sig4 = Al_f4*Al_plasma_frq**2/Al_frq4**2

Al_susc = [mp.DrudeSusceptibility(frequency=Al_frq0, gamma=Al_gam0, sigma=Al_sig0),
           mp.LorentzianSusceptibility(frequency=Al_frq1, gamma=Al_gam1, sigma=Al_sig1),
           mp.LorentzianSusceptibility(frequency=Al_frq2, gamma=Al_gam2, sigma=Al_sig2),
           mp.LorentzianSusceptibility(frequency=Al_frq3, gamma=Al_gam3, sigma=Al_sig3),
           mp.LorentzianSusceptibility(frequency=Al_frq4, gamma=Al_gam4, sigma=Al_sig4)]

Al = mp.Medium(epsilon=1.0, E_susceptibilities=Al_susc, valid_freq_range=metal_range)

resolution = 50                  # pixels/um                                                                                                         

cell_size = mp.Vector3(5,5,0)

geometry = [mp.Block(material=Al, size=mp.Vector3(1,1,mp.inf))]

sources = [mp.Source(mp.GaussianSource(1, fwidth=0.2), component=mp.Ez, center=mp.Vector3())]

sim = mp.Simulation(resolution=resolution,
                    cell_size=cell_size,
                    geometry=geometry,
                    sources=sources)

def print_stuff(sim):
    p = sim.get_field_point(mp.Ez, mp.Vector3(0.12,-0.29))
    print("field:, {}, {:.8f}".format(sim.round_time(),p.real))

sim.run(mp.at_every(5, print_stuff), until=50)
sim.dump_structure('geometry.h5')

sim.reset_meep()

sim = mp.Simulation(resolution=resolution,
                    cell_size=cell_size,
                    load_structure='geometry.h5',
                    sources=sources)

sim.run(mp.at_every(5, print_stuff), until=50)

Output

Using MPI version 3.0, 1 processes
-----------
Initializing structure...
Working in 2D dimensions.
Computational cell is 5 x 5 x 0 with resolution 50
     block, center = (0,0,0)
          size (1,1,1e+20)
          axes (1,0,0), (0,1,0), (0,0,1)
          dielectric constant epsilon diagonal = (1,1,1)
time for set_epsilon = 0.0525229 s
lorentzian susceptibility: frequency=2.80116, gamma=2.72777
lorentzian susceptibility: frequency=1.45825, gamma=1.08966
lorentzian susceptibility: frequency=1.24532, gamma=0.251645
lorentzian susceptibility: frequency=0.130662, gamma=0.268583
drude susceptibility: frequency=1e-10, gamma=0.0379081
-----------
field:, 5.0, -0.00000012
field:, 10.0, 0.00000004
field:, 15.0, -0.00000004
field:, 20.0, -0.00000002
field:, 25.0, -0.00000000
field:, 30.0, 0.00000001
field:, 35.0, 0.00000000
field:, 40.0, 0.00000004
field:, 45.0, 0.00000000
field:, 50.0, 0.00000004
run 0 finished at t = 50.0 (5000 timesteps)

Field time usage:
     connecting chunks: 0.00229096 s
         time stepping: 3.21455 s
         communicating: 0.0209296 s
    Fourier transforming: 0.00016284 s
       everything else: 0.0453818 s

-----------
Initializing structure...
Working in 2D dimensions.
Computational cell is 5 x 5 x 0 with resolution 50
time for set_epsilon = 0.048687 s
-----------
field:, 5.0, -0.00006323
field:, 10.0, -0.00258061
field:, 15.0, -0.03768376
field:, 20.0, -0.19705636
field:, 25.0, -0.36507856
field:, 30.0, -0.21040063
field:, 35.0, 0.06384776
field:, 40.0, 0.11168341
field:, 45.0, 0.00354841
field:, 50.0, 0.06331689
run 0 finished at t = 50.0 (5000 timesteps)

Elapsed run time = 3.9406 s

Field time usage:
     connecting chunks: 0.00113988 s
         time stepping: 0.508693 s
         communicating: 0.0168583 s
    Fourier transforming: 0.000128031 s
       everything else: 0.01789 s

Note the large discrepancy in the time spent on "time stepping" for the two runs: 3.21455 s vs. 0.508693 s. This suggests that the permittivity values defined on the grid are different.

If we switch the block's material from Al to mp.Medium(index=3.5) and rerun the simulation, the results are nearly identical as expected. The time spent on "time stepping" is also nearly identical: 0.545928 s vs. 0.544215 s.

Output

Using MPI version 3.0, 1 processes
-----------
Initializing structure...
Working in 2D dimensions.
Computational cell is 5 x 5 x 0 with resolution 50
     block, center = (0,0,0)
          size (1,1,1e+20)
          axes (1,0,0), (0,1,0), (0,0,1)
          dielectric constant epsilon diagonal = (12.25,12.25,12.25)
time for set_epsilon = 0.0524268 s
-----------
field:, 5.0, -0.00006018
field:, 10.0, -0.00242619
field:, 15.0, -0.03554589
field:, 20.0, -0.18655190
field:, 25.0, -0.33592261
field:, 30.0, -0.18451606
field:, 35.0, -0.07908586
field:, 40.0, -0.36459555
field:, 45.0, -0.62255526
field:, 50.0, -0.34560935
run 0 finished at t = 50.0 (5000 timesteps)

Field time usage:
     connecting chunks: 0.00114298 s
         time stepping: 0.545928 s
         communicating: 0.0155475 s
    Fourier transforming: 0.000156164 s
       everything else: 0.017828 s

-----------
Initializing structure...
Working in 2D dimensions.
Computational cell is 5 x 5 x 0 with resolution 50
time for set_epsilon = 0.048739 s
-----------
field:, 5.0, -0.00006018
field:, 10.0, -0.00242619
field:, 15.0, -0.03554589
field:, 20.0, -0.18655188
field:, 25.0, -0.33592254
field:, 30.0, -0.18451596
field:, 35.0, -0.07908586
field:, 40.0, -0.36459558
field:, 45.0, -0.62255508
field:, 50.0, -0.34560904
run 0 finished at t = 50.0 (5000 timesteps)

Field time usage:
     connecting chunks: 0.00114107 s
         time stepping: 0.544215 s
         communicating: 0.0155528 s
    Fourier transforming: 0.000146389 s
       everything else: 0.0180089 s

oskooi commented 6 years ago

This is not a bug but rather a feature that is not currently supported (it is stated as such in the documentation). Given that initializing structures with dispersive materials is often more time and compute intensive than non-dispersive ones, it would be good to extend this feature to support these class of materials. An example application is to model recombining excitons in organic light-emitting diodes (OLEDs) via stochastic dipole sources.

stevengj commented 6 years ago

This is do-able. I think you would need to:

Add dump and load methods to the susceptibility class and subclasses thereof that take an open h5file object and a dataset prefix, and write out the susceptibility.sigma data to a prefix_sigma and prefix_numsigma in exactly the same way that structure::dump outputs chi1inv and num_chi1inv now. Also output a prefix_params data set that contains the other parameters needed to reconstruct the susceptibility, like the omega_0 and gamma parameters of a lorenzian_susceptibility.
- the sigma array is structured in exactly the same way as the chi1inv array, with gv.ntot() elements per non-NULL component.
Have structure::dump loop through all of its susceptibilities and call their dump methods, with a unique prefix argument for each. structure::load should do the inverse.

stevengj commented 6 years ago

Parallelization: the key thing to realize is that, in parallel HDF5, creating the dataset has to be done once, by all processes together. Once this is done, individual processes can write non-overlapping portions of the data in parallel.

One way to do this here would be:

First loop over all of the is_mine chunks (the chunks on the current process), and all of the susceptibilities of the chunks, and add up the total number of points my_ntot.
Then, similar to structure::dump, compute my_start and ntotal by summing over all processes. Use this to create a dataset of size ntotal
Then loop over the is_mine chunks again and write the susceptibility data to the dataset at offset my_start.

NanoComp / meep

todo: load and dump structure for dispersive materials #418