kthpanor / vlxman

8 stars 3 forks source link

Properly loading .h5 file into Jupyter notebook #3

Closed giammi56 closed 1 year ago

giammi56 commented 1 year ago

I am running some UV/vis calculations using veloxchem on a cluster and collecting the .h5 files. In particular, I am interested in the absorption properties of a molecule in a specific frequency range:

@response
property: absorption (cpp)
! frequency region (and resolution)
frequencies: 0.05-0.15 (0.0025)
damping: 0.0045563  ! this is the default value
@end

I am interested in using this example from echem as a quick way of displaying my data, but I'd like to "load" the calculated .h5 file. Is there a function to properly load a .h5 file instead of calculating it within a Jupyter notebook? If not, could you suggest a clever way to code it?

I am trying to extract the dipoles on the diagonal from the test.rsp.solutions.h5 file using a rudimental approach using h5py.File("test.rsp.solutions.h5", "r"), and storing in np arrays keys matching a specific patter. Obviously, the resulting structure of the array doesn't fit well with the example I'd like to follow.

Hopefully, it is not a totally silly question. Thank you!

kthpanor commented 1 year ago

Try:

>>> import h5py
>>> hf = h5py.File('cpp.rsp.solutions.h5', 'r')
>>> print(list(hf.keys()))
['atom_coordinates', 'basis_set', 'dft_func_label', 'nuclear_charges', 'nuclear_repulsion', 'potfile_text', 'x_x_0.00000000', 'x_x_0.05000000', 'x_x_0.10000000', 'x_x_0.15000000', 'x_x_0.20000000', 'x_x_0.25000000', 'x_y_0.00000000', 'x_y_0.05000000', 'x_y_0.10000000', 'x_y_0.15000000', 'x_y_0.20000000', 'x_y_0.25000000', 'x_z_0.00000000', 'x_z_0.05000000', 'x_z_0.10000000', 'x_z_0.15000000', 'x_z_0.20000000', 'x_z_0.25000000', 'y_x_0.00000000', 'y_x_0.05000000', 'y_x_0.10000000', 'y_x_0.15000000', 'y_x_0.20000000', 'y_x_0.25000000', 'y_y_0.00000000', 'y_y_0.05000000', 'y_y_0.10000000', 'y_y_0.15000000', 'y_y_0.20000000', 'y_y_0.25000000', 'y_z_0.00000000', 'y_z_0.05000000', 'y_z_0.10000000', 'y_z_0.15000000', 'y_z_0.20000000', 'y_z_0.25000000', 'z_x_0.00000000', 'z_x_0.05000000', 'z_x_0.10000000', 'z_x_0.15000000', 'z_x_0.20000000', 'z_x_0.25000000', 'z_y_0.00000000', 'z_y_0.05000000', 'z_y_0.10000000', 'z_y_0.15000000', 'z_y_0.20000000', 'z_y_0.25000000', 'z_z_0.00000000', 'z_z_0.05000000', 'z_z_0.10000000', 'z_z_0.15000000', 'z_z_0.20000000', 'z_z_0.25000000']
>>> import numpy as np
>>> rsp_vec_x_x_0 = np.array(hf.get('x_x_0.00000000'))
>>> print(rsp_vec_x_x_0.shape)
(360,)
giammi56 commented 1 year ago

Hi, thank you for your suggestion. Yes, it works to read the .h5 file as usual, but I am wondering why rsp_vec_x_x_0 is composed of 360 complex numbers

array([-3.18449874e-07+7.62493760e-08j, -3.01617664e-08+1.36178798e-07j,
        1.96107965e-07-2.99741630e-07j, ...,
       -8.88158191e-04-1.69609116e-04j,  7.82317716e-04+2.20814510e-04j,
       -4.52509313e-04+1.38462321e-04j])

and not a single one for each frequency as in the echem example. In the echem example:

cpp_prop = LinearAbsorptionCrossSection(
    {"frequencies": ",".join(freqs_str), "damping": 0.3 / au2ev}, method_settings
)
cpp_prop.init_driver(comm, ostream=silent_ostream)
cpp_prop.compute(molecule, basis, scf_drv.scf_tensors)

# Extract the imaginary part of the complex response function and convert to absorption cross section
sigma = []
for w in freqs:
    axx = -cpp_prop.rsp_property["response_functions"][("x", "x", w)].imag
    ayy = -cpp_prop.rsp_property["response_functions"][("y", "y", w)].imag
    azz = -cpp_prop.rsp_property["response_functions"][("z", "z", w)].imag
    alpha_bar = (axx + ayy + azz) / 3.0
    sigma.append(4.0 * np.pi * w * alpha_bar / 137.035999)

the content of cpp_prop.rsp_property["response_functions"]is:

('x', 'x', 0.25724525755505434): (-8.937010271963658-0.2192567148954479j),
 ('y',  'x',  0.25724525755505434): (-2.0616204730690196e-15-9.002530500035028e-16j),
 ('z',  'x',  0.25724525755505434): (6.1515303516845225e-15+9.515940880604661e-16j),
...

thus a single complex number per frequency. I couldn't find exactly how each i_j_frequncy is calculated, but I can imagine an integration step is missing to be consistent with the echem example. I would appreciate a hint from you on how to proceed. Thank you!

kthpanor commented 1 year ago

These are the response vectors. The scalar response function values are obtained as inner products between property gradient vectors and response vectors as illustrated on page

https://kthpanor.github.io/echem/docs/spec_prop/tddft_rpa.html#electric-dipole-polarizability

for conventional, real-values, response theory.

giammi56 commented 1 year ago

Thank you, the link explains it perfectly. Giving the fact that I can not run echem on my system (arm64), with which equivalent response property can I calculate the property gradient vectors?

kthpanor commented 1 year ago

It is explained on the same page

https://kthpanor.github.io/echem/docs/spec_prop/tddft_rpa.html#v-1-gradient

and as you can see the elements of the gradient vector are the elements of the ov-block of the corresponding operator in MO representation.

giammi56 commented 1 year ago

I wasn't able to explain myself correctly, I apologize. The page you referred to is a great explanation, but it is based on Jupyter notebook implementation. Although I have just saw it could be somehow adapted to be run with an arm64-friendly vlxenv environment, I am looking for a solution to be used without IDE.

Following this guide, I run successfully the following recipe with a large molecule evaluated on 40 frequencies:

@jobs
task: response
@end

@method settings
xcfun: b3lyp
basis: def2-svp
@end

@response
property: absorption (cpp)
! frequency region (and resolution)
frequencies: 0.05-0.15 (0.0025)
damping: 0.0045563  ! this is the default value
@end

@molecule
charge: 0
multiplicity: 1
units: au
xyz:  
...
@end 

It outputs 5 files:

my_job.out
my_job.rsp.h5
my_job.rsp.solutions.h5
my_job.scf.h5
my_job.scf.tensors.h5

Sigma values are already written at the end of the .out file. As just said, the .rsp.solutions.h5 file has the response vectors in the form i_j_freq. As pointed out in the .out file, the property integrals are stored in .rsp.h5. The latter file has the following keys: Keys: <KeysViewHDF5 ['CLR_bger_half_size', 'CLR_bung_half_size', 'CLR_e2bger_half_size', 'CLR_e2bung_half_size', 'atom_coordinates', 'basis_set', 'dft_func_label', 'nuclear_charges', 'nuclear_repulsion', 'potfile_text']> Assuming that the latter contains indeed the right property integrals in the MO-basis, what are the steps to adapt them in a form suitable to the response vectors in order to retrieve the scalar response function values? Thank you for your patience!

kthpanor commented 1 year ago

This forum is not suitable for this sort of guidance. Send a email directly to me and I try to offer a Zoom meeting with suitable participants. Best, Patrick Norman panor@kth.se