sometimes very wrong total number of electrons in cube file

TermeHansen commented 5 years ago

I use the Density cube file for further calculations, and in that further calculations the cube file sanity is checker by summing op the total number of electron based on the densitygrid. Sometimes I get very wrong results, and it is not just a matter of getting higher grid density.

I have added a small extra calculation of the same kind in csg.cc in psi, to check if it was psi4 side or in the calculation in the other program I use. They agree, so it is psi4 side.

    double sum_of_elecs = 0;
    for (size_t ind = 0; ind < npoints_; ind++) {sum_of_elecs +=v[ind];}
    double total_elecs = sum_of_elecs*D_[0]*D_[1]*D_[2];

example file:

molecule mol {
 0 1
   C           -0.000000000000     0.000000000000    -1.285087454295
    N           -0.000000000000     1.193800518377    -0.689363570963
    N           -0.000000000000     0.000000000000     1.378195289586
    C            0.000000000000    -1.112779051192     0.642308628677
    C           -0.000000000000     1.112779051192     0.642308628677
    N            0.000000000000    -1.193800518377    -0.689363570963
    CL           0.000000000000    -2.604296831956     1.503991149589
    CL          -0.000000000000     2.604296831956     1.503991149589
    CL          -0.000000000000     0.000000000000    -3.007607968024
 }
mol.update_geometry()
mol.symmetrize(0.01)

set reference rks
set scf_type df
set basis def2-SVP

E, wfn = energy('pbe',return_wfn=True)

set cubeprop_tasks ['density']
set CUBIC_GRID_OVERAGE [8.0, 8.0, 8.0]
set CUBIC_GRID_SPACING [0.12, 0.12, 0.12]
cubeprop(wfn)

give this in cube header:

Sum of electrons: 58527.1 number of electrons: 101.135

from the chargemol program

nvalence = 90.0000 pixelvolume = 1.7280E-03 numerically integrated valence density = 1.0113E+02 sum_valence_occupancy_correction = 0.0000E+00 checkme = 1.1135E+01 The electrons are not properly accounted for. Either the grid in your electron density input file is too coarse, you have specified the incorrect net charge in the chargemol_job.m file, or t$ Program will terminate

Further testing by changing on the grid spacing (worst case here above):

grid:0.2, error:5.2489   
grid:0.19, error:4.5155  
grid:0.18, error:2.4477  
grid:0.17, error:3.6323  
grid:0.16, error:2.3993  
grid:0.15, error:2.8355  

grid:0.14, error:2.4962  
grid:0.138, error:4.6381
grid:0.136, error:0.8501
grid:0.134, error:1.4051
grid:0.132, error:1.2419
grid:0.13, error:0.58084
grid:0.128, error:1.5786
grid:0.126, error:1.0912
grid:0.124, error:0.25624
grid:0.122, error:4.2158
grid:0.12, error:11.538
grid:0.118, error:3.1169
grid:0.116, error:1.4544
grid:0.114, error:1.4049
grid:0.112, error:1.502
grid:0.11, error:0.29599
grid:0.108, error:1.1679
grid:0.106, error:0.73735
grid:0.104, error:1.9971
grid:0.102, error:0.6448
grid:0.1, error:0.92615

grid:0.09, error:1.1189
grid:0.08, error:0.3121
grid:0.07, error:0.2545
grid:0.06, error:0.20924
grid:0.05, error:0.047781

I guess the interpolation of the density values for each point take place in the add_density function but I am not sure how I can minimize this behaviour to be sure get a more correct total electron density also with larger grid spacing, as a grid spacing of 0.05 gives VERY large files.

Regards

hokru commented 5 years ago

I keep hearing about this "problem" also elsewhere, but the fact is that the regular grid for cube files is ill fitted for integrating the density of steep regions (e.g. at the nuclei). python-based integrator for testing: https://gist.github.com/hokru/71c61f5afb2e5921b5b4955fed70f5db

cubes were made for visualization not for further analysis, iirc.

Does chargemol only work with cube files?

susilehtola commented 5 years ago

What kind of calculations are you talking about?

Your above grid convergence test tentatively shows that this really is a grid issue. The error becomes smaller for smaller grid spacings, but there are oscillations due to the egg-box effect: nuclei don't line up exactly with the grid points.

Chlorine is already pretty heavy, and has quite tight core functions. Getting these right in a cartesian grid is practically untractable. A 0.05 bohr grid is still very coarse; it's what is usually used in GPAW calculations i.e. valence-only.

TermeHansen commented 5 years ago

Thinking along that it is related to something like that, so I'm also starting to make a valence electrons density cube export to psi4 to test if that helps me in reducing problems in the core of the atoms for a grid space :)

TermeHansen commented 5 years ago

Hmm... can anybody give a bit of input into how I can access the Density Matrix (Da) at the basis function level (I have tried to do it like in the add_orbital() function), or if I can calculate the electron density from the orbital matrix (Ca)?

for the second option I guess I have to do something along this code in points.cc:

    // Rho_a = 2.0 * D_xy phi_xa phi_ya
    C_DGEMM('N', 'N', npoints, nlocal, nlocal, 2.0, phip[0], coll_funcs, D2p[0], nglobal, 0.0, Tp[0], nglobal);
    for (int P = 0; P < npoints; P++) {
        rhoap[P] = C_DDOT(nlocal, phip[P], 1, Tp[P], 1);
}

Regards

PeterKraus commented 5 years ago

In the "DUAL_DESCRIPTOR" keyword I implemented, I calculated the two densities of the frontier orbitals manually, here:

https://github.com/psi4/psi4/blob/cb341495898ff07b54cbe0f076ec5ccfe4f3918f/psi4/src/psi4/libcubeprop/csg.cc#L653

So that's one, tedious, option. I'd be interested if the orbital densities are computed elsewhere myself.

TermeHansen commented 5 years ago

Njae, your solution is true if there is one electron for each basis function. So follow up to that would be where to find the fraction of electrons per basis function?

PeterKraus commented 5 years ago

This might help: https://github.com/psi4/psi4/blob/cb341495898ff07b54cbe0f076ec5ccfe4f3918f/psi4/src/psi4/libcubeprop/cubeprop.cc#L74

andysim commented 5 years ago

Sorry, I only just saw this thread. I don't know if this is useful, but here's some code that computes a DFT wavefunction as a simple way to set up a DFT integration grid. It then uses that integration grid to integrate the electron density at the HF level in this example. Most of this was stolen from @dgasmith

import psi4
import numpy as np
np.set_printoptions(suppress=True, precision=4, linewidth=150)

mol = psi4.geometry("""
    O   0.000000000000     0.000000000000    -0.071143036192
    H   0.000000000000    -0.758215806856     0.564545805801
    H   0.000000000000     0.758215806856     0.564545805801
    symmetry c1
""")

psi4.set_options({"BASIS": "sto-3g",
                  "DFT_BLOCK_MAX_POINTS": 2048,
                  "DFT_BASIS_TOLERANCE": 1.e-10})

method = "HF"
e, wfn = psi4.energy(method, return_wfn=True)
Da = np.array(wfn.Da())

# Vpot builder
_, wfn_V = psi4.energy("SVWN", return_wfn=True, molecule=mol)
Vpot = wfn_V.V_potential()

points_func = Vpot.properties()[0]

# Loop over the blocks
nelectrons = 0.0
for b in range(Vpot.nblocks()):

    # Metadata
    block = Vpot.get_block(b)
    npoints = block.npoints()
    lpos = np.array(block.functions_local_to_global())

    # Obtain the grid weight
    w = np.array(block.w())

    # Compute phi!
    points_func.compute_points(block)
    phi = np.array(points_func.basis_values()["PHI"])[:npoints, :lpos.shape[0]]

    # Build a local density
    localDa = wfn.Da().np[(lpos[:, None], lpos)]

    # Copmute rho
    rho = 2.0 * np.einsum('pm,mn,pn->p', phi, localDa, phi, optimize=True)
    nelectrons += np.dot(w, rho)
print(nelectrons)

TermeHansen commented 5 years ago

Think I got it now, will clean up and make a PR at some point soon

TermeHansen commented 5 years ago

hmmm... thought I got it, but not sure how the order of indices should be for the basis functions, if this holds... did this before the add_density with D2 matrix instead. As far as I see in the points.cc this should work to calculate the density only using the passed basis functions in the D2 matrix. So how to select the right basis functions?

    auto D2 = std::make_shared<Matrix>("D2", primary_->nbf(), primary_->nbf());

    double** Dp = D->pointer();
    double** D2p = D2->pointer();

    for (int ml = 0; ml < indices.size(); ml++) {
        int mg = indices[ml];
        for (int nl = 0; nl <= ml; nl++) {
            int ng = indices[nl];
            double Dval = Dp[mg][ng];
            D2p[ml][nl] = Dval;
            D2p[nl][ml] = Dval;
        }
    }

dgasmith commented 5 years ago

The indices for basis functions are the order of shells offset by the size of each shell. So for example if I had SPP the indices would be (0, 1), (1, 4), (4, 7).

import psi4

mol = psi4.geometry("He")

basis = psi4.core.BasisSet.build(mol, target="cc-pVTZ")
offset = 0
for x in range(basis.nshell()):
    shell = basis.shell(x)
    nfunc = shell.nfunction
    print("Basis {}, AM = {}, indices  = ({}, {})".format(x, shell.am, offset, offset + nfunc))
    offset += nfunc

Basis 0, AM = 0, indices  = (0, 1)
Basis 1, AM = 0, indices  = (1, 2)
Basis 2, AM = 0, indices  = (2, 3)
Basis 3, AM = 1, indices  = (3, 6)
Basis 4, AM = 1, indices  = (6, 9)
Basis 5, AM = 2, indices  = (9, 14)

TermeHansen commented 5 years ago

Then there is something that is not done correct with the new passed D2 matrix to points..

molecule mol {
 0 1
O      0.000000     1.031228     1.125381
H      0.000000     1.785070     0.512177
H      0.000000     0.253722     0.542498
 }
set scf_type df
set basis cc-pVTZ

E, wfn = energy('pbe',return_wfn=True)
bas = wfn.basisset()
for x in range(bas.nshell()):
    shell = bas.shell(x)
    nfunc = shell.nfunction
    fi = shell.function_index
    ce = shell.ncenter
    print("Basis {}, AM = {}, center {}, indices  = {}".format(x, shell.am, ce, range(fi, fi + nfunc)))

prints:

Basis 0, AM = 0, center 0, indices  = [0]
Basis 1, AM = 0, center 0, indices  = [1]
Basis 2, AM = 0, center 0, indices  = [2]
Basis 3, AM = 0, center 0, indices  = [3]
Basis 4, AM = 1, center 0, indices  = [4, 5, 6]
Basis 5, AM = 1, center 0, indices  = [7, 8, 9]
Basis 6, AM = 1, center 0, indices  = [10, 11, 12]
Basis 7, AM = 2, center 0, indices  = [13, 14, 15, 16, 17]
Basis 8, AM = 2, center 0, indices  = [18, 19, 20, 21, 22]
Basis 9, AM = 3, center 0, indices  = [23, 24, 25, 26, 27, 28, 29]
Basis 10, AM = 0, center 1, indices  = [30]
Basis 11, AM = 0, center 1, indices  = [31]
Basis 12, AM = 0, center 1, indices  = [32]
Basis 13, AM = 1, center 1, indices  = [33, 34, 35]
Basis 14, AM = 1, center 1, indices  = [36, 37, 38]
Basis 15, AM = 2, center 1, indices  = [39, 40, 41, 42, 43]
Basis 16, AM = 0, center 2, indices  = [44]
Basis 17, AM = 0, center 2, indices  = [45]
Basis 18, AM = 0, center 2, indices  = [46]
Basis 19, AM = 1, center 2, indices  = [47, 48, 49]
Basis 20, AM = 1, center 2, indices  = [50, 51, 52]
Basis 21, AM = 2, center 2, indices  = [53, 54, 55, 56, 57]

if I use all basis functions in indices:

0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 
number of electrons: 9.93018

only Hydrogens:

30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 
number of electrons: 0.849095

only Oxygen;

0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 
number of electrons: 7.62676

dgasmith commented 5 years ago

My guess is that you are only allowing contributions when both basis functions are on the same center which removes the density where basis functions are on different centers.

import numpy as np
np.set_printoptions(suppress=True, precision=4, linewidth=150)

mol = psi4.geometry("""
    O   0.000000000000     0.000000000000    -0.071143036192
    H   0.000000000000    -0.758215806856     0.564545805801
    H   0.000000000000     0.758215806856     0.564545805801
    symmetry c1
""")

psi4.set_options({"BASIS": "cc-pVDZ",
                  "DFT_BLOCK_MAX_POINTS": 2048,
                  "DFT_BASIS_TOLERANCE": 1.e-14})

method = "PBE"
e, wfn = psi4.energy(method, return_wfn=True)
Da = np.array(wfn.Da())
Ca = np.array(wfn.Ca())

# Grab objects
basis = wfn.basisset()
Vpot = wfn.V_potential()
points_func = Vpot.properties()[0]

center = 2

for center in range(3):
    # Loop over the blocks
    nelectrons = 0.0
    for b in range(Vpot.nblocks()):

        # Metadata
        block = Vpot.get_block(b)
        npoints = block.npoints()
        lpos = np.array(block.functions_local_to_global())
        nlocal_basis = lpos.shape[0]

        oncenter = np.array([basis.function_to_center(x) == center for x in lpos])
        lpos = lpos[oncenter]

        # Obtain the grid weight
        w = np.array(block.w())

        # Compute phi!
        points_func.compute_points(block)
        phi = np.array(points_func.basis_values()["PHI"])[:npoints, :nlocal_basis]
        phi = phi[:, oncenter]

        # Build a local density
        localDa = Da[(lpos[:, None], lpos)]

        # Copmute rho
        rho = 2.0 * np.einsum('pm,mn,pn->p', phi, localDa, phi, optimize=True)
        nelectrons += np.dot(w, rho)
    print(center, nelectrons)

0 7.59001168019781
1 0.5717653447775689
2 0.5717641193430182

TermeHansen commented 5 years ago

I don't fully follow your lead. Your example adds up to 8.73 electrons, which is still short from the total of 9.93 (~10).

So coming back to the aim of this work. How do I selectively remove core electrons part in the density calculation - in this water example we could say the two inner electrons. But a better example could maybe be Methyl bromide :

C          0.17522       -0.00000       -0.00002
Br        -1.79078        0.00000       -0.00002
H          0.53852        0.00000        1.02768
H          0.53852       -0.89000       -0.51382
H          0.53852        0.89000       -0.51382

and remove the 10 core electrons of Bromide only.

dgasmith commented 5 years ago

A single basis function only covers a single center, but an element of a density matrix is made up of two basis functions which can exist on separate centers.

You may need to define a core density more, but usually we think of the "core" as molecular orbitals (not basis functions) that do not belong in the valence. In general you can remove the n lowest eigenvalues from a density as follows:

Da = np.array(wfn.Da())
Ca = np.array(wfn.Ca())

ncore = 5
D_valence = Da - np.einsum('pi,qi->pq', Ca[:, : ncore], Ca[:, : ncore])

# psi4.core.Matrix.from_array(D_valence) # Convert back to a Psi4 matrix object.

The lowest energy eigenvalues are typically the S densities close to a nuclei. From the above molecule we see an eigenvalue spectrum:

    Doubly Occupied:

       1A   -481.016510     2A    -61.827320     3A    -55.742925
       4A    -55.738646     5A    -55.738646     6A     -9.947341
       7A     -8.413043     8A     -6.292052     9A     -6.277230
      10A     -6.277230    11A     -2.489635    12A     -2.485739

Where 1A/2A are the 1S/2S core molecular orbitals and 3A-5A is the first P shell. I would not guess the others without plotting.

andysim commented 5 years ago

Sorry, I'm hurrying here and have not carefully read the above comments / code snippets. It sounds like you want to remove core-like MOs: the easy way to do this is just define a new density matrix:

D = np.einsum('mi,ni->mn', Ca[:,n:ndocc], Ca[:,ndocc])

where Ca comes from Wfn->Ca()

andysim commented 5 years ago

Hah, it looks like DGAS said basically the same thing at the same time. In my example I forgot to mention that n is the first valence MO and ndocc is the number of double occupied orbitals. That new density can the be used in the normal way.

TermeHansen commented 5 years ago

Thank you very much guys, I will try this out, but as I need this for a cubefile I need to do this cpp side (I guess), can you also help in how to make the D_val cpp side?

dgasmith commented 5 years ago

nremove = 5

Da = np.array(wfn.Da())
Ca = np.array(wfn.Ca())

D = Da - np.einsum('pi,qi->pq', Ca[:, :nremove], Ca[:, :nremove])
D = psi4.core.Matrix.from_array(D)

wfn.Da().copy(D)

psi4.set_options({"CUBEPROP_TASKS": ["density"]})
cube = psi4.core.CubeProperties(wfn)
cube.compute_properties()

Please note we destroy the wavefunction here, it will not be usable for other purposes.

This is with Psi4 1.2, there are changes post in the current master and in 1.3 which will change this I believe. This also will not currently work with symmetry.

TermeHansen commented 5 years ago

Had to change to:

nremove = 5
Da = np.array(wfn.Da_subset('AO'))
Ca = np.array(wfn.Ca_subset("AO", "ALL"))

D = Da - np.einsum('pi,qi->pq', Ca[:, :nremove], Ca[:, :nremove])
D = psi4.core.Matrix.from_array(D)

wfn.Da_subset('AO').copy(D)

psi4.set_options({"CUBEPROP_TASKS": ["density"]})
cube = psi4.core.CubeProperties(wfn)
cube.compute_properties()

to get it to work, but no mater how I set the remove value I can not see any change in the total number of electrons in the cubefile...

dgasmith commented 5 years ago

Subset will clone the array so your are not editing internal wavefunction values. It looks like you made your change to accommodate a wave function with symmetry. As noted this will not work. While possible, it will be a reasonable amount of additional work.

I should note this operation is not exactly supported or in most cases a recommended thing anyone should do. This may break in the future, but gives something we can quickly test.

TermeHansen commented 5 years ago

Great - and yes mybad with the symmetry as you correctly noted. Methyl bromide without removing lovest MO: number of electrons: 42.6339 after removing the 5: number of electrons: 34.0006 Great!

dgasmith commented 5 years ago

Great! One thing to check is that it might only be the lowest 1/2 eigenvalues that are giving you issues.

TermeHansen commented 5 years ago

One thing that puzzles me right now is how to figure out center(s) involved in the MO indices in epsilon_a().

hokru commented 5 years ago

all nice exercises...but as chargemol can take wfx files I would first try to make such files from the psi4 molden files using https://github.com/zorkzou/Molden2AIM. Then you don't have to abuse cube files for data transfer.

Or is this something else and not chargemol?

TermeHansen commented 5 years ago

Already tried that and I recall having issues in the molden2aim process also in some cases.

susilehtola commented 5 years ago

One thing that puzzles me right now is how to figure out center(s) involved in the MO indices in epsilon_a().

Core orbitals are delocalized between identical nuclei. To figure out which nuclei a core orbital corresponds to, you would have to have a suitable metric. This could be the square norm of the MO coefficient of predefined core functions. A more accessible way would be to first localize the orbitals, so that you get rid of the mixing between nuclei, and then it would be straightforward to calculate e.g. <r^2>.

If you are asking how to do this by hand in a small system, and not in a general case, then that's pretty straightforward. Just look at the orbital coefficients. If you're using a contracted basis set, the contracted functions should be good approximations to the core orbitals. Thus, you should see a coefficient close to 1 in absolute value for the core function in the corresponding core orbital, if you only have one heavy atom. If you have more, then the amplitude will be divided among the nuclei.

dgasmith commented 5 years ago

@TermeHansen Any action items that you would like to come out of this thread?

TermeHansen commented 5 years ago

Well I still miss a good general way to figure out which centers the different MO indices belong to, but if that is not possible then I have to live with as is, the solution provided by you and others work as needed at least...

susilehtola commented 5 years ago

MOs are delocalized over the nuclei. To identify MOs with individual nuclei, you have to localize them somehow first. And even them only the core orbitals can be localized..

Susi Lehtola Sent from my phone so excuse my brevity.

On 16 Dec 2018, at 16.42, Rasmus Lundsgaard notifications@github.com<mailto:notifications@github.com> wrote:

Well I still miss a good general way to figure out which centers the different MO indices belong to, but if that is not possible then I have to live with as is, the solution provided by you and others work as needed at least...

— You are receiving this because you commented. Reply to this email directly, view it on GitHubhttps://github.com/psi4/psi4/issues/1311#issuecomment-447648418, or mute the threadhttps://github.com/notifications/unsubscribe-auth/AArOJ7CNuFLjRwZfWy4Vse5kqu6N0Syxks5u5luqgaJpZM4X6Xyy.

dgasmith commented 4 years ago

Core issue is solved, please open new ones if additional functionality is requested.

psi4 / psi4

sometimes very wrong total number of electrons in cube file #1311