MolSSI / QCSchema

A Schema for Quantum Chemistry
http://molssi-qc-schema.readthedocs.io/en/latest/index.html#
BSD 3-Clause "New" or "Revised" License

Basis issue orderings #45

Open vivacebelles opened 6 years ago

vivacebelles commented 6 years ago

Overall, we'd like to address the ordering of the molecular orbitals, particularly in how the Cartesian order is formatted. The Molden format can be found here (http://www.cmbi.ru.nl/molden/molden_format.html). However, there seems to be a disconnect among the codes in emulating the same ordering, and there is no suitable way to extend the handwritten code. So, to start off the discussion, the current status is outlined here. If anything has been misinterpreted or is missing, please feel free to add!

While Molden and GAMESS seem to follow the convention, Andy Simmonett has pointed out some inconsistencies through the FHCK file for Psi4. Ed Valeev's libint code also contains code to support basis function ordering, with some minor inconsistencies with the Molden format.

Cartesian coordinates

| Shell | Molden format (expected) | Andy's work [FHCK/Psi4] | Ed's work [GAMESS/libint] |
|---|---|---|---|
| S | 0 | 0 | 0 |
| P | X, Y, Z | Z, X, Y | X, Y, Z |
| D | XX, YY, ZZ, XY, XZ, YZ | XX, XY, XZ, YY, YZ, ZZ | XX, YY, ZZ, XY, YZ, XZ |
| F | XXX, YYY, ZZZ, XXY, XXZ, XZZ, XYY, YZZ, YYZ, XYZ | XXX, XXY, XXZ, XYY, XYZ, XZZ, YYY, YYZ, YZZ, ZZZ | XXX, YYY, ZZZ, XXY, XXZ, XYY, YYZ, XZZ, YZZ, XYZ |
| G | XXXX, YYYY, ZZZZ, XXXY, XXXZ, XYYY, YYYZ, XZZZ, YZZZ, XXYY, XXZZ, YYZZ, XXYZ, XYYZ, XYZZ | XXXX, XXXY, XXXZ, XXYY, XXYZ, XXZZ, XYYY, XYYZ, XYZZ, XZZZ, YYYY, YYYZ, YYZZ, YZZZ, ZZZZ | XXXX, YYYY, ZZZZ, XXXY, XXXZ, XYYY, YYYZ, XZZZ, YZZZ, XXYY, XXZZ, YYZZ, XXYZ, XYYZ, XYZZ |

A more organized layout can be seen in this screenshot: [image: screen shot 2018-04-23 at 1 47 30 pm]
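Once two orderings are written down as label lists, the index permutation between them can be derived mechanically. The following is a minimal sketch, not code from any of the packages mentioned; the helper names `canonical` and `permutation` are hypothetical, and the two lists are the D rows from the table above.

```python
def canonical(label):
    """Normalize a Cartesian label so that 'yx' and 'XY' compare equal."""
    return "".join(sorted(label.upper()))


def permutation(source, target):
    """Return perm such that target[i] == source[perm[i]] for every i."""
    src = [canonical(s) for s in source]
    tgt = [canonical(t) for t in target]
    # Both orderings must list the same functions, each exactly once.
    if sorted(src) != sorted(tgt) or len(set(src)) != len(src):
        raise ValueError("orderings must list the same functions exactly once")
    return [src.index(t) for t in tgt]


# D-shell rows copied from the table above.
molden_d = ["XX", "YY", "ZZ", "XY", "XZ", "YZ"]
psi4_d = ["XX", "XY", "XZ", "YY", "YZ", "ZZ"]

perm = permutation(psi4_d, molden_d)
print(perm)  # [0, 3, 5, 1, 2, 4]
```

Applying `perm` to a list of coefficients stored in the second ordering would give them in the first; normalization and sign differences are not handled here.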

Iterator example

Once we've agreed on what the ordering is, the next effort would be developing a small piece of code that iterates out the desired format across all systems. Daniel Smith (@dgasmith) wrote this:

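def row_cartesian_order(L):
    """Yield (idx, L, m, n) for each Cartesian function of a shell with angular momentum L."""
    idx = -1
    for i in range(L + 1):
        # x exponent; not yielded, but recoverable as L - m - n
        l = L - i
        for j in range(i + 1):
            m = i - j  # y exponent (by the usual l, m, n convention)
            n = j      # z exponent
            idx += 1
            yield (idx, L, m, n)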

and the iterator will print out the Cartesian orders as:

This is the order for s orbitals:
(0, 0, 0, 0)
This is the order for p orbitals:
(0, 1, 0, 0)
(1, 1, 1, 0)
(2, 1, 0, 1)
This is the order for d orbitals:
(0, 2, 0, 0)
(1, 2, 1, 0)
(2, 2, 0, 1)
(3, 2, 2, 0)
(4, 2, 1, 1)
(5, 2, 0, 2)
This is the order for f orbitals:
(0, 3, 0, 0)
(1, 3, 1, 0)
(2, 3, 0, 1)
(3, 3, 2, 0)
(4, 3, 1, 1)
(5, 3, 0, 2)
(6, 3, 3, 0)
(7, 3, 2, 1)
(8, 3, 1, 2)
(9, 3, 0, 3)
This is the order for g orbitals:
(0, 4, 0, 0)
(1, 4, 1, 0)
(2, 4, 0, 1)
(3, 4, 2, 0)
(4, 4, 1, 1)
(5, 4, 0, 2)
(6, 4, 3, 0)
(7, 4, 2, 1)
(8, 4, 1, 2)
(9, 4, 0, 3)
(10, 4, 4, 0)
(11, 4, 3, 1)
(12, 4, 2, 2)
(13, 4, 1, 3)
(14, 4, 0, 4)
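To compare this output with the X/Y/Z labels in the Cartesian table, the tuples can be turned into labels. This is a sketch under one assumption that the thread does not state explicitly: m and n are the y and z exponents, so the x exponent is L - m - n. The helper `cartesian_labels` is hypothetical. Under that assumption the d shell comes out as XX, XY, XZ, YY, YZ, ZZ, which is the D row of the FHCK/Psi4 column.

```python
def row_cartesian_order(L):
    # Same logic as the iterator above, repeated so this block runs on its own.
    idx = -1
    for i in range(L + 1):
        for j in range(i + 1):
            idx += 1
            yield (idx, L, i - j, j)


def cartesian_labels(L):
    """Turn each (idx, L, m, n) tuple into a label such as 'XXY'.

    Assumes m and n are the y and z exponents, so the x exponent is L - m - n.
    """
    return ["X" * (L - m - n) + "Y" * m + "Z" * n
            for _, _, m, n in row_cartesian_order(L)]


for L, name in enumerate("spdfg"):
    print(name, cartesian_labels(L))
```

The s shell yields a single empty label, since all three exponents are zero.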

Spherical coordinates

In addition to the Cartesian orbital formats, we should pin down what the standard format for the spherical coordinates should be. Currently, the Molden format has this structure for orbitals:

| Shell | Molden format (expected) * | NWChem |
|---|---|---|
| S | 0 | 0 |
| P | 0, +1, -1 | -1, 0, +1 |
| D | 0, +1, -1, +2, -2 | -2, -1, 0, +1, +2 |
| F | 0, +1, -1, +2, -2, +3, -3 | -3, -2, -1, 0, +1, +2, +3 |
| G | 0, +1, -1, +2, -2, +3, -3, +4, -4 | -4, -3, -2, -1, 0, +1, +2, +3, +4 |

*Psi4 and libint share Molden's format for spherical coordinates
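Both spherical orderings in the table follow a simple rule, so they can be generated for any angular momentum. The functions below are a hypothetical sketch (none of these names come from Molden, NWChem, Psi4, or libint), and they cover only the order of the m values, not phase or normalization conventions.

```python
def molden_spherical_order(L):
    """m values in the order 0, +1, -1, ..., +L, -L."""
    order = [0]
    for m in range(1, L + 1):
        order.extend([m, -m])
    return order


def ascending_spherical_order(L):
    """m values in the order -L, ..., 0, ..., +L."""
    return list(range(-L, L + 1))


def ascending_to_molden(L):
    """perm[i] is the position in the ascending order of the i-th Molden function."""
    return [m + L for m in molden_spherical_order(L)]


print(molden_spherical_order(2))     # [0, 1, -1, 2, -2]
print(ascending_spherical_order(2))  # [-2, -1, 0, 1, 2]
print(ascending_to_molden(2))        # [2, 3, 1, 4, 0]
```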

Edits: 04/27/2018- added in spherical coordinates for computational programs stated by @wadejong and @dgasmith which includes adding NWChem to the mix. 04/23/2018- minor edits to make the table reader-friendly; linked Daniel in at iterator section.

wadejong commented 6 years ago

Adding to this, NWChem's ordering of Cartesian functions is the same as FHCK/Psi4, with the exception of the p-function, which follows the normal x,y,z convention others use.

For spherical, NWChem orders -l...0...l and doesn't follow Molden. I wonder what Psi4 and libint do for sphericals.

Bert

On Mon, Apr 23, 2018 at 11:59 AM, Annabelle Lolinco < notifications@github.com> wrote:

Overall, we'd like to address the ordering of the molecular orbitals, particularly in how the Cartesian order is formatted. The Molden format can be found here http://www.cmbi.ru.nl/molden/molden_format.html. However, there seems to be a disconnect among the codes in emulating the same ordering, and there is no suitable way to extend the handwritten code. So, to start off the discussion, the current status is outlined here. If anything has been misinterpreted or is missing, please feel free to add!

While Molden and GAMESS seem to follow the convention, Andy Simmonett has pointed out some inconsistencies through the FHCK file for Psi4 https://github.com/psi4/psi4/blob/master/psi4/src/psi4/libmints/writer.cc#L421-L519. Ed Valeev's libint code https://github.com/evaleev/libint/blob/f678e846e4ba2b5cbd5b92e304e2168540dabeab/include/libint2/cgshellinfo.h#L75-L151 also contains code to support basis function ordering, with some minor inconsistencies with the Molden format.

Cartesian coordinates

  • Currently, for s, p, d formats we should be expecting the following from the Molden format:

| Shell | Molden format (expected) | Andy's work [FHCK/Psi4] | Ed's work [GAMESS/libint] |
|---|---|---|---|
| S | 0 | 0 | 0 |
| P | X, Y, Z | Z, X, Y | X, Y, Z |
| D | XX, YY, ZZ, XY, XZ, YZ | XX, XY, XZ, YY, YZ, ZZ | XX, YY, ZZ, XY, YZ, XZ |
| F | XXX, YYY, ZZZ, XXY, XXZ, XZZ, XYY, YZZ, YYZ, XYZ | XXX, XXY, XXZ, XYY, XYZ, XZZ, YYY, YYZ, YZZ, ZZZ | XXX, YYY, ZZZ, XXY, XXZ, XYY, YYZ, XZZ, YZZ, XYZ |
| G | XXXX, YYYY, ZZZZ, XXXY, XXXZ, XYYY, YYYZ, XZZZ, YZZZ, XXYY, XXZZ, YYZZ, XXYZ, XYYZ, XYZZ | XXXX, XXXY, XXXZ, XXYY, XXYZ, XXZZ, XYYY, XYYZ, XYZZ, XZZZ, YYYY, YYYZ, YYZZ, YZZZ, ZZZZ | XXXX, YYYY, ZZZZ, XXXY, XXXZ, XYYY, YYYZ, XZZZ, YZZZ, XXYY, XXZZ, YYZZ, XXYZ, XYYZ, XYZZ |

A more organized layout can be seen in this screenshot: [image: screen shot 2018-04-23 at 1 47 30 pm] https://user-images.githubusercontent.com/32425772/39146614-ee29bfb4-46fc-11e8-9042-e8dd39df6909.png

  • Part of what contributes to the non-standardization of the Cartesian formatting is the differing naming across the codes. Angular momentum is denoted as "l" in FHCK and "am" in libint's work. In Molden, you can call the Cartesian format by the number of available orbitals in the system and the term denoting the orbital type. For example, [6D] calls all six d-orbital Cartesian functions (xx, yy, zz, xy, xz, yz).

The row order in which the Cartesian functions come out also needs to be discussed.

dgasmith commented 6 years ago

Psi4 uses the Molden format as shown for spherical; it looks like libint does as well.

langner commented 6 years ago

Here is the cclib perspective on this topic. Ordering and naming of orbitals has been a major pain point, due to differences between programs and ambiguity. Our approach has been to parse the numbers and names separately and just expose them, sidestepping any interpretation or validation.

tovrstra commented 5 years ago

This is part of the problem discussed in #12. Not only does the ordering need to be agreed upon; in some codes the sign conventions also differ, alas.
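To illustrate why signs matter alongside ordering, here is a minimal sketch of converting one shell's coefficients between two conventions. The function `convert_shell` is hypothetical, and the permutation and sign values in the example are made up; they do not describe any particular program.

```python
def convert_shell(coeffs, perm, signs):
    """Return coefficients reordered by perm, then scaled by signs.

    perm[i] is the source index of the function at target position i;
    signs[i] is +1 or -1 for target position i. Both are hypothetical inputs.
    """
    if not (len(coeffs) == len(perm) == len(signs)):
        raise ValueError("coeffs, perm and signs must have equal length")
    return [signs[i] * coeffs[perm[i]] for i in range(len(perm))]


# Made-up example: swap the last two functions and flip the sign of one.
print(convert_shell([0.1, 0.2, 0.3], perm=[0, 2, 1], signs=[1, -1, 1]))
# [0.1, -0.3, 0.2]
```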

tovrstra commented 5 years ago

Also worth noting is that, despite the documentation of the ordering in the Molden format, several programs (including old versions of PSI4) did not follow these conventions. We've implemented a hacky solution for this, to detect which of the known variations is being used by a file, just to demonstrate how unpleasant it can get:

https://github.com/theochem/iodata/blob/master/iodata/molden.py#L510