nwchemgit / nwchem

NWChem: Open Source High-Performance Computational Chemistry
http://nwchemgit.github.io

Number of functions in 3-21G* basis set #747

Closed MarceloM-lo closed 1 year ago

MarceloM-lo commented 1 year ago

When using the 3-21G* basis for an HF-SCF calculation for water, NWChem uses 13 basis functions and therefore obtains 13 MOs. Trouble is, there should be 18 or 19 functions (depending on whether the angular functions are spherical or Cartesian).
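The counts quoted above can be checked with a short sketch. The shell lists below are assumptions written out by hand: the 3-21G entries follow the standard [2s] contraction for H and [3s,2p] for O, and the "with d" variant is a hypothetical basis with one extra d shell on oxygen, which is what the 18/19 expectation presupposes. None of the names here come from NWChem.

```python
# Number of basis functions contributed by one shell of each type.
CARTESIAN = {"S": 1, "P": 3, "SP": 4, "D": 6}
SPHERICAL = {"S": 1, "P": 3, "SP": 4, "D": 5}

# 3-21G shells per element (an SP shell is a combined s and p shell).
SHELLS_321G = {"H": ["S", "S"], "O": ["S", "SP", "SP"]}

# Hypothetical: 3-21G plus one d shell on oxygen (the poster's expectation).
SHELLS_WITH_D = {"H": ["S", "S"], "O": ["S", "SP", "SP", "D"]}


def count_functions(shells_by_element, atoms, per_shell):
    """Sum the functions of every shell on every atom of the molecule."""
    return sum(per_shell[shell]
               for atom in atoms
               for shell in shells_by_element[atom])


water = ["O", "H", "H"]
print(count_functions(SHELLS_321G, water, CARTESIAN))    # 13
print(count_functions(SHELLS_WITH_D, water, CARTESIAN))  # 19
print(count_functions(SHELLS_WITH_D, water, SPHERICAL))  # 18
```

Oxygen contributes 1 + 4 + 4 = 9 functions and each hydrogen 2, giving 13; a d shell on oxygen would add 6 (Cartesian) or 5 (spherical).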

NWChem 7.0.2

Steps to show that the error is there when 3-21G* is used:

  1. 3-21G basis set for HF-SCF geometry optimization of water, output file "H2O-321G.txt" (input echoed): 13 basis functions and MOs, see lines 367 and 390-393.

  2. 3-21G* basis set (with cartesian angular functions) for HF-SCF geometry optimization of water, output file "H2O-321Gx-cart.txt" (input echoed): 13 basis functions and MOs, see lines 367 and 390-393. THERE SHOULD BE 19 FUNCTIONS AND 19 MOs.

  3. 3-21G* basis set (with spherical angular functions) for HF-SCF geometry optimization of water, output file "H2O-321Gx-sph.txt" (input echoed): 13 basis functions and MOs, see lines 367 and 390-393. THERE SHOULD BE 18 FUNCTIONS AND 18 MOs.

Steps to show that, when 6-31G* is used instead, the behaviour is as expected:

  1. 6-31G basis set for HF-SCF geometry optimization of water, output file "H2O-631G.txt" (input echoed): 13 basis functions and MOs, see lines 373 and 396-399. THIS IS CORRECT, AS WAS THE CASE FOR 3-21G.

  2. 6-31G* basis set (with cartesian angular functions) for HF-SCF geometry optimization of water, output file "H2O-631Gx-cart.txt" (input echoed): 19 basis functions and MOs, see lines 319 and 342-345. THIS IS CORRECT, UNLIKE THE CASE FOR 3-21G*.

  3. 6-31G* basis set (with spherical angular functions) for HF-SCF geometry optimization of water, output file "H2O-631Gx-sph.txt" (input echoed): 18 basis functions and MOs, see lines 286 and 309-312. THIS IS CORRECT, UNLIKE THE CASE FOR 3-21G*.

H2O-321G.txt

H2O-321Gx-cart.txt

H2O-321Gx-sph.txt

H2O-631G.txt

H2O-631Gx-cart.txt

H2O-631Gx-sph.txt

jeffhammond commented 1 year ago

There is no 3-21G* in https://www.basissetexchange.org/ and the versions of 3-21G and 3-21G* in NWChem basis set library are identical for H and O, so your output files are exactly what I'd expect.

Are you trying to say that the 3-21G* basis set we use (https://github.com/nwchemgit/nwchem/blob/master/src/basis/libraries/3-21gs) is incorrect?

If so, I recommend that you instead contribute 3-21G* to https://www.basissetexchange.org/ via https://github.com/MolSSI-BSE and then NWChem can use that. It is ideal for the community to have basis sets available in BSE and not just individual codes.

basis % grep "[HO]_3-21G" -A10 libraries*/3-21g libraries*/3-21gs
libraries.bse/3-21g:basis "H_3-21G" SPHERICAL
libraries.bse/3-21g-#basis SET: (3s) -> [2s]
libraries.bse/3-21g-H    S
libraries.bse/3-21g-      0.5447178000E+01       0.1562849787E+00
libraries.bse/3-21g-      0.8245472400E+00       0.9046908767E+00
libraries.bse/3-21g-H    S
libraries.bse/3-21g-      0.1831915800E+00       1.0000000
libraries.bse/3-21g-end
--
libraries.bse/3-21g:basis "O_3-21G" SPHERICAL
libraries.bse/3-21g-#basis SET: (6s,3p) -> [3s,2p]
libraries.bse/3-21g-O    S
libraries.bse/3-21g-      0.3220370000E+03       0.5923939339E-01
libraries.bse/3-21g-      0.4843080000E+02       0.3514999608E+00
libraries.bse/3-21g-      0.1042060000E+02       0.7076579210E+00
libraries.bse/3-21g-O    SP
libraries.bse/3-21g-      0.7402940000E+01      -0.4044535832E+00       0.2445861070E+00
libraries.bse/3-21g-      0.1576200000E+01       0.1221561761E+01       0.8539553735E+00
libraries.bse/3-21g-O    SP
libraries.bse/3-21g-      0.3736840000E+00       0.1000000000E+01       0.1000000000E+01
--
libraries/3-21g:basis "H_3-21G" CARTESIAN
libraries/3-21g-H    S
libraries/3-21g-      5.4471780              0.1562850
libraries/3-21g-      0.8245470              0.9046910
libraries/3-21g-H    S
libraries/3-21g-      0.1831920              1.0000000
libraries/3-21g-end
--
libraries/3-21g:basis "O_3-21G" CARTESIAN
libraries/3-21g-O    S
libraries/3-21g-    322.0370000              0.0592394
libraries/3-21g-     48.4308000              0.3515000
libraries/3-21g-     10.4206000              0.7076580
libraries/3-21g-O    SP
libraries/3-21g-      7.4029400             -0.4044530              0.2445860
libraries/3-21g-      1.5762000              1.2215600              0.8539550
libraries/3-21g-O    SP
libraries/3-21g-      0.3736840              1.0000000              1.0000000
libraries/3-21g-end
--
libraries/3-21gs:basis "H_3-21G*" CARTESIAN
libraries/3-21gs-  H    S
libraries/3-21gs-         5.447178000      0.15628500
libraries/3-21gs-         0.824547000      0.90469100
libraries/3-21gs-  H    S
libraries/3-21gs-         0.183192000      1.00000000
libraries/3-21gs-end
--
libraries/3-21gs:basis "O_3-21G*" CARTESIAN
libraries/3-21gs-  O    S
libraries/3-21gs-       322.037000000      0.05923940
libraries/3-21gs-        48.430800000      0.35150000
libraries/3-21gs-        10.420600000      0.70765800
libraries/3-21gs-  O   SP
libraries/3-21gs-         7.402940000     -0.40445300      0.24458600
libraries/3-21gs-         1.576200000      1.22156000      0.85395500
libraries/3-21gs-  O   SP
libraries/3-21gs-         0.373684000      1.00000000      1.00000000
libraries/3-21gs-end
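The comparison in the grep output can also be done programmatically. The following is a minimal sketch, assuming the library files use the `basis "NAME" ... end` layout shown above; `shell_types` is a hypothetical helper, not part of NWChem, and `SAMPLE` is copied from the two oxygen entries in the output.

```python
import re

# Oxygen entries copied from libraries/3-21g and libraries/3-21gs above.
SAMPLE = '''\
basis "O_3-21G" CARTESIAN
O    S
    322.0370000              0.0592394
     48.4308000              0.3515000
     10.4206000              0.7076580
O    SP
      7.4029400             -0.4044530              0.2445860
      1.5762000              1.2215600              0.8539550
O    SP
      0.3736840              1.0000000              1.0000000
end
basis "O_3-21G*" CARTESIAN
  O    S
       322.037000000      0.05923940
        48.430800000      0.35150000
        10.420600000      0.70765800
  O   SP
         7.402940000     -0.40445300      0.24458600
         1.576200000      1.22156000      0.85395500
  O   SP
         0.373684000      1.00000000      1.00000000
end
'''


def shell_types(text):
    """Map each basis-block name to the ordered list of its shell types."""
    result = {}
    current = None
    for line in text.splitlines():
        header = re.match(r'\s*basis\s+"([^"]+)"', line)
        if header:
            current = header.group(1)
            result[current] = []
        elif line.strip() == "end":
            current = None
        elif current is not None:
            parts = line.split()
            # Shell headers look like "O  SP": element symbol, then shell type.
            if len(parts) == 2 and parts[0].isalpha() and parts[1].isalpha():
                result[current].append(parts[1].upper())
    return result


shells = shell_types(SAMPLE)
print(shells["O_3-21G"] == shells["O_3-21G*"])  # True: no D shell in either
```

Both oxygen entries contain one S and two SP shells, so the starred entry adds no d-type shell.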

jeffhammond commented 1 year ago

The 3-21G* basis set file I found in Dalton, which is from the original EMSL Basis Set Exchange from 2008, shows that only "Na Mg Al Si P S Cl Ar" are supported, and thus again, the above is what is expected.

Do you have a definition of the 3-21G* polarization functions for other elements to which you can point us?

MarceloM-lo commented 1 year ago

Thanks!

You wrote that "the versions of 3-21G and 3-21G* in NWChem basis set library are identical for H and O". That's where the problem is. For oxygen, the 3-21G* basis should include d-type polarization functions, which the 3-21G basis does not. So yes, I am saying that the 3-21G* basis set is incorrect (at least for oxygen).

I'm sorry, but I'm unable to submit a basis set or even point out where one can be found. I'm a very occasional user of electronic structure calculations. This problem was noticed by a student in a beginners course in the university where I teach.

Perhaps an examination of where the 6-31G* basis came from should point to an appropriate 3-21G* basis for oxygen and other second-row elements? After all, 6-31G* works as it should.

jeffhammond commented 1 year ago

"For oxygen, the 3-21G* basis should include d-type polarization functions, which the 3-21G basis does not."

Please provide a reference for this. I can find no evidence anywhere that 3-21G* polarization for O exists and I've looked in a lot of places.

In any case, 3-21G is a basis set from 1982. It is obsolete for anything other than rapid debugging of software. There is no reason not to use 6-31G.

jeffhammond commented 1 year ago

https://gaussian.com/basissets/ says

"STO-3G and 3-21G accept a * suffix, but this does not actually add any polarization function"

Given these basis sets originated with Gaussian, I consider its documentation the decisive reference on this topic.

MarceloM-lo commented 1 year ago

You wrote: "Please provide a reference for this."

For Pople basis sets like this, the symbols in the acronym carry very well defined meanings. That star (or, if you prefer, asterisk) means the following: the basis set includes polarization functions for 'heavy' atoms (i.e., atoms other than hydrogen). This is so well known, it's even on Wikipedia, see 'Pople basis sets' on https://en.wikipedia.org/wiki/Basis_set_(chemistry) . It also appears in virtually every textbook on introductory quantum chemistry.


You wrote: "I can find no evidence anywhere that 3-21G* polarization for O exists."

In NWChem itself, the difference between the 6-31G and 6-31G* basis sets is that the latter includes polarization functions for 'heavy' atoms (including oxygen) where the former does not. This is as it should be.


You wrote: "3-21G is a basis set from 1982. It is obsolete for anything other than rapid debugging of software. There is no reason not to use 6-31G".

I would disagree with this. It can be useful in educational settings, for illustration of a number of things. For example: how basis sets are created, how the results change as basis sets improve, what makes a particular basis set more suited for one problem than another.

In my university we use it in teaching along with many other basis sets, and there's an additional use: so that students cannot simply copy work from other students, we give similar problems to many students, but ask for computations that differ in details - one of those the basis set - and ask students to critically evaluate their results, taking those details into account. I mentioned that the bug I have reported was first noticed by a student - that was a student from a class of over 50 students.

In any case, if 1982 basis sets were entirely useless, why should they be included in NWChem in 2023? Surely there must be a motivation for them to be there.


You wrote: "https://gaussian.com/basissets/ says that STO-3G and 3-21G accept a * suffix, but this does not actually add any polarization function. Given these basis sets originated with Gaussian, I consider its documentation the decisive reference on this topic."

I would again respectfully disagree, on two accounts.

First, the basis sets did not originate with Gaussian. Instead, they originated in John Pople's group - see that wikipedia page I linked above.

Second, the fact it was incorrect in Gaussian cannot make it correct, however authoritative one may consider Gaussian to be. After all, the definition of the acronym is clear. That first star carries a very clear meaning: the basis set includes polarisation functions for 'heavy' atoms.

Perhaps NWChem should simply print an error message and stop the calculation if the user requests 3-21G* for a second-row element?


Sorry for the long post!

jeffhammond commented 1 year ago

NWChem implements 3-21G* the same way as every other code out there. That the * does not modify 3-21G in the same way as 6-31G basis sets is not an NWChem problem. NWChem is not going to print an error message because people draw false conclusions from wikipedia or textbooks.

"First, the basis sets did not originate with Gaussian. Instead, they originated in John Pople's group - see that wikipedia page I linked above."

Pople's group created Gaussian. For work that occurred prior to the Great Schism, e.g. 1982, Gaussian represents the Pople group's implementation.

MarceloM-lo commented 1 year ago

Why are you being so aggressive? I thought this was a collaborative page, where people help improve the tools we use. When my student found that problem I congratulated him. Here, I feel I'm being attacked for reporting a possible problem. Never mind, if I ever find a problem again, I'll just leave it there.

MarceloM-lo commented 1 year ago

Final comment from me.

I just checked the ORCA basis sets library. It includes 3-21G, 6-31G and 6-31G*, but no 3-21G*. Perhaps NWChem could copy this from one of the "codes out there"?

jeffhammond commented 1 year ago

You drew an incorrect conclusion about what 3-21G* means and decided to lecture me about the history of quantum chemistry based on Wikipedia, without knowing that Pople's group created Gaussian. I don't know what to tell you.

jeffhammond commented 1 year ago

Again, NWChem implements 3-21G* correctly. You do not understand the definition of 3-21G*, which is inconsistent with the definition of 6-31G*.

edoapra commented 1 year ago

From the abstract of the paper (listed in https://github.com/nwchemgit/nwchem/blob/7f3ffb0fa27326cdd9686e0394cd879f7063eebb/src/basis/libraries/3-21gs#L17-L18) W.J. Pietro, M.M. Francl, W.J. Hehre, D.J. DeFrees, J.A. Pople and J.S. Binkley, J. Am. Chem. Soc. 104, 5039 (1982) https://dx.doi.org/10.1021/ja00383a007 "The recently introduced 3-21G split-valence basis sets for second-row elements have been supplemented with functions of d-type symmetry. The resulting basis sets, termed 3-21G(*), are for use in conjunction with unsupplemented 3-21G representations for first-row elements."

npbauman commented 1 year ago

I want to clear up any confusion. There is a common notation for split-valence basis sets in which * and ** denote a single added polarization function. However, this is not done for every split-valence basis set. For 6-31G, this is done. This is why you get the expected behavior when replacing 6-31G with 6-31G*.

For the 3-21G basis, the procedure for adding polarization functions was only ever done for second-row heavy elements (Na, Mg, Al, Si, P, S, Cl, Ar). As described in the article that Edo mentioned, the reason was to create a basis set that was similar in construction to 6-31G*, but smaller because it only adds a single polarization function for the second-row atoms and no other atoms. The resulting basis set was labeled 3-21G(*) [note the parentheses]. As mentioned in the article, “3-21G(*), unlike 6-31G*, should not be viewed as a full-polarized basis set”.

So, there is no defined 3-21G* basis, only a 3-21G(*) basis. Unfortunately, for any number of reasons, quantum chemistry packages use this notation interchangeably and label the 3-21G(*) basis as 3-21G*. This can be seen in the manual for the Gaussian program - “the 3-21G* basis set has polarization functions on second-row atoms only.”

In conclusion, the 3-21G basis is an exception to the * and ** notation. I hope this clears everything up. We are glad that your student has found NWChem and hope that you and your students continue to use and support NWChem.

MarceloM-lo commented 1 year ago

Thanks, @npbauman and @edoapra. Might it be a good idea to include a warning in NWChem saying that users who request 3-21G* for Li-Ne will actually get 3-21G? I leave the suggestion. Not that important, as 3-21G will have very low usage and will only ever be used for calculations with very low stakes.
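The suggested warning could look something like the sketch below. This is purely illustrative and does not reflect anything NWChem does: `effective_basis` and `POLARIZED_321G` are hypothetical names, and the element list is the Na to Ar set quoted earlier in the thread.

```python
import warnings

# Elements for which 3-21G(*) defines d-type functions, per the thread.
POLARIZED_321G = {"Na", "Mg", "Al", "Si", "P", "S", "Cl", "Ar"}


def effective_basis(requested, element):
    """Return the basis that actually applies to `element` (hypothetical)."""
    if requested.upper() == "3-21G*" and element not in POLARIZED_321G:
        warnings.warn(
            f"3-21G* adds no polarization functions for {element}; "
            "the result is identical to 3-21G"
        )
        return "3-21G"
    return requested


print(effective_basis("3-21G*", "O"))   # 3-21G, with a warning
print(effective_basis("3-21G*", "Si"))  # 3-21G*
print(effective_basis("6-31G*", "O"))   # 6-31G*
```

A warning rather than an error keeps existing inputs running while making the silent fallback visible.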