Closed v1j4y closed 1 year ago
@scemama what is the recommended way to test the converter ?
To test the produced TREXIO files, the best solution is to run:
trexio check-mos FILE.trexio
It will compute numerically the overlap of MOs, and if your MOs are orthonormal it is easy to see that everything is OK. With the -n
option, you can increase the number of grid points.
If the calculation takes too much time, you can install QMCkl (make python-install in qmckl) to accelerate the calculation :-)
You should also run
trexio check-basis
to compare the overlap matrix in the trexio file with a numerically computed one. If they match, it means that all the conventions and normalization factors are OK. Because it is possible that a normalization factor is missing in the AOs and that you include it inside the MO coefficients, so that the MO check doesn't detect the problem.
Amazing @v1j4y, thanks!
If you have some test files, you can upload them to the data
folder of this repo and we can then modify the CI to test ORCA converter.
I see a lot of comments referring to PySCF. Is it because ORCA's notations are similar to PySCF?
OK thanks for the comments !
Currently, there's a problem with normalization of AOs which was revealed by trexio check-mos
, I'm fixing this atm.
OK ! Just FYI, everything was ok except that the coordinates had to be given in Bohrs ! This fixes the normalization issues with AOs.
Also, Dagmar Lenk has implemented ECPs in orca_2json
so I'll also add it here. :wink:
So we wait for ECPs before we merge, or you make another PR?
Yes ! I'm implementing ECPs following the pyscf to trexio functions by kosuke. I'll tell you when it's done.
@scemama I've implemented ECPs. Is there any way to check for ECPs ?
Also, I'm testing the water molecule which was used for the pyscf test by Kosuke.
Here's the output (I changed the integration domain to [40.,40.,40.]):
trexio check-basis h2o -b TEXT --n_points 500
Integration steps: [0.037033353988351204, 0.03301131648479499, 0.035256541967839575]
0 0 0.999923 1.000000
0 1 -0.344015 -0.344017
0 2 -0.157869 -0.157870
0 3 -0.000000 -0.000000
0 4 -0.000000 0.000000
0 5 0.000000 -0.000000
0 6 -0.000000 0.000000
0 7 -0.000000 -0.000000
0 8 0.000000 0.000000
0 9 -0.047843 -0.047844
0 10 -0.062517 -0.062518
0 11 -0.047843 -0.047844
0 12 -0.062517 -0.062518
1 1 1.000000 1.000000
1 2 0.788941 0.788941
1 3 -0.000000 -0.000000
1 4 -0.000000 0.000000
1 5 0.000000 0.000000
1 6 0.000000 0.000000
1 7 -0.000000 0.000000
1 8 0.000000 0.000000
1 9 0.329268 0.329268
1 10 0.391443 0.391443
1 11 0.329268 0.329268
1 12 0.391443 0.391443
2 2 1.000000 1.000000
2 3 0.000000 -0.000000
2 4 -0.000000 -0.000000
2 5 0.000000 0.000000
2 6 0.000000 0.000000
2 7 -0.000000 0.000000
2 8 -0.000000 0.000000
2 9 0.511019 0.511019
2 10 0.690325 0.690325
2 11 0.511019 0.511018
2 12 0.690325 0.690325
3 3 0.177129 1.000000
3 4 -0.000000 0.000000
3 5 -0.000000 0.000000
3 6 0.219506 0.502141
3 7 -0.000000 0.000000
3 8 -0.000000 0.000000
3 9 -0.126986 -0.025610
3 10 -0.047951 -0.009642
3 11 0.051811 -0.247551
3 12 0.019564 -0.093205
4 4 0.177129 1.000000
4 5 -0.000000 0.000000
4 6 -0.000000 0.000000
4 7 0.219506 0.502141
4 8 -0.000000 0.000000
4 9 -0.027811 -0.273671
4 10 -0.010502 -0.103040
4 11 -0.034081 0.111660
4 12 -0.012869 0.042041
5 5 0.177129 1.000000
5 6 -0.000000 0.000000
5 7 -0.000000 0.000000
5 8 0.219506 0.502141
5 9 -0.011883 -0.059936
5 10 -0.004487 -0.022567
5 11 -0.114866 -0.073448
5 12 -0.043374 -0.027654
6 6 0.904436 1.000000
6 7 -0.000000 0.000000
6 8 0.000000 0.000000
6 9 -0.575744 -0.056653
6 10 -0.361553 -0.035577
6 11 0.234908 -0.547616
6 12 0.147517 -0.343890
7 7 0.904436 1.000000
7 8 0.000000 0.000000
7 9 -0.126093 -0.605398
7 10 -0.079183 -0.380175
7 11 -0.154519 0.247007
7 12 -0.097034 0.155115
8 8 0.904436 1.000000
8 9 -0.053878 -0.132588
8 10 -0.033834 -0.083262
8 11 -0.520793 -0.162478
8 12 -0.327045 -0.102032
9 9 1.000000 1.000000
9 10 0.684800 0.684800
9 11 0.120869 0.120869
9 12 0.307324 0.307324
10 10 1.000000 1.000000
10 11 0.307324 0.307324
10 12 0.606683 0.606683
11 11 1.000000 1.000000
11 12 0.684800 0.684800
12 12 1.000000 1.000000
Norm of the error: 2.624232
Does this look OK or does it need more points ? Is the error big ?
@scemama I've implemented ECPs. Is there any way to check for ECPs ?
No. The simplest way is to import the TREXIO file in another code, and check that the energy is correct. You can do it with quantum package with qp_import_trexio
, and the qp run print_energy
.
Does this look OK or does it need more points ? Is the error big ?
It is hard to tell... If you install the python interface to QMCkl, trexio-tools will use it to compute the AOs and MOs, so it will be orders of magnitude faster. Then, you can increase the number of points for the integration, ans see if it converges towards an identity matrix. But you can also compute the energy with quantum package to cross-check ;-)
Hi !
@scemama I'm almost done, I get the same energies as orca after doing qp_import_trexio
, and then qp run print_energy
for files without ECPs. I've also added an orca file to test the conversion h2o.json
. I'm now testing the ECPs with orca.
@q-posev Do I need to do anything else for the CI ? How do I set up the CI ?
Bravo @v1j4y !
The CI is already in place, see test.yml file. As I wrote above, for the CI we would need you to upload one (or a few) test ORCA JSON file(s) to the data
folder of this repo and then we can add the corresponding test lines (calling trexio convert-from -t orca ...
) to the CI. Unless there is a licensing issue which does not allow storage of ORCA JSON files in the open source repos.
For now we only test that converters "work" (i.e. they do not raise errors while processing output of a q-chem program and they do produce the TREXIO file in the end). The validity of the TREXIO file is only checked for GAMESS for now.
@scemama this is something related to non-regression of the converters which worries me sometimes (see my comments in #12), perhaps at some point we want to make sure that produced TREXIO files are valid and have not changed drastically compared to some fixed ref. We could do h5diff
on TREXIO HDF5 files or we could do check-basis
and check-mos
in the CI for all converters. What do you think?
@v1j4y Let me know if you need help with the CI.
@scemama this is something related to non-regression of the converters which worries me sometimes (see my comments in https://github.com/TREX-CoE/trexio_tools/issues/12), perhaps at some point we want to make sure that produced TREXIO files are valid and have not changed drastically compared to some fixed ref. We could do h5diff on TREXIO HDF5 files or we could do check-basis and check-mos in the CI for all converters. What do you think?
@q-posev I think it is a good idea to check that the files are OK. But I would use the text backend for that with diff, as text files are better handled with git.
I'm adding the ORCA test to CI on master
.
Thanks @q-posev !
Added everything except ECPs.
Converter uses
filename.json
orca data repository created usingorca_2json
to obtain the correspondingtrexio_file.h5
file. Note that theorca_2json
is the long term supported data repository generator for ORCA.orca_2json
. This will need to be implemented in ORCA before we can implement it here.