TREX-CoE / trexio_tools

Set of tools for trexio files
BSD 3-Clause "New" or "Revised" License
18 stars 8 forks source link

NAO functionality, FCIDUMP & spherical->cartesian converter #30

Closed joguenzl closed 10 months ago

joguenzl commented 10 months ago

This pull request contains a number of smaller changes.

The converter from TREXIO to FCIDUMP is implemented into the Python codebase. The Fortran implementation is removed since it did not handle core orbitals correctly and is now obsolete. The new implementation was tested extensively with FCIQMC calculations.

The conversion script from spherical harmonics to cartesian angular functions is changed. The counting of orbitals in the target format is altered so that it works if the values in the ao_shell array are not monotonous, which is an implied assumption in the current implementation. Futhermore, basis_shell_factor and basis_r_power are modified during the conversion. This breaks compatibility with older code, but is to my understanding necessary as discussed in part with @scemama. This was tested with DMC calculations using QMCkl and QMC=Chem on NAO basis sets.

Functionality for NAO basis sets is added to check_basis. check_basis is made to print the diagonal entries at the end since those are the most important ones.

Some of the commits are about a converter script to generate a TREXIO file from FHI-aims output. This converter has since been made obsolete because FHI-aims now directly generates TREXIO output. It is kept in the commit history should it be of relevance for future testing.

A TREXIO file containing the water molecule in a NAO basis set (in spherical harmonics) is provided in the attached archive for testing: water-nao-tier2.zip

q-posev commented 10 months ago

@joguenzl thank you! Great contribution :+1: @scemama are you going to review it or should I do it?

scemama commented 10 months ago

Thanks @joguenzl ! @q-posev : I will review it, but as I am involved in the project with @joguenzl , it will be good it you can also make a second pass after me.

q-posev commented 10 months ago

OK @scemama, ping me when you are done. We need to upload also test files to the data folder and adapt the CI accordingly to test the new functionalities. Would be nice also to add a rigorous test of spherical->cartesian too since to avoid regressions in the future.

q-posev commented 10 months ago

@scemama I just remembered that delete_group in TEXT back end does not remove the sparse datasets, only the per-group file. I did not find a straightforward solution back in the days because TEXT back end was already very convoluted. For HDF5 back end it should work as expected, but would be nice if you or @joguenzl can verify.

As a temporary workaround for the text back end - remove the sparse data files manually in the converter. You should be able to get the file back end from the trexio.File class attributes.

scemama commented 10 months ago

@scemama I just remembered that delete_group in TEXT back end does not remove the sparse datasets, only the per-group file. I did not find a straightforward solution back in the days because TEXT back end was already very convoluted. For HDF5 back end it should work as expected, but would be nice if you or @joguenzl can verify.

As a temporary workaround for the text back end - remove the sparse data files manually in the converter. You should be able to get the file back end from the trexio.File class attributes.

OK. Let's not do this in this PR. For the moment, we can leave

    if trexio.has_ao_2e_int_eri(inp):
      print("Warning: Two-electron integrals are not converted")
      trexio.delete_ao_2e_int_eri(out)

    if trexio.has_ao_2e_int_eri_lr(inp):
      trexio.delete_ao_2e_int_eri_lr(out)

as it was before. I will try to find a solution in TREXIO for the text backend, and then we can make the change in trexio_tools.

q-posev commented 10 months ago

OK. Let's not do this in this PR. For the moment, we can leave

It does not make sense. The way it was before is wrong and broken. It was removing the dataset (ao_2e_int_eri), but this functionality is not supported. You can only remove groups (ao_2e_int), so you need to do this:

    if trexio.has_ao_2e_int(inp):
      print("Warning: Two-electron integrals are not converted")
      trexio.delete_ao_2e_int(out)

And remove manually sparse dataset files in case of text back end.

scemama commented 10 months ago

OK. Let's not do this in this PR. For the moment, we can leave

It does not make sense. The way it was before is wrong and broken. It was removing the dataset (ao_2e_int_eri), but this functionality is not supported. You can only remove groups (ao_2e_int), so you need to do this:

    if trexio.has_ao_2e_int(inp):
      print("Warning: Two-electron integrals are not converted")
      trexio.delete_ao_2e_int(out)

And remove manually sparse dataset files in case of text back end.

OK, I didn't get it. So if we do:

    if trexio.has_ao_2e_int(inp):
      print("Warning: Two-electron integrals are not converted")
      trexio.delete_ao_2e_int(out)

and don't remove the files, the TREXIO file will be correct, and there will just be unused files present in the directory, but the group will not be accessible anymore. Correct?

q-posev commented 10 months ago

@scemama sorry for the confusion. Look at the current code (and also what you suggested above):

    if trexio.has_ao_2e_int_eri(inp):
      print("Warning: Two-electron integrals are not converted")
      trexio.delete_ao_2e_int_eri(out)

    if trexio.has_ao_2e_int_eri_lr(inp):
      trexio.delete_ao_2e_int_eri_lr(out)

The trexio.delete_ao_2e_int_eri(out) and trexio.delete_ao_2e_int_eri_lr(out) lines are wrong. If you have a look at the API - we do not have these functions (we only remove groups, not datasets). So this code will never work. The fact that this went unnoticed is because this part was never executed and Python is interpreted, not compiled. That's what I meant.

Now even if we use the existing function trexio.delete_ao_2e_int(out) instead of non-existing trexio.delete_ao_2e_int_eri(out), you still need to manually remove sparse dataset TXT files when TEXT back end is used.

scemama commented 10 months ago

@q-posev OK, so let's make the code correct by removing the group, and then we will fix the text back-end to remove the files.

joguenzl commented 10 months ago

The two points discussed above have been addressed.

q-posev commented 10 months ago

Thanks @joguenzl ! I can merge this PR as is, we have a problem in the CI due to a bug in PySCF->TREXIO converter, but this will be fixed elsewhere.