TREX-CoE / trexio

TREX I/O library
https://trex-coe.github.io/trexio/
BSD 3-Clause "New" or "Revised" License

Example for determinants #167

Open scemama opened 3 weeks ago

scemama commented 3 weeks ago

Hello,

I think it might also be beneficial for users to have a direct example showing how to extract and register the determinants from a PySCF script.

For instance, I not only needed to confirm the 1-based indexing for the determinant list, but also to ensure that my TREXIO file contained the correct number of states (which, to be honest, was not obvious in my case since I am working with the ground state, and this section in the TREXIO documentation pertains to excited states).

The example determinant file you already have does not mention this and might not provide enough detail on how to do this from PySCF. Additionally, converting to a bit-field format may not be obvious, depending on the user's level of knowledge. I believe you could directly use the relevant part of my PySCF script (heavily inspired by your resultsFile work), which is attached.

# Determinants in TREXIO
import numpy as np
import trexio
from trexio_tools.group_tools import determinant as trexio_det  # converts orbital lists to bit-field format

# Number of 64-bit integers needed to store one spin string
int64_num = (mo_num - 1) // 64 + 1

det_list = []
for occsa, occsb, _, _ in data_all_sorted:  # occsa is the alpha orbital list, occsb the beta one
    occsa_upshifted = [orb + 1 for orb in occsa]  # +1 because the PySCF orbital list is 0-based
    occsb_upshifted = [orb + 1 for orb in occsb]
    det_tmp = []
    det_tmp += trexio_det.to_determinant_list(occsa_upshifted, int64_num)
    det_tmp += trexio_det.to_determinant_list(occsb_upshifted, int64_num)
    det_list.append(det_tmp)

offset_file = 0
trexio_file.set_state(0)  # needed, otherwise the QP2 export fails
trexio.write_state_num(trexio_file, 1)  # needed, otherwise the QP2 export fails
n_chunks = 1  # all determinants are written in a single chunk
for _ in range(n_chunks):
    trexio.write_determinant_list(trexio_file, offset_file, num_determinants, det_list)
    offset_file += num_determinants  # advance by the number of determinants just written

ci_coeff_list = []
for occsa, occsb, ci_coeff, _ in data_all_sorted:
    ci_coeff_list.append(ci_coeff)

if len(ci_coeff_list) == num_determinants:
    dset = np.array(ci_coeff_list)
    # Coefficients start at offset 0, like the determinant list
    trexio.write_determinant_coefficient(trexio_file, 0, num_determinants, dset)
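
The snippet above writes everything in a single chunk. For larger determinant lists, the offset bookkeeping could look like the sketch below. The helper `chunk_ranges` and the sample data are hypothetical and only illustrate how an (offset, count) pair might be derived for each write; the actual TREXIO call is shown as a comment.

```python
def chunk_ranges(total, chunk_size):
    """Yield (offset, count) pairs that cover `total` items in order."""
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    offset = 0
    while offset < total:
        count = min(chunk_size, total - offset)
        yield offset, count
        offset += count  # advance by the number of items just handled


# Hypothetical data: 10 determinants, written 4 at a time.
det_list = [[i, i] for i in range(10)]
written = []
for offset, count in chunk_ranges(len(det_list), 4):
    chunk = det_list[offset:offset + count]
    # With an open file, this is where one would call, for example:
    # trexio.write_determinant_list(trexio_file, offset, count, chunk)
    written.append((offset, count))
```

The offset advances by the number of determinants written rather than by the number of chunks, so the last, shorter chunk lands directly after the previous one.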

Best

_Originally posted by @NastaMauger in https://github.com/TREX-CoE/trexio_tools/issues/45#issuecomment-2466276949_
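
For readers unfamiliar with the bit-field format mentioned above, here is a minimal pure-Python sketch of the idea. It is not the trexio_tools implementation: `pack_occupation` is a hypothetical helper, and the layout (orbital position k stored as bit k % 64 of word k // 64, each word reinterpreted as a signed 64-bit integer) is an assumption made for illustration.

```python
def pack_occupation(orbitals, int64_num, one_based=True):
    """Pack occupied orbital indices into `int64_num` signed 64-bit words.

    Assumed layout (illustrative only): position k -> bit k % 64 of word k // 64.
    """
    words = [0] * int64_num
    for orb in orbitals:
        pos = orb - 1 if one_based else orb
        if not 0 <= pos < 64 * int64_num:
            raise ValueError(f"orbital index {orb} does not fit in {int64_num} word(s)")
        words[pos // 64] |= 1 << (pos % 64)
    # Reinterpret each unsigned word as a signed int64 (bit 63 set -> negative value)
    return [w - (1 << 64) if w >= (1 << 63) else w for w in words]


mo_num = 70                          # hypothetical number of molecular orbitals
int64_num = (mo_num - 1) // 64 + 1   # words needed per spin string

alpha = [0, 1, 64]                   # 0-based occupied orbitals, PySCF style
beta = [0, 2]

# One row per determinant: alpha words followed by beta words
row = (pack_occupation([o + 1 for o in alpha], int64_num)
       + pack_occupation([o + 1 for o in beta], int64_num))
```

Each determinant therefore occupies 2 * int64_num integers, which is why the script above concatenates the alpha and beta conversions before appending to det_list.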

q-posev commented 3 weeks ago

I think it would make sense to include a complete example (including the call to CASSCF/CASCI module of PySCF) and not the truncated part. Or even better - show how to write the HF wave function first and the CI one second so that we have a complete pipeline. Then we can do the same for QP2 so that the users have concrete use cases for CI I/O in two different codes.

NastaMauger commented 3 weeks ago

@q-posev Totally agree, but I think it would make more sense to register the mcscf object, wouldn't it?

I am trying to work out which data map sensibly between the mcscf object and trexio. In the meantime, I can create these two examples!

q-posev commented 3 weeks ago

Indeed, I think what @scemama meant is adding an example using low-level PySCF functionalities like you did in trexio-tools (accessing outputs/internals of PySCF objects). But PySCF users can use high-level functions operating on mcscf or other PySCF-native objects. My main concern is that if important bug fixes/additions go into pyscf-forge, the TREXIO documentation will become outdated. So ideally, examples in the TREXIO documentation should point to the PySCF documentation in the future in order to stay in sync :-)

q-posev commented 3 weeks ago

@scemama @NastaMauger we have an example of determinants I/O in the dedicated trexio-tutorials repo here. I believe any examples should go there.

NastaMauger commented 2 weeks ago

Here are my scripts, which export all the necessary information from RHF to trexio and also create a det.hdf5 file using specific functions. Do they meet your requirements for the example documentation? Since this repo does not use the pyscf-forge one, it was not easy to make the script as small as possible until all functionalities are merged into pyscf-forge. Attachments: det_pyscf_to_trexio.txt, det_pyscf_to_trexio_qp2.txt

PS: Both scripts are compatible with 0-based indices and the electron group information.

q-posev commented 2 weeks ago

Thank you for sharing the scripts @NastaMauger !

For pedagogical purposes, it might be better to have your script converted into a Jupyter notebook. I also noticed that some functions appear outdated (e.g. read_det_trexio takes a filename argument but uses trexio_file internally, and occsa_upshifted is undefined), so I am not sure that your example will work. Could you please check?
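
One way to check an export script of this kind is to decode the stored bit fields back into orbital lists and compare them with the occupations that PySCF produced. The sketch below is pure Python and independent of the library's own conversion helpers; `unpack_occupation` and `split_determinant` are hypothetical names, and the layout (bit k % 64 of word k // 64, alpha words followed by beta words) is assumed for illustration.

```python
def unpack_occupation(words, one_based=False):
    """Decode a list of 64-bit words into a sorted list of occupied orbital indices."""
    shift = 1 if one_based else 0
    orbitals = []
    for k, word in enumerate(words):
        word &= (1 << 64) - 1  # view a signed int64 as its unsigned bit pattern
        pos = 0
        while word:
            if word & 1:
                orbitals.append(64 * k + pos + shift)
            word >>= 1
            pos += 1
    return orbitals


def split_determinant(row, int64_num):
    """Split one determinant row into (alpha, beta) 0-based orbital lists."""
    alpha = unpack_occupation(row[:int64_num])
    beta = unpack_occupation(row[int64_num:2 * int64_num])
    return alpha, beta


# Hypothetical row for int64_num = 2: alpha words [3, 1], beta words [5, 0]
alpha, beta = split_determinant([3, 1, 5, 0], 2)
```

Comparing `alpha` and `beta` with the original occsa and occsb lists for a few determinants would catch both an off-by-one in the indexing and a swapped spin ordering.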

On the other hand, not sure if we want to keep the call to trexio-tools as in your example (assuming that pyscf-forge integration will be completed soon and our local converter will be deprecated). If you use the recently implemented function from pyscf-forge PR, do you obtain the same results? I know that their function is more advanced than ours (e.g. it produces 2-electron integrals) so perhaps we should stick to it.

NastaMauger commented 2 weeks ago

@q-posev Sorry, I'm getting confused between trexio, trexio-tools and pyscf-forge/trexio. I've written two examples (CASCI and FCI) that use my PR from pyscf-forge, available here. However, since it hasn't been merged yet, I wrote the two previous scripts in this thread with all the functions defined inside. Maybe these two examples from my PR can be used here?

Let me know if this suits your requirements.

q-posev commented 2 weeks ago

@NastaMauger Yes, I am getting lost with all the discussions in parallel too :-) I would prefer to finalize your pyscf-forge PR first (because there might be changes introduced in the function signatures etc.) and then adapt your selected CI and FCI scripts (or maybe make one script with an optional sCI/FCI switch).

Then we will put it in the trexio-tutorials and make sure that it finds its way into the official TREXIO documentation!

NastaMauger commented 2 weeks ago

@q-posev The PR has been merged. Do the two examples attached to the PR description meet your requirements for the example you would like to include in this repo?

q-posev commented 2 weeks ago

Thank you @NastaMauger for your very important contribution on this!

Would it be possible to make your two scripts from the PR into one script with a conditional if/else block to choose between FCI and selected CI? Something like:

if do_fci:
    <your code for pyscf FCI preparation and kernel here>
elif do_sci:
    <your code for pyscf selected CI preparation and kernel here>

This way it will be easier to convert into a Jupyter notebook.
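
A slightly more concrete version of such a switch might look like the sketch below. The function names, the `--method` flag and the `SOLVERS` table are hypothetical. The FCI branch uses the standard `pyscf.fci.FCI(mf).kernel()` call on a converged mean-field object, while the selected CI branch is deliberately left as a stub, since its details belong to the scripts from the PR.

```python
import argparse


def run_fci(mf):
    """Full CI on top of a converged PySCF mean-field object `mf`."""
    from pyscf import fci  # imported lazily so the switch itself does not require PySCF

    solver = fci.FCI(mf)
    energy, ci_vector = solver.kernel()
    return energy, ci_vector


def run_sci(mf):
    """Stub: the selected CI preparation and kernel would go here."""
    raise NotImplementedError("fill in the selected CI code from the PR scripts")


# Hypothetical dispatch table replacing the do_fci / do_sci booleans
SOLVERS = {"fci": run_fci, "sci": run_sci}


def parse_args(argv=None):
    parser = argparse.ArgumentParser(description="Export CI determinants to TREXIO")
    parser.add_argument("--method", choices=sorted(SOLVERS), default="fci",
                        help="CI flavour to run before exporting")
    return parser.parse_args(argv)


def select_solver(argv=None):
    """Return the function matching the requested method."""
    return SOLVERS[parse_args(argv).method]
```

In a notebook, the argparse part would simply be replaced by a variable such as `method = "fci"` in the first cell, followed by `SOLVERS[method](mf)`.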

Would you mind creating a PR in the trexio-tutorials repo and not here? Once we have a notebook there - I will make necessary changes so that TREXIO documentation webpage is synced with trexio-tutorials. Thanks!

NastaMauger commented 2 weeks ago

@q-posev Sure. I have never used Jupyter. Let me look into it so I can prepare a clean notebook for these new features!

q-posev commented 2 weeks ago

@NastaMauger thank you! Don't worry about Jupyter: if you have a working Python script with the aforementioned changes, I can transform it into a notebook myself. We store notebooks in the markdown format and not in the conventional ipynb format because the former is git-friendly, unlike the latter. There is a section in the trexio-tutorials README about that if you are interested.