TREX-CoE / trexio_tools

Set of tools for trexio files
BSD 3-Clause "New" or "Revised" License

Updating the doc entry for MOTYPE #21

Closed neelravi closed 1 year ago

neelravi commented 1 year ago

The documentation entry for the option --motype has been updated.

Available options for GAMESS output files are: "RHF", "MCSCF", "GUGA", and "Natural"

scemama commented 1 year ago

Sorry for the delay! I thought it was already merged...

q-posev commented 1 year ago

@neelravi @scemama

Sorry, I am a bit confused here. What was the motivation for this MR? In the original code, the user could at least see the possible options for mo_type.

What should they do now? A list of possible mo_type options should be documented somewhere. The line "For example, GAMESS has RHF, MCSCF, GUGA, and Natural as possible MO types" does not help.

neelravi commented 1 year ago

@q-posev @scemama I agree that a comprehensive list of mo_types should be made available. Currently, the help-entry shows:

-x, --motype=MO_TYPE Type of the molecular orbitals. For example, GAMESS has RHF, MCSCF, GUGA, and Natural as possible MO types.

Claudia had asked for this change. It prevents the user from giving a wrong mo_type option for an incompatible input file type (e.g. GUGA for molden or pyscf). The updated doc entry gives only a suggestion of what should go into mo_type.

I have a suggestion to eliminate this issue completely by making a small script (something like this):

import resultsFile

# For each QM output file, report the detected file type and its available MO types.
for name in ("thiophene.g09.out", "thiophene.gms.out", "GAMESS_CAS.log"):
    qm_file = resultsFile.getFile(name)
    print('recognized as', str(qm_file).split('.')[-1].split()[0])
    print(qm_file.mo_types)

This would run before trexio convert-from and show the user which mo_types are available.

q-posev commented 1 year ago

Hm, still not getting this.

If RHF, MCSCF, GUGA, Natural are the only options for mo_type, they should be listed just like all other options are listed, between square brackets (like the outdated [natural | initial | guga-initial | guga-natural] line that you removed). The new help message is confusing: should the user use upper case for mo_type? Will natural work, or only Natural? If these options apply only to GAMESS, the corresponding logic should go inside the code, which should raise a clear error when a wrong combination of options is provided.

For the available mo_types for a given file: you can do a separate CLI like trexio get_mo_types <QM_FILE>. It should not happen during conversion to TREXIO.
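The "clear error on a wrong combination" idea could look roughly like the sketch below. It is only an illustration, not code from trexio_tools: the table ALLOWED_MO_TYPES and the function check_mo_type are hypothetical names, and only the GAMESS entries are taken from this thread. Matching is case-insensitive here, which is one possible answer to the natural vs. Natural question.

```python
# Hypothetical table of accepted MO types per input format.
# Only the GAMESS entries come from the discussion; other formats are
# deliberately left out, so any --motype given for them is rejected.
ALLOWED_MO_TYPES = {
    "gamess": ("RHF", "MCSCF", "GUGA", "Natural"),
}


def check_mo_type(back_end: str, mo_type: str) -> str:
    """Return the canonical spelling of mo_type, or raise ValueError."""
    allowed = ALLOWED_MO_TYPES.get(back_end.lower())
    if allowed is None:
        raise ValueError(
            f"No MO types are registered for input type '{back_end}'"
        )
    # Compare case-insensitively, but hand back the canonical spelling.
    by_lower = {name.lower(): name for name in allowed}
    try:
        return by_lower[mo_type.lower()]
    except KeyError:
        raise ValueError(
            f"Unknown MO type '{mo_type}' for {back_end}; "
            f"choose one of: {', '.join(allowed)}"
        ) from None
```

With this sketch, check_mo_type("gamess", "natural") returns the canonical "Natural", while check_mo_type("molden", "GUGA") fails with a message naming the input type instead of silently passing the value on.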

scemama commented 1 year ago

Maybe we should define a list of possible keywords for MO types in trexio_tools, and have a dictionary to convert from our own keywords to the ones specific to the codes.

For example, we could have in trexio_tools Initial | SCF | Natural and then decide to what it corresponds. For GAMESS, SCF would correspond to RHF, ROHF, UHF or MCSCF depending on the GAMESS output. It could also correspond to Kohn-Sham orbitals for instance.
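A minimal sketch of the dictionary idea, under stated assumptions: the names GENERIC_MO_TYPES, CODE_SPECIFIC and resolve_mo_type are hypothetical, the GAMESS labels are the ones mentioned in this thread, and the rule "take the first candidate found in the output" is an assumption made only to keep the example short. The list of available labels would come from the parsed file (for instance the mo_types attribute used in the script above).

```python
# Generic keywords exposed by the tool (as proposed in the comment above).
GENERIC_MO_TYPES = ("Initial", "SCF", "Natural")

# Hypothetical mapping: generic keyword -> candidate code-specific labels.
# The order of the candidates is an assumption, not an established rule.
CODE_SPECIFIC = {
    "gamess": {
        "SCF": ("RHF", "ROHF", "UHF", "MCSCF"),
        "Natural": ("Natural",),
    },
}


def resolve_mo_type(code, keyword, available):
    """Translate a generic keyword into a label present in the output file.

    `available` is the collection of MO-type labels found in the file.
    """
    if keyword not in GENERIC_MO_TYPES:
        raise ValueError(
            f"Unknown keyword '{keyword}'; "
            f"choose one of: {' | '.join(GENERIC_MO_TYPES)}"
        )
    candidates = CODE_SPECIFIC.get(code.lower(), {}).get(keyword, ())
    for label in candidates:
        if label in available:
            return label
    raise ValueError(
        f"No '{keyword}' orbitals found for {code}; "
        f"the file provides: {', '.join(available)}"
    )
```

For a GAMESS output that only provides MCSCF and Natural orbitals, resolve_mo_type("gamess", "SCF", ["MCSCF", "Natural"]) would pick "MCSCF". A real implementation would need a rule for outputs that contain several SCF-like sets at once.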