votca / xtp

GW-BSE for excited state Quantum Chemistry in a Gaussian Orbital basis, electronic spectroscopy with QM/MM, charge and energy dynamics in complex molecular systems
29 stars 15 forks source link

Reworking options #702

Closed JensWehner closed 3 years ago

JensWehner commented 3 years ago

At the moment we have the following problems:

a) The qmpackage options are dealt with in settings and are not compatible with the rest of the options parsing b) the option parsing cannot handle two identical tags, often that is exactly what you want, but for example for <segment_pairs> in neighborlist it is important c) Since the removal of packages when developer add a new option to gwbse 3-4 xml files have to be upated iqm.cml, qmmm.xml,dftgwbse.xml,eqm.xml, which is often forgotten and breaks stuff d) Users who want to change options have to copy the file from $VOTCASHARE//share/votca/xtp/xml which is not really userfriendly e) The current design does not allow for the absence of an option, e.g. omittint the <ecp> tag to not use ecps does not work instead we need an auxiliary <use_ecp> tag. This is not nice for the user.

Solutions:

1) To some extend we have to bring back packages but hide their existence from the user. This solves c) Instead of using it as the value of a tag I intend to create a attribute package= which will read the package xml from a subfolder. 2) Add a --printOptions to xtp_run xtp_tools xtp_parallel which print a nice xml-options structure, with all packages " links" resolved. This solves d) and mitigates the effects of 1) 3) remove settings class and add qmpackage options into packages. Solves a) 4) Rework how default options are modified by user options, at the moment we only look for the first occurence of a user option, potentially we might introduce another attribute allow_duplicates which specifically allows entries with the same tag. Should solve b) 5) The reason why omitting tags does not work is because we update the defaults with the user options, as we have a default value for everything, So the default value for <ecp> should be "", I am not super happy about that instead I would add to the default attribute two keywords __REQUIRED__ and __OPTIONAL__, where the former throws an error if the user has not specified a value( i.e. there is no useful default value) and the latter removes the Property for that tag unless the user specifies a value. In that case the option code could again use if(prop.exists("ecp") do...sth instead of having to do sth. else 6) We could also just use "" as the default for options that are required or optional and then let the specific code handle the case if it is optional or required. This has two disadvantages. Sometimes it might lead to error messages for the user in the middle of the calculation, which can be annoying because some options are parsed down to filters etc... and for inexperienced developers it might also be problematic, because prop.get("value").as<int>() on a "" will lead to quite unpleasant error messages.

All of it has to be then also be parsed by the Python scripts for making the website @felipeZ

Please discuss.

rubengerritsen commented 3 years ago

What do we gain from the attribute instead of the value of a tag for the package in 1? 2, 3 and 5 seem great ideas, maybe for 5 just use a bool attribute required=true for a particular setting. I agree that 6 isn't a very nice option.

I am not really sure about 4, introducing an extra attribute for this seems unnecessary, we know for which options we expect duplicates, can't we just let the code take care of those cases instead of the user?

From the overall workflow perspective the user is still confronted with a lot of different option files and even if we have the packages again, there is still, in my opinion, a problem of nested options. I am wondering can't we make the options parser smart enough to read everything from one (or at least just a few) option files. Then the overall options file would look something like


<options>
    <qmmm help="Executes qmmm calculations for individual molecules and clusters" section="sec:qmmm">
        .
        .
    </qmmm>
    <iqm help="QM calculator for pairs" section="sec:iqm">
        .
        .
    </iqm>
    <dftpackage>
        .
        .
    </dftpackage>
    <gwbse>
        .
        .
    </gwbse>
</options>

and the qmmm and the iqm take the info for the gwbse and dft part from the corresponding blocks in this file. This would keep all the info in one file, so no duplicates are needed and from a user perspective once you have created such a file you can do all your calculations within a project with the same file (fromt the dftgwbse untill the qmmm) and hence same settings. Obviously it should also be possible to leave a section out of the option file if you don't need it.

felipeZ commented 3 years ago

I like @rubengerritsen idea about using a single (or few options files). But I will some non-programmer user what they think about it :)

JensWehner commented 3 years ago

I do not think that is a good idea, iqm,eqm and qmmm and dftgwbse will probably need different gwbse options. Also that will break encapsulation of the options for calculators. Also in qmmm multiple regions might have different gwbse options.

I still prefer one option file per calculator. Also remember that with the defaults most option files will be rather small.

JensWehner commented 3 years ago

Concerning 4. This would allow us to check if a user specified an option twice.

JensWehner commented 3 years ago

Concerning 1) attribute vs tag for link, if a user copies this file for some reason it is not a valid option. Also an attribute has a name and thus is better documented, also for all other options metadata we use attributes.

rubengerritsen commented 3 years ago

I do not think that is a good idea, iqm,eqm and qmmm and dftgwbse will probably need different gwbse options. Also that will break encapsulation of the options for calculators. Also in qmmm multiple regions might have different gwbse options.

I still prefer one option file per calculator. Also remember that with the defaults most option files will be rather small.

So for me your last statement here is one of the reasons why I would like a combined file. I have in my mind for example the input file of orca, if you want to ask something extra of orca or change more defaults, you add an extra section about that part with the options and it just works. It would be great if votca-xtp could work in a similar way, so from the user perspective you always only need to worry about one file. It is almost already possible, because the different calculators open the options tag then search for their "name tag" (e.g. qmmm) and read the options from there, so you can already have the options of two different calculators in one option file, the only thing that would need to change is the "nestedness". So instead of looking for the gwbse or dft info inside the qmmm tag for example, it should start searching from the options tag and this would already work.

About point 4; if you have this as an attribute the user can change it, which might lead to errors if it is set to true for something that shouldn't have duplicates, it seems to me that we are better off handling this internally far away from the user.

JensWehner commented 3 years ago

1) I do not see the reason to agglomerate multiple small options file into one. Yes it is possible but I do not see the benefit, if qmmm needs other gwbse options than iqm, what do we do then? We loose the reproducibility because the user has to edit the options file. The "nestedness"is exactly what I think we should keep.

4) Yes but how do you we determine, if duplicate entries are allowed? We have lots of metadata in the default files already, it will not be exposed to the user, because the user will not see it using the --printòptions

rubengerritsen commented 3 years ago

Does that mean that you want to hide all the attributes from the user? Because how does the "linking" to the packages work then if you want to do that via attributes?

Maybe I don't understand well enough what you mean with

To some extend we have to bring back packages but hide their existence from the user. This solves c) Instead of using it as the value of a tag I intend to create a attribute package= which will read the package xml from a subfolder.

How would this look from the user side of things? How for example would a qmmm option file look?

JensWehner commented 3 years ago

yes , I think I have not explained it well enough.

So at the moment the files in the share directory serve two purposes:

I want to separate that. If the user does not have to copy the option files anymore, I can add lots of attributes to it to validate input etcc, without the user ever seeing it (because they use the --printoptions from the calculator). So I basically want to allow the xml files in the share directory to contain links to factor out common parts.

The user will see exactly the same options as at the moment (okay qmpackages and neighborlist and qmmm will work).