Matgenix / turbomoleio

Turbomoleio is a python package containing a set of tools for the generation of inputs and parsing of outputs for the TURBOMOLE quantum chemistry package.
https://matgenix.github.io/turbomoleio/
GNU General Public License v3.0
19 stars 9 forks source link

User defined core potentials #26

Closed wladerer closed 1 year ago

wladerer commented 2 years ago

Hi everyone,

I wanted to be able to define effective core potentials on specific atoms. I'm not the most experienced programmer, so I do not claim to have done this in the most effective/safe way possible. My implementation is almost identical to the _set_basis and _define_basis_sets methods:

     def _set_ecp(self, atom_type, ecp):
        r"""
        Set the atomic core potentials.

        Args:
            atom_type (str): defines the type type of atom(s) addressed. Can be "all",
                a list "1,2,4-6" or the atomic specie "\"c\"" (quotes are required).
            ecp (stR): the type of core potential.

        Returns:
            None
        """
        self._sendline("ecp", action="go to core potential menu")

        self._expect(
            ["ENTER A SET OF ATOMS TO WHICH YOU WANT TO ASSIGN PSEUDO POTENTIALS*"],
            action="set ecp",
        )

        self._sendline("{} {}".format(atom_type, ecp))

        case = self._expect(
            [
                "ATOMIC ATTRIBUTE DEFINITION MENU.*",
                "THERE ARE NO DATA SETS CATALOGUED IN FILE",
            ],
            action="set ecp",
        )

        if case == 1:
            raise DefineParameterError(
                "Define did not recognize the core potential {} for {}".format(ecp, atom_type)
            )

    def _define_core_potentials(self):
        """
        Define the core potentials.

        This uses the "ecp_atom" keywords in the parameters.

        Returns:
            None
        """

        if self.parameters.get("ecp_atom", None) is not None:
            for atom_type, ecp in self.parameters["ecp_atom"].items():
                # convert to string in case an integer slips through since it can
                # also represent an index.
                atom_type = str(atom_type).strip()
                # if it is a symbol and does not already contain quotations add them.
                if re.fullmatch("[A-Za-z]+", atom_type):
                    atom_type = '"{}"'.format(atom_type)

                self._set_ecp(atom_type, ecp)

This would then necessitate updating the run_full function as follows:

                try:
                    self._initialize_control(update=False)

                    # coordinate menu
                    self._geometry_menu(new_coords=True)

                    # leave the previous menu and move to the next one
                    self._switch_to_atomic_attribute_menu()

                    # atomic attribute menu - only basis set is used here
                    self._define_basis_sets()

                    # core potential definition menu
                    self._define_core_potentials()

etc
etc
...

One potential issue: When I run define through the command line interface and I try to apply an ecp to an atom that is not found in the coord file, there is no error message, it just passes nothing. I hope that this wont create an infinite loop.

Let me know if I can elaborate or make any of my points more clear. Thank you!

wladerer commented 2 years ago

I've run a test case with the implementation, i will create a fork and submit a pull request

davidwaroquiers commented 2 years ago

Hello @wladerer ,

Thanks a lot for your contribution! We will look at your pull request as soon as possible. One first comment I would make is to make a unit test and possibly an integration test. This is essential to make sure all the features continue to work in the future. This may be difficult for you to dig in but you can probably do something similar to what is done for the other inputs as well, and as you have run a test case, you can use that one for the tests. Do not hesitate to ask if you have issues. Documentation on how to implement unit tests for the DefineRunner are found here: https://matgenix.github.io/turbomoleio/developer/definerunner.html#tests And general documentation on how to implement unit and integration tests are found here: https://matgenix.github.io/turbomoleio/developer/testing.html When this is done you can also add your feature to the documentation.

About your potential issue, I would need to dig a little bit more what could be done to prevent cases where the user provides atoms that are not found. We will see about that when reviewing.

One last point (which is not strictly needed at this stage and I would consider merging without that of course), when running with specific core potentials, are there additional outputs in the log files that could/should be parsed ? In that case, it would be interesting to also add this in the output parsing module. Again, this is not essential but may be interesting to "close the loop".

Best,

David

wladerer commented 2 years ago

Hi, my pleasure!! This package has made my life tremendously better and I want to help when I can. I'm not the most experienced developer, just a chemist with some spare time. I will go about running the unit and integration tests soon.

For the "atom not found" we could perhaps catch that ourselves? If we were able to sift through the coord file or other inputs to check if that atom exists in the first place? The other thing is that the ecp nicknames are quite fickle, so it might be better to treat the error message, rather than anticipating input errors. I will think on that.

As for closing the loop, the only thing I can really think of is how the control file would output the basis set for each atom + the core potential below that entry. I will sift through some of my previous jobs to see if changing the core potential has any signature in the other output files.

wladerer commented 2 years ago

Ok, so I was unable to find a runtest.sh script in the project files. Instead, I ran

pytest --html=~/Desktop/report.html

In the root directory of the project. I hope this is an acceptable method.

I am using TURBOMOLE V6.6 from 2014, for reference.

Attached is the PDF of the report, and I will paste a google drive link to the HTML document:

https://drive.google.com/file/d/1V-Wk_T7RU9oaYlXE0Qp1TmhadwAsneTY/view?usp=sharing