aiidateam / aiida-common-workflows

A repository for the implementation of common workflow interfaces across materials-science codes and plugins
https://aiida-common-workflows.readthedocs.io
MIT License
52 stars 31 forks source link

Adding a `molecule` protocol #99

Open zooks97 opened 3 years ago

zooks97 commented 3 years ago

Samuel and I have been fighting with Abinit to do the H2 and CH3 molecule calculations, and many situations arose where we wished there was some way we could know if the structure in question was a molecule or not.

For example, when simulating molecular systems, we would like to adjust the kinetic-energy cutoff, add a macroscopic dielectric constant, and in general treat the system quite differently to how a true periodic system would be treated.

This treatment is necessary simply because Abinit has not put significant development time to ensuring that systems like molecules behave well in their code because, frankly, it's not a common use case.

The solution we propose is to add a protocol for the Abinit relax workchain called molecule or similar which does the above adjustments and, in our testing, behaves much better for the example systems. However, this does entail adding molecule as an accepted protocol input to the CLI, and as this protocol may not be need to be supported by all codes, it could cause some frustrating errors to users.

Would it be possible for us to add this protocol? Is anyone else facing a similar situation with the H2 and CH3 examples?

sphuber commented 3 years ago

Didn't we agree that for the molecules, we would set the pbc to [False, False, False] on the StructureData node? If so, you could simply detect that in your code and adjust the protocol settings accordingly

bosonie commented 3 years ago

Didn't we agree that for the molecules, we would set the pbc to [False, False, False] on the StructureData node? If so, you could simply detect that in your code and adjust the protocol settings accordingly

This was my suggestion, however, at the end, we did not agree on anything, hoping that setting a big enough cell was sufficient to run smoothly for all codes. Apparently this is not the case for abinit and some more tweaks are needed. I still believe that signaling molecules with pbc to [False, False, False] is the correct way to go and we can agree on that in the next meeting. This means that abinit needs to modify their plugin to allow pbc = [False, False, False]. This was another problem brought up the last time as far as I understood. To create a separate protocol, in my opinion, should be avoided, in order to maintain the interface consistent for all plugins.

broeder-j commented 3 years ago

I can tell that for fleur we face the same problems for the H2 and NH3 cases. You get some result, but it is certainly not a good one, or one I would trust. If you do it carefully one can apparently get results which are ok, but doing so is beyond a simple protocol and a 'dumb' plugin. Also I do not want to invest a lot of time implementing things that nobody will use in the end, since this is not a common usecase. For me it is fine to say, look the workflow produces something but you should not use it for this that way. If you want to do it the proper way is different. Also for H2 dissociation it is known that beyond DFT methods give results which are more correct. So calculating something which is known to be bad in a simpl

So, pbc (False, False, False) would be enough to act on, for me and implement an internal 'molecule protocol'. But when doing so I would have to change the StructureData node before given it to FLEUR, add pbc (True, True, True) and vacuum box.

bosonie commented 3 years ago

Also for H2 dissociation it is known that beyond DFT methods give results which are more correct. So calculating something which is known to be bad in a simpl

In fact that's something that we might think to show: we can use the exact same workflow with orca and gaussian with a different level of theory and compare with the correct curve. If not, we will show only the results of the energy minimum, which is good enough in DFT

So, pbc (False, False, False) would be enough to act on, for me and implement an internal 'molecule protocol'. But when doing so I would have to change the StructureData node before given it to FLEUR, add pbc (True, True, True) and vacuum box.

I thought we agreed on a box 15,15,15 because it was big enough for everybody. Also, I believe more and more that not accepting pbc (False, False, False) is a problem of the plugin and the correct way to fix it is to manage it internally. Molecules are very well defined physical entities and they have pbc (False, False, False). Any plugin should accept it. If then your code needs to set PBC to run molecules, then internally your plugin should decide what to do. For instance it will set pbc in the input file of the code and through a warning or something similar. That's my idea and I'll probably implement it in my Siesta plugin

broeder-j commented 3 years ago

I thought we agreed on a box 15,15,15 because it was big enough for everybody. Also, I believe more and more that not accepting pbc (False, False, False) is a problem of the plugin and the correct way to fix it is to manage it internally. Molecules are very well defined physical entities and they have pbc (False, False, False). Any plugin should accept it. If then your code needs to set PBC to run molecules, then internally your plugin should decide what to do. For instance it will set pbc in the input file of the code and through a warning or something similar. That's my idea and I'll probably implement it in my Siesta plugin

I disagree partly with this, because the plugin should be as dumb as possible. I.e if you give it a StructureData on whatever the plugin will NOT change it before given it to the code. Meaning it is the users responsibility to prepare the structure right, i.e set films, or molecules up the right way for a given code to be useable. And therefore this part would go into the code specific input builder of the common workflows (or external tools) and not in the plugin. The h2 box is currently still 1x1x1 and (True, True, True), or?.

With films we would face much larger issues, because there are many more possible setups. Also there might be a certain setup that is best for the a given code, but that still does not has to mean there are no occasion a user wants to do something different. Therefore always auto doing something in the plugin is in my view a bad thing, because it would restrict the user in the end.

Also there are certain features in StructureData that certain plugins will not support, i.e partial occupations, explicit vacancies, or having several atoms on a given Site. So in my view it is better to strictly reject things in the plugin which are surely not supported then assuming something.

bosonie commented 3 years ago

I disagree partly with this, because the plugin should be as dumb as possible. I.e if you give it a StructureData on whatever the plugin will NOT change it before given it to the code. Meaning it is the users responsibility to prepare the structure right, i.e set films, or molecules up the right way for a given code to be useable. And therefore this part would go into the code specific input builder of the common workflows (or external tools) and not in the plugin.

I'm not suggesting to change the structure. However, many codes claim to support calculation on molecules but do not allow pbc (False, False, False) in any form. In this case seems more logic to me to treat the structure in the same way whether pbc (False, False, False) or pbc (True, True, True) is set in the StructureData. If you want, PBC is a concept that does exist for the StructureData but not for the code. To rise error if pbc (False, False, False) is an equally valid choice I guess. But we go back to the same problem: to set pbc to (True,True,True) in our test cases will then impede to distinguish real molecule from else, and then impede to tune the parameters accordingly.

The idea of a separate protocol it seems a bit against the idea to maintain the interface consistent for all plugins. Another idea might work though. How about an ElectronicType MOLECULES?

The h2 box is currently still 1x1x1 and (True, True, True), or?.

It should be 15 15 15 and (False, False, False).