Feature/alternative bands wc

bosonie commented 3 years ago

This PR implements the common bands workchain following the idea expressed by @giovannipizzi in #222.

The main concept is to have the bands common interface that accepts only 3 inputs: 1) a list of kpoints for bands 2) a remote folder where a relax/scf calculation has been run. From there all the inputs needed for the bands calculation are extracted. 3) an engines dictionary with the code and resources to use.

On top of this we create a code-agnostic workchain called RelaxAndBandsWorkChain that runs first a relaxation/scf and subsequently a bands calculation. The use of seekpath for the automatic generation of the kpoints is implemented inside this higher level workchain.

bosonie commented 3 years ago

Some notes that must be remembered: 1) This implementation requires the modification of every implementation of the CommonRelaxWorkChain in order to return in output a RemoteData containing the remote folder of the last calculation run. This is required since, otherwise, there is no common way to access the remote data and therefore the code-agnostic workchain would not be able to call the common bands interface (that requires a remote folder). 2) The implementation posed the challenge to have an abstract class for the input generator that would not be subclass of ProtoclRegistry. In fact, for the bands common interface, the protocol is not needed. My implementation is the most stupid ever. I just copied the InputGenerator class into a new InputGeneratorNoProt class and removed the dependence of InputGeneratorNoProt on ProtoclRegistry. Maybe a smartest way is possible @sphuber 3) Tests and CLI are not implemented yet.

sphuber commented 3 years ago

The implementation posed the challenge to have an abstract class for the input generator that would not be subclass of ProtoclRegistry. In fact, for the bands common interface, the protocol is not needed. My implementation is the most stupid ever. I just copied the InputGenerator class into a new InputGeneratorNoProt class and removed the dependence of InputGeneratorNoProt on ProtoclRegistry. Maybe a smartest way is possible

I think we should simply remove ProtocolRegistry as the base class of the InputGenerator abstract class. We can then simply add this as another base class in generators that actually support protocols, in this case only CommonRelaxInputGenerator, i.e.:

class CommonRelaxInputGenerator(ProtocolRegistry, InputGenerator, metaclass=abc.ABCMeta):

zhubonan commented 3 years ago

by using the RemoteFolder as the input, I guess you meant for the code to locate the original inputs though the provenance graph and then generate the actual inputs from there?

bosonie commented 3 years ago

@zhubonan yes exactly. I'm waiting a small fix (#243) before the implementation fully works, but you can start to try it out

sphuber commented 3 years ago

One potential problem with this approach. I thought the idea for this setup was that it would allow the SCF (with relax) to be performed with one code and the bands computed with another. However, that would mean that the bands workchain generator can recreate the inputs from a remote folder that is created by an arbitrary plugin, which I don't think is possible. I think it is only feasible for a plugin to determine the inputs from a remote folder created by its own common relax workflow. So I guess the approach presented here either limits cross-code use or we drop this design (even though taken by itself I prefer this as well compared to the other).

bosonie commented 3 years ago

Thanks @sphuber for the comment. I think we can overcome the issue you mention inside the code-agnostic RelaxAndBandsBandsWorkChain. First, we should remember that it is impossible (as far as I know) to pass wave-functions or density matrix from one code to another, therefore the only useful info that can be used by one code that is calculated by another code is the relaxed_structure. We can therefore allow the request of a different code for bands in input of RelaxAndBandsBandsWorkChain and implement inside the workchain this logic: perform a scf with the code for bands (it might not be the same executable but with "code" I mean here the same implementation) on the relaxed structure and using the same input of the relax input generator (a part forcing the RelaxType to NONE) of the main relaxation. Then the use of the bands common interface become the usual since it is the same code that is used.

I actually believe that this is the best solution, even better than the other bands implementation (#235), because it guarantees consistency among the used inputs of the input generator among the two steps. In the other bands implementation it was up to the user to select again the same protocol etc. that was used for the relaxation also for bands.

giovannipizzi commented 3 years ago

Hi @sphuber, regarding your comment

I thought the idea for this setup was that it would allow the SCF (with relax) to be performed with one code and the bands computed with another.

I don't think this was the original reason for this design. The reason was to avoid to repeat again all inputs for the relaxation also for the bands, meaning the implementation needs to validate in some way that it's all compatible.

As a note, I don't think running bands with a different code than SCF is really an interesting usecase. On the other hand, I just realised that this design has one flaw - it forces to run the bands as a separate step from the SCF/relax. But maybe it's anyway what all codes do, so it shouldn't be a big deal?

zhubonan commented 3 years ago

I just realised that this design has one flaw - it forces to run the bands as a separate step from the SCF/relax

Indeed, CASTEP typically runs band in single calculation without restart, but running as two separate calculations is also fine (e.g. reusing the density). I also know that for VASP doing hybrid fucntional band structure is also a single calculation.

But even if a code doesn't need the first SCF calculation, the worst case would just be performing an extra calculation, and very often the remote folder can be reused to speed up the second one. So I think this design is fine here.

bosonie commented 3 years ago

On the other hand, I just realised that this design has one flaw - it forces to run the bands as a separate step from the SCF/relax. But maybe it's anyway what all codes do, so it shouldn't be a big deal?

This is what we agreed on in the first meeting that discussed the design. We want to have a stand alone workflow for the bands, not an extension of the relaxation. In any case, if one wants to do relax+bands it should use the code agnostic workchain. Does it matter that internally it separates the steps?

However I agree with Sebastiaan that having the possibility to have relaxation with one code and bands with another is an interesting case. So I would try to support if possible, do you believe that the idea I proposed is feasible?

giovannipizzi commented 3 years ago

Does it matter that internally it separates the steps?

Only if for a code there is no easy way to actually run it in two steps (or one wants to avoid queuing 2 jobs). But at the moment this does not seem a problem (just pointing out to avoid we discover too late that the design is restrictive

Regarding using different codes:

the approach @bosonie suggests would work (here we are assuming the use of the first code is just to get a relaxed structure)
if one really wants to reuse in some way the charge density, this might stil be OK - of course the plugin of the bands needs to know how to convert formats etc, but as long as it knows what to do, it can get the RemoteData of a different code without problems, I don't see a restriction with it

sphuber commented 3 years ago

I don't think this was the original reason for this design. The reason was to avoid to repeat again all inputs for the relaxation also for the bands, meaning the implementation needs to validate in some way that it's all compatible.

That was what was described in the original summary of the design meeting. It says:

Some suggestions for the bands workflow implementation coming from the working group meeting on 28th Sept 2021 [...are] to have a standalone workflow for bands rather than simply extend the CommonRelaxWorkChain. [...] This allows to relax with one code and obtain bands with another.

This will not be directly possible with the design of this workchain. That is to say, you cannot take the last RemoteData of a CommonRelaxWorkChain run with one code, and pass that to the CommonBandsWorkChain using another code. You would have to rerun an SCF with the second code from scratch just taking the relaxed structure and pass that to the CommonBandsWorkChain. This is exactly what the wrapping code-agnostic workchain will do, so it is fine in the end. I think this should just be very clearly documented in the CommonBandsWorkChain saying that it can only ever support a RemoteData that was run with the same code as it is going to use.

if one really wants to reuse in some way the charge density, this might stil be OK - of course the plugin of the bands needs to know how to convert formats etc, but as long as it knows what to do, it can get the RemoteData of a different code without problems, I don't see a restriction with it

The problem here would not just be to make the RemoteData content compatible, but with the design of this PR, the protocol is supposed to retrieve its inputs from the original calculation that is obtained through the provenance from the RemoteData. But that is going to be code specific, and so won't work. In short, I don't think it is possible and to join different codes will have to go just through the structure.

bosonie commented 3 years ago

I think this should just be very clearly documented in the CommonBandsWorkChain saying that it can only ever support a RemoteData that was run with the same code as it is going to use.

Not only documented, by I will also add a sanity check in the implementation. Thanks @sphuber

The problem here would not just be to make the RemoteData content compatible, but with the design of this PR, the protocol is supposed to retrieve its inputs from the original calculation that is obtained through the provenance from the RemoteData. But that is going to be code specific, and so won't work. In short, I don't think it is possible and to join different codes will have to go just through the structure.

You are definitely right... However I do not see any implementation that would allow what we are trying. If I run a relaxation, then the inputs of the input generator are lost anyway. So, even with #235, and being able to translate the Density Matrix, how do I understand what other inputs I should use? Only way is that I wrote somewhere what protocol and so on I used for the relaxation. I believe this is beyond our scope, I would be happy just having a code agnostic WC that allows me to relax with one code and have bands with another, no matter how many steps are performed inside.

sphuber commented 3 years ago

I believe this is beyond our scope, I would be happy just having a code agnostic WC that allows me to relax with one code and have bands with another, no matter how many steps are performed inside.

I agree. Just to be clear, I think the design of this PR is the best solution and we should go with it. These comments are more to fully map out what the limitations of the design are and how the various workchains are intended to be used.

bosonie commented 2 years ago

From my side, and after few interaction with @sphuber, the implementation is now concluded, including the creation of a "code-agnostic" workchain that makes the relaxation first and then calculates the bands.

This workchain introduces a new general feature: exposing the inputs of an InputGenerator into a workchain. It requires the transformation of the accepted value of ports from simple python types (in InputGenerator) to Data Nodes (the only one accepted by the workchains).

The implementation does not support yet the possibility to make the relaxation with one code and the bands with another. Also it does not allow "overrides" at the moment. Separate PRs will treat this two issues. But at least this is a usable working implementation.

bosonie commented 2 years ago

The concept discussed in this PR have been implemented in several different PRs. Especially #252 and #254

aiidateam / aiida-common-workflows

Feature/alternative bands wc #242