Closed mbercx closed 1 year ago
@sphuber I just opened this while working on #922. Looking into this raised an interesting question regarding validation. Just a few notes to discuss:
This has already led to issues in https://github.com/aiidateam/aiida-quantumespresso/pull/722. Here we wanted to check that a `parent_folder` is provided for `PwCalculation`s where a restart from the charge density is needed (`nscf`/`bands`). This is a somewhat unique type of validation, since we want to check if a certain input is present. Our solution there was a bit imperfect because it requires every work chain that wraps the `PwCalculation` to override its validator to `PwCalculation.validate_inputs_base`. I think I may have a better approach for this now, see further below.
`validator` or not to `validator`
Here I was wondering about a different conundrum. I wanted to remove the following code:
Since we are already validating this in a proper validator here:
And in accordance with [1], using a validator is preferable. However, I realised that this is only checked in case the `structure` input is actually provided. So now I was thinking about what to do.
My first inclination was that a wrapping work chain should take care of the validation. But this would of course lead to a lot of code duplication. I.e. we don't want the `PwBandsWorkChain` to also have to check that the top-level structure elements all have a pseudo defined in each wrapped `PwCalculation`. This logic should be contained in the `PwCalculation`.
But following [1] and only using the validator means that the validation doesn't happen when the wrapping work chain itself is launched. At first I thought that we should perhaps keep the code I'm removing here, so the excepted wrapping process still shows a more informative exception traceback. However, that isn't necessary, since when the wrapping work chain tries to `submit` the process, the "proper" validation still occurs (see the long traceback below).
So this long rant is basically to reconfirm that:
We should never validate by raising exceptions in the `prepare_for_submission` script.
Unless there is anything I'm missing? I.e. is there a use case where having validation in the `prepare_for_submission` script is preferable?
As a final note, the traceback could be made more succinct/clear. Perhaps we can have validators return a generic AiiDA `ValidationError`, which is caught in `submit`/`run` calls and simply returns a nicely formatted (cough using `rich` cough) message.
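To make the idea concrete, here is a minimal, self-contained sketch of what catching a generic validation error at launch time could look like. Everything in it is hypothetical: the `ValidationError` class, the `validate` stand-in and the `submit` wrapper are written for illustration only and are not AiiDA's actual launcher API. Plain string formatting is used where `rich` could be swapped in.

```python
class ValidationError(Exception):
    """Hypothetical generic error carrying the port name and the validator message."""

    def __init__(self, port: str, message: str) -> None:
        super().__init__(f"{port}: {message}")
        self.port = port
        self.message = message


def validate(inputs: dict) -> None:
    """Stand-in validator: raise if a (made-up) required input is missing."""
    if "pseudos" not in inputs:
        raise ValidationError("pseudos", "no pseudopotentials were provided")


def format_validation_error(error: ValidationError) -> str:
    """Condense the failure into a short report without a traceback."""
    return (
        "Input validation failed\n"
        f"  port:   {error.port}\n"
        f"  reason: {error.message}"
    )


def submit(inputs: dict) -> str:
    """Hypothetical launcher: validate first, report failures as a short message."""
    try:
        validate(inputs)
    except ValidationError as error:
        return format_validation_error(error)
    return "submitted"


print(submit({}))
print(submit({"pseudos": {"Si": "Si.upf"}}))
```

The point of the sketch is only that a single exception type gives the launcher one place to intercept failures and decide how to present them.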
I already mentioned in the intro that sometimes validations can be tricky because of [2-3]. To repeat the example (repurposing the number refs here):
1. We want to check that `parent_folder` is present when `CONTROL.calculation` is e.g. `nscf`.
2. `parent_folder` will not be present in the inputs of e.g. the `PwBandsWorkChain` `bands` step, since it is only created at runtime.
3. `get_builder_from_protocol()` method.

Our solution at the time was to split up the validation into a base and a full one, and override the validator. However, after seeing this nice example of how the port namespace can be inspected:
I now think it'd be better to check if the `parent_folder` is in the port namespace, and only do the restart validation in this case. Wrapping work chains that dynamically assign this input should exclude the `parent_folder` port when exposing the inputs of the `PwBaseWorkChain` anyway. I've implemented an example of this in https://github.com/aiidateam/aiida-quantumespresso/pull/927.
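The approach can be illustrated with a small sketch. It assumes the usual validator convention of receiving the input values plus the port namespace and returning an error message (or `None` when everything is fine). To keep it runnable without an AiiDA installation, a plain set of port names stands in for the real port namespace, and the function name and message wording are made up for the example; it is not the code from the linked PR.

```python
from typing import Any, Mapping, Optional

# Calculation types that restart from an existing charge density.
RESTART_CALCULATIONS = {"nscf", "bands"}


def validate_inputs(value: Mapping[str, Any], port_namespace: Any) -> Optional[str]:
    """Return an error message if a required ``parent_folder`` is missing, else ``None``.

    ``port_namespace`` only needs to support ``in``; a set of port names is
    used here as a stand-in for the real port namespace.
    """
    # A wrapping work chain excluded the port: it assigns the input at runtime,
    # so there is nothing to validate at this level.
    if "parent_folder" not in port_namespace:
        return None

    control = value.get("parameters", {}).get("CONTROL", {})
    calculation = control.get("calculation", "scf")
    if calculation in RESTART_CALCULATIONS and "parent_folder" not in value:
        return f"`parent_folder` is required for a `{calculation}` calculation."
    return None


nscf_inputs = {"parameters": {"CONTROL": {"calculation": "nscf"}}}

# Port still exposed: the missing input is reported.
print(validate_inputs(nscf_inputs, {"parameters", "parent_folder"}))
# Port excluded by a wrapping work chain: validation is skipped.
print(validate_inputs(nscf_inputs, {"parameters"}))
```

With this shape a wrapping work chain no longer has to override the validator; excluding the port when exposing the inputs is enough to switch the check off.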
I'm going to write down these different use cases and how to best deal with them in a tutorial. Happy to have your input @sphuber!
I fully agree with your thought process:
- Validation should be done in validators, not in `prepare_for_submission`.
- Validators should use the `ctx` (i.e. `PortNamespace` that is passed) to ensure the relevant input port is still present, and otherwise skip validation of that port. This will make validation robust with respect to wrapping workchains excluding certain ports.

Conclusion: ok to remove this validation in `prepare_for_submission`.
Thanks @sphuber! Note that this does mean that the validation will only occur in the step where the wrapped process is launched, which means that potentially multiple calculations will have been run in the work chain before the user finds out that a provided input was wrong. In the case of the structure and pseudos, the work chain could in principle have checked this. But I believe we both agree that trying to do this for each work chain is not sustainable, unless we can somehow automate this. That said, if the user is aware of `get_builder_restart` and caching, not much work will be lost.
More importantly though:
> As a final note, the traceback could be made more succinct/clear. Perhaps we can have validators return a generic AiiDA ValidationError, which is caught in submit/run calls and simply returns a nicely formatted (cough using rich cough) message.
What do you think about this? Could we somehow make the traceback of validation failures more readable? 😇
> But I believe we both agree that trying to do this for each work chain is not sustainable, unless we can somehow automate this. That said, if the user is aware of get_builder_restart and caching, not much work will be lost.
One final question: do you agree we should no longer add the PR number to the commit titles, since this information (and link) is already found easily when looking at the commit (see bottom left below), and the PR number is actually GitHub-specific?
Yeah, fine for me to start omitting those. We already transitioned to using commit hashes in the CHANGELOG right?
> We already transitioned to using commit hashes in the CHANGELOG right?
Yup! And with the PR number still easily findable on the commit page, there is no reason to keep them in the commit title I think.
In the `prepare_for_submission` script of the `BasePwCpInputGenerator`, there is still some validation for the pseudopotentials (checking that the elements match those in the provided `structure`) as well as the `FIXED_COORDS` setting.

In general, we want to do all validation using validators of the appropriate port or port namespace. In the case of these two inputs, however, this validation is already done by proper validators, so we can simply remove it.

Even in case the `structure` port is not in the namespace because a wrapping work chain excluded it when exposing the inputs, the validation will still occur when the process is submitted in the steps of the work chain.
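As a rough illustration of the kind of check that stays in a validator, the sketch below compares the pseudopotentials against the kinds of the structure and skips the check when the `structure` port is excluded or the input is absent. It is not the plugin's actual implementation: the function name is hypothetical, the structure is reduced to a list of kind names, and a set of port names stands in for the port namespace.

```python
from typing import Any, Mapping, Optional


def validate_pseudos(value: Mapping[str, Any], port_namespace: Any) -> Optional[str]:
    """Return an error message unless there is exactly one pseudo per structure kind."""
    # Without a structure (port excluded or input not provided) the check cannot
    # be made here; it is left to the validation of the wrapped process.
    if "structure" not in port_namespace or "structure" not in value:
        return None

    kinds = set(value["structure"])          # stand-in: a list of kind names
    pseudos = set(value.get("pseudos", {}))  # pseudos are keyed by kind name
    if kinds != pseudos:
        return (
            "The `pseudos` do not match the kinds of the structure: "
            f"missing {sorted(kinds - pseudos)}, unexpected {sorted(pseudos - kinds)}."
        )
    return None


namespace = {"structure", "pseudos"}
inputs = {"structure": ["Si", "O"], "pseudos": {"Si": "Si.upf"}}

# Reports the kind that has no pseudopotential.
print(validate_pseudos(inputs, namespace))
# The same inputs pass untouched when the `structure` port has been excluded.
print(validate_pseudos(inputs, {"pseudos"}))
```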