`BasePwCpInputGenerator`: Remove superfluous validation

mbercx commented 1 year ago

In the prepare_for_submission script of the BasePwCpInputGenerator, there still is some validation for the pseudopotentials (see that the elements match those in the provided structure) as well as the FIXED_COORDS setting.

In general, we want to do all validation using validators of the appropriate port or port name space. In the case of these two inputs, however, this validation is already done by proper validators, so we can simply remove it.

Even in case the structure port is not in the name space because a wrapping work chain excluded it when exposing the inputs, the validation will still occur when the process is submitted in the steps of the work chain.

mbercx commented 1 year ago

@sphuber I just opened this while working on #922. Looking into this raised an interesting question regarding validation. Just a few notes to discuss:

Ideally validation should happen before a process is run, instead of excepting it, i.e. via a proper validator.
Validation via a validator is only possible if the input is provided (obviously).
Some inputs are only provided during runtime by work chains that wrap the process.

This has already led to issues in https://github.com/aiidateam/aiida-quantumespresso/pull/722. Here we wanted to check that a parent_folder is provided for PwCalculations where a restart from the charge density is needed (nscf/bands). This is a somewhat unique type of validation, since we want to check if a certain input is present. Our solution there was a bit imperfect because it requires every work chains that wraps the PwCalculation to override its validator to PwCalculation.validate_inputs_base. I think I may have a better approach for this now, see further below.

🤔 To `validator` or not to `validator`

Here I was wondering about a different conundrum. I wanted to remove the following code:

https://github.com/aiidateam/aiida-quantumespresso/blob/7f4c4a108b09c31cfc9ee252d7e8fbe574ea477f/src/aiida_quantumespresso/calculations/__init__.py#L216-L224

Since we are already validating this in a proper validator here:

https://github.com/aiidateam/aiida-quantumespresso/blob/7f4c4a108b09c31cfc9ee252d7e8fbe574ea477f/src/aiida_quantumespresso/calculations/__init__.py#L161-L165

And in accordance with [1], using a validator is preferable. However, I realised that this is only checked in case the structure input is actually provided. So now I was thinking about what to do.

My first inclination was that a wrapping work chain should take care of the validation. But this would of course lead to a lot of duplication of code. I.e. we don't want that the PwBandsWorkChain also has to check that the top-level structure elements all have a pseudo defined in each wrapped PwCalculation. This logic should be contained on the PwCalculation.

But following [1] and only using the validator means that the validation doesn't happen. At first I then thought that we should perhaps keep the code I'm removing here, so the excepted wrapping process still shows a more informative exception traceback. However, that isn't necessary, since when the wrapping work chain tries to submit the process, the "proper" validation still occurs (see the long traceback below).

Full Traceback

```Python ❯ verdi process report 121 2023-05-05 16:33:48 [8 | REPORT]: [121|PwRelaxWorkChain|on_except]: Traceback (most recent call last): File "/Users/mbercx/.virtualenvs/qe/lib/python3.9/site-packages/plumpy/process_states.py", line 228, in execute result = self.run_fn(*self.args, **self.kwargs) File "/Users/mbercx/.virtualenvs/qe/lib/python3.9/site-packages/aiida/engine/processes/workchains/workchain.py", line 314, in _do_step finished, stepper_result = self._stepper.step() File "/Users/mbercx/.virtualenvs/qe/lib/python3.9/site-packages/plumpy/workchains.py", line 295, in step finished, result = self._child_stepper.step() File "/Users/mbercx/.virtualenvs/qe/lib/python3.9/site-packages/plumpy/workchains.py", line 538, in step finished, result = self._child_stepper.step() File "/Users/mbercx/.virtualenvs/qe/lib/python3.9/site-packages/plumpy/workchains.py", line 295, in step finished, result = self._child_stepper.step() File "/Users/mbercx/.virtualenvs/qe/lib/python3.9/site-packages/plumpy/workchains.py", line 246, in step return True, self._fn(self._workchain) File "/Users/mbercx/project/qe/code/aiida-quantumespresso/src/aiida_quantumespresso/workflows/pw/relax.py", line 243, in run_relax running = self.submit(PwBaseWorkChain, **inputs) File "/Users/mbercx/.virtualenvs/qe/lib/python3.9/site-packages/aiida/engine/processes/process.py", line 534, in submit return self.runner.submit(process, **kwargs) File "/Users/mbercx/.virtualenvs/qe/lib/python3.9/site-packages/aiida/engine/runners.py", line 183, in submit process_inited = self.instantiate_process(process, **inputs) File "/Users/mbercx/.virtualenvs/qe/lib/python3.9/site-packages/aiida/engine/runners.py", line 169, in instantiate_process return instantiate_process(self, process, **inputs) File "/Users/mbercx/.virtualenvs/qe/lib/python3.9/site-packages/aiida/engine/utils.py", line 64, in instantiate_process process = process_class(runner=runner, inputs=inputs) File "/Users/mbercx/.virtualenvs/qe/lib/python3.9/site-packages/plumpy/base/state_machine.py", line 194, in __call__ inst.transition_to(inst.create_initial_state()) File "/Users/mbercx/.virtualenvs/qe/lib/python3.9/site-packages/plumpy/base/state_machine.py", line 339, in transition_to self.transition_failed(initial_state_label, label, *sys.exc_info()[1:]) File "/Users/mbercx/.virtualenvs/qe/lib/python3.9/site-packages/plumpy/processes.py", line 1003, in transition_failed raise exception.with_traceback(trace) File "/Users/mbercx/.virtualenvs/qe/lib/python3.9/site-packages/plumpy/base/state_machine.py", line 324, in transition_to self._enter_next_state(new_state) File "/Users/mbercx/.virtualenvs/qe/lib/python3.9/site-packages/plumpy/base/state_machine.py", line 384, in _enter_next_state self._fire_state_event(StateEventHook.ENTERING_STATE, next_state) File "/Users/mbercx/.virtualenvs/qe/lib/python3.9/site-packages/plumpy/base/state_machine.py", line 300, in _fire_state_event callback(self, hook, state) File "/Users/mbercx/.virtualenvs/qe/lib/python3.9/site-packages/plumpy/processes.py", line 329, in lambda _s, _h, state: self.on_entering(cast(process_states.State, state)), File "/Users/mbercx/.virtualenvs/qe/lib/python3.9/site-packages/plumpy/processes.py", line 683, in on_entering call_with_super_check(self.on_create) File "/Users/mbercx/.virtualenvs/qe/lib/python3.9/site-packages/plumpy/base/utils.py", line 29, in call_with_super_check wrapped(*args, **kwargs) File "/Users/mbercx/.virtualenvs/qe/lib/python3.9/site-packages/aiida/engine/processes/process.py", line 397, in on_create super().on_create() File "/Users/mbercx/.virtualenvs/qe/lib/python3.9/site-packages/plumpy/base/utils.py", line 16, in wrapper wrapped(self, *args, **kwargs) File "/Users/mbercx/.virtualenvs/qe/lib/python3.9/site-packages/plumpy/processes.py", line 750, in on_create raise ValueError(result) ValueError: Error occurred validating port 'inputs.pw': The `pseudos` specified and structure kinds do not match: {'mama'} vs {'Si'} 2023-05-05 16:33:48 [9 | REPORT]: [121|PwRelaxWorkChain|on_terminated]: remote folders will not be cleaned ```

So this long rant is basically to reconfirm that:

We should never validate by raising exceptions in the prepare_for_submission script.

Unless there is anything I'm missing? I.e. is there a use case where having validation in the prepare_for_submission script is preferable?

As a final note, the traceback could be made more succinct/clear. Perhaps we can have validators return a generic AiiDA ValidationError, which is caught in submit/run calls and simply returns a nicely formatted (cough using rich cough) message.

🔀 Dealing with complex validations

I already mentioned in the intro that sometimes validations can be tricky because of [2-3]. To repeat the example (repurposing the number refs here):

We want to check that a parent_folder is present when CONTROL.calculation is e.g. nscf.
The parent_folder will not be present in the inputs of e.g. the PwBandsWorkChain bands step, since it is only created on runtime.
The validation will hence always fail, raising a warning when the builder is put together using the get_builder_from_protocol() method.

Our solution at the time was to split up the validation in a base and full one, and override the validator. However, after seeing this nice example of how the port namespace can be inspected:

https://github.com/aiidateam/aiida-quantumespresso/blob/7f4c4a108b09c31cfc9ee252d7e8fbe574ea477f/src/aiida_quantumespresso/calculations/__init__.py#L152-L153

I now think it'd be better to check if the parent_folder is in the name space, and only do the restart validation in this case. Wrapping work chains that dynamically assign this input should exclude the parent_folder port when exposing the inputs of the PwBaseWorkChain anyways. I've implemented an example of this in https://github.com/aiidateam/aiida-quantumespresso/pull/927.

I'm going to write down these different use cases and how to best deal with them in a tutorial. Happy to have your input @sphuber!

sphuber commented 1 year ago

I fully agree with your thought process:

Ideally, all validation is done in validators not in prepare_for_submission.
Validators should use the ctx (i.e. PortNamespace that is passed) to ensure the relevant input port is still present, and otherwise skip validation of that port. This will make validation robust with respect to wrapping workchains excluding certain ports.
This approach will indeed remove certain validation at launch time of the outermost layer, but as you said, eventually each subprocess will be fully validated when it is launched.

Conclusion: ok to remove this validation in prepare_for_submission.

mbercx commented 1 year ago

Thanks @sphuber! Note that this does mean that the validation will only occur in the step where the wrapped process is launched, which means that potentially multiple calculations will have been run in the work chain before the user finds out that a provided input was wrong. In the case of the structure and pseudos, the work chain could have in principle check this. But I believe we both agree that trying to do this for each work chain is not sustainable, unless we can somehow automate this. That said, if the user is aware of get_builder_restart and caching, not much work will be lost.

More importantly though:

As a final note, the traceback could be made more succinct/clear. Perhaps we can have validators return a generic AiiDA ValidationError, which is caught in submit/run calls and simply returns a nicely formatted (cough using rich cough) message.

What do you think about this? Could we somehow make the traceback of validation failures more readable? 😇

sphuber commented 1 year ago

But I believe we both agree that trying to do this for each work chain is not sustainable, unless we can somehow automate this. That said, if the user is aware of get_builder_restart and caching, not much work will be lost.

mbercx commented 1 year ago

One final question: do you agree we should no longer add the PR number to the commit titles, since this information (and link) is already found easily when looking at the commit (see bottom left below), and the PR number is actually GitHub-specific?

Screenshot 2023-05-05 at 23 01 42

sphuber commented 1 year ago

Yeah, fine for me to start omitting those. We already transitioned to using commit hashes in the CHANGELOG right?

mbercx commented 1 year ago

We already transitioned to using commit hashes in the CHANGELOG right?

Yup! And with the PR number still easily findable on the commit page, there is no reason to keep them in the commit title I think.

aiidateam / aiida-quantumespresso