Closed bosonie closed 3 years ago
In other words, there is no strict need to separate ProcessInputsIterator and GeneralIterator. The GeneralIterator alone would be enough for any use case. However the separation helps the documentation of the classes and will facilitate the discussion. I felt, in fact, that the approach of using _parameters_lookup keyword to decide the allowed parameters should be discussed separately compared to the actual iteration mechanism.
I feel like there is no need to create two different classes for this. The ProcessInputsIterator
is just a GeneralIterator
with an empty _params_lookup
. Even by just seeing the names, one would tell that GeneralIterator
is the parent class and ProcessInputsIterator
is a child class that makes the functionality more specific. So I would say that if you want to make two separate classes, ProcessInputsIterator
should inherit from GeneralIterator
and force that _params_lookup
is always empty.
But as I said, I think this adds complexity and uncertainty for a developer that wants to use it, as there could be just one class called BaseIterator
, whose default is anyway to be a ProcessInputsIterator
. I don't see the advantage of having two classes, could you try to convince me? :)
Note:
class MyIterator(ProcessInputsIterator):
pass
restricts any class that inherits from MyIterator
to only iterate over inputs. (unnecessarily, in my opinion)
I completely agree with you, in aiida_siesta there won't be a separation. However, as I already said, I decided to do like that to help the understanding for people in aiida_core and separate the discussion on how to manage supported iteration parameters. Your idea of the _params_lookup
is very cool, but there are millions more ways to reach the same scope. I just hope this structure will help the understanding...the _params_lookup
adds a layer of complexity that must be discussed separately. I couldn't myself exactly understand what _params_lookup
requires or can have as optional. For this I left to you the description in the docstring.
But even if you want to make the separation, I think GeneralIterator
is the generalization, and ProcessInputsIterator
is a specific use of it. So the inheritance should go the other way around.
In my opinion, GeneralIterator
should not impose that _params_lookup
is not empty. It is general and therefore it should be possible to use for any case. It should be ProcessInputsIterator
that inherits from GeneralIterator
and forces _params_lookup
to be empty.
I can write a good explanation of how _params_lookup
(or a different approach that you prefer) works, but there's one thing that is simple enough: by default, you can only iterate over inputs of the process.
If one creates workchains inheriting from GeneralIterator
(and remove the restriction on _params_lookup
), the available parameters can always be changed in higher level classes:
class MyIterator(GeneralIterator):
_process_class = ...
pass
class My2ndIterator(MyIterator):
_params_lookup = (
("Cool params", {...}),
)
class My3rdIterator(My2ndIterator):
_params_lookup = ()
this is not possible if you make the first one inherit from ProcessInputsIterator
because you think no one will never need other parameters and you have a GeneralIterator
that imposes restrictions.
Ok ok, I will make the change and leave just the general iterator. I will leave to you the duty to write properly the documentation. All the rest is good?? After the modifications I would like to merge.
For the little that I've seen of aiida, it seems like GeneralIterator
should be called BaseIterator
or BaseIteratorWorkchain
, or anything with the word Base
for consistency with the rest of the naming. What's your feeling on this?
Yes, I was actually wondering. I would go for BaseIterator
for the moment. I would instead leave BasicConverger
as it is. I imagine there can be more sophisticated converges where the convergence check is done differently.
Oh and also, I think one last thing to do if we are planning to propose this for aiida core is to make sure that the parameters of the iterator can be modified from within the class in an easy and simple way.
For example, imagine you create an equation of state workflow like the one you have:
class EOS(Iterator):
...
In this case, iterate_over
should not be required, since the workchain already has some defaults for the scales
parameter, which is the only one that you need. And somehow, the workchain should be able to inject scales
into the iterator (I'm sure the correct place to do it is in _parse_iterate_over
).
Can I give a go at converting the EOS workflow using the iterator? This would be the ultimate proof that the iterator is actually useful for developing (it takes away the boilerplate code to iterate and create batches in a consistent way, and since the features are centralized, an improvement to the BaseIterator
would be reflected on all its childs).
Shouldn't it be already possible without modifying the BaseIterator
??? you just have to define in _params_lookup
the support for the "scale", no??
Yes you can of course, maybe in another PR.
Unfortunately now I have to go out. I'll merge it tonight after copying the future_ into iterate.py and converge.py. Ok??
Shouldn't it be already possible without modifying the BaseIterator??? you just have to define in params_lookup the support for the "scale", no??
Yes, you can do that, but currently there's no way of saying: if no scales
are provided in iterate_over
, inject these default scales.
Unfortunately now I have to go out. I'll merge it tonight after copying the future into iterate.py and converge.py. Ok??
I won't be able to write the documentation until tomorrow, can it wait? :)
Ok. Listen, I'll finish this and merge. You can do the documentation in another branch starting from develop. For the equation of state, rewrite the current equation of state in another PR. Thanks!
:+1:
Reorganization and some improvements of the iterator and convergence workchains.
This PR, for the moment, consists in the creation of new files. However
future_iterate.py
andfuture_converge.py
will substituteiterate.py
andconverge.py
when we agree on the reorganization.The most relevant changes are:
The separation between
BaseIterator
andInputsIterator
have been removed. Now there are two abstract classes:ProcessInputsIterator
andGeneralIterator
. The reasons for this change are several. First of all, we agreed on the fact that there is no other use case forBaseIterator
a part fromInputsIterator
. In fact you always somehow iterate over inputs of aProcess
. Secondly, in the spirit of suggesting to include these classes inaiida_core
, I thought it was good to separate the actual iterator mechanism from the implementation that allows to decide which parameters (to iterate over) are allowed. In other words, there is no strict need to separateProcessInputsIterator
andGeneralIterator
. TheGeneralIterator
alone would be enough for any use case. However the separation helps the documentation of the classes and will facilitate the discussion. I felt, in fact, that the approach of using_parameters_lookup
keyword to decide the allowed parameters should be discussed separately compared to the actual iteration mechanism.The method
process_input_parsing_func
(onceinput_key_and_parse_func
) is now a classmethod. This will provide the users with a tool to check in advance which parameters to iterate over are allowed and how. Moreover it is really useful for the sequential converger as it overcomes the previous need to copy_parameter_lookup
of_process_class
into an attribute of the sequential converger. In general I think theprocess_input_parsing_func
as the central quantity to manage the supported iteration parameters.I (hopefully) improved a bit the management of situations where a user wants to define a
namespace
. This is a very common use case. One thing we have be careful about though. In my implementation, the keys ofiterate_over
that are recognized as process inputs (don't need a parsing function) are specified with no namespace. To clarify, the correct submission is this, where initerate_over
we don't putsiesta.parameter
as key:Implemented some checks on
_process_class
that essentially define the class as abstract. In the new implementation I don't have any abstract method, but only abstract attributes. The way to manage abstract attributes is not clear to me and maybe not even in the community. See here: https://stackoverflow.com/questions/2736255/abstract-attributes-in-python. I just followed what is implemented in aiida_core for theBaseRestartWorkChain
.Changed some names (hope they are more clear), made private all the methods that are not classmethods or in the outline, improved the documentation.
This is my proposal for the implementation of our iterator and converger, @pfebrer96 I look forward for your comments. We can revert to your previous implementation anything that you don't find improved. I also ask your help to clearly document the use of _parameters_lookup (see docstring of
GeneralIterator
). I didn't do much to transform the Converger in an actual plugin. On that I think the AiiDA community can give us a great help. I'm happy I spent time to dig deep in your implementation, this is the most important point. Then, as I said, we can discuss what to bring back to your implementation. In general, if we decide to make a pull request toaiida_core
. We will surely receive some useful comments and help from the community.