Parallel-in-Time / pySDC

pySDC is a Python implementation of the spectral deferred correction (SDC) approach and its flavors, esp. the multilevel extension MLSDC and PFASST.
http://www.parallel-in-time.org/pySDC
BSD 2-Clause "Simplified" License

Default parameters for problem classes #206

Closed brownbaerchen closed 1 year ago

brownbaerchen commented 1 year ago

As pySDC is a prototyping code, more often than not we don't care about the precise setup of a problem. Instead, we just want something that we can use to test the order of a scheme or something like that. This could be greatly accelerated by providing default parameters for the problems.

I am unclear about what the best approach for this is. My first proposal would be to add a class method called something like get_default_description, which sets up an entire description object such that you can get started immediately. Since there is usually only one sweeper class allowed for each problem, it makes sense to include more parameters than just the problem parameters.

But what about the quadrature? I guess this is a point where it is difficult to set up a default that makes sense for everybody. I am still in favour of setting up something that works and then changing it, rather than having to set up everything every time. For instance, I might want to use a fixed number of iterations while you might not, but we may be able to agree on a type of quadrature nodes.

A different option would be to have the class method ask for a dictionary of quadrature parameters including residual tolerance, step size and iteration count, which are typically spread across different parts of the description object. Of course, I want to leave them where they are, but this could be sorted out automatically in this class method.

I am unsure whether we need defaults for the controller parameters, because these already have defaults that are fairly problem-unspecific. This means you only have to set up a hook if you want to have the solution after the step, but the hook is probably the part that the fewest people will agree on anyway, since we all want to extract different information for our tests. For instance, in other hooks I have seen, people usually store the solution at the time corresponding to the beginning of the step, but I always store it at the end of the step, since the plots otherwise look messed up when you use adaptive step sizes.

So I suggest adding this to the core problem class:

@classmethod
def get_default_description(cls, quadrature_params):
    # check that the quadrature_params contain everything that we need
    required_keys = ['maxiter', ...]
    for k in required_keys:
        assert k in quadrature_params.keys(), f'Need "{k}" in quadrature_params to setup defaults'

    # initialise a dictionary for each of the entries of the description class
    step_params = dict()
    ...

    # allocate the quadrature params to where they belong in the description
    step_params['maxiter'] = quadrature_params['maxiter']
    ...

    # collect the sub-dictionaries in the description
    description = dict()
    description['step_params'] = step_params
    ...

    return description

Then the individual problem classes would inherit and add their problem parameters on top:

@classmethod
def get_default_description(cls, quadrature_params):
    description = super().get_default_description(quadrature_params)
    description['problem_params']['power_level'] = 9001
    return description
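
As a usage sketch (MyProblemClass stands for any concrete problem class, and the keys 'dt' and 'restol' are hypothetical placeholders for the step size and residual tolerance mentioned above), a test script could then look roughly like this:

# hypothetical usage: grab a ready-made description and only override what the test needs
quadrature_params = {'maxiter': 5, 'dt': 1e-2, 'restol': 1e-8}
description = MyProblemClass.get_default_description(quadrature_params)
description['problem_params']['power_level'] = 9000  # tweak a single default
# ... then pass the description to the controller as usual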

Any problems with this / better ideas welcome!

tlunet commented 1 year ago

I do agree with this idea of explicit default parameters, and I think we already discussed it with @pancetta during the pySDC hackathon last year.

However, I wouldn't hide this in some class method, so that people have to look at its implementation to know what the default parameter values are. Always relying on object-oriented programming in Python does not necessarily simplify things, as there are many non-OOP approaches that may be better suited.

For instance, dictionaries :smile:. The way I see things here: every class (base or inherited) is defined in one module, so at the top of the module file (before the class definition) there could be a dictionary named DEFAULT that contains all the default parameter values (and, at the same time, all the parameters used to define the class), e.g.:

DEFAULT = {
    "par1": [default],  # description
    "par2": [default],  # description
    ...
}  
# PS : the use of upper-case letters for the variable name follows the convention for macros, e.g. in C++

class Problem(object):
    def __init__(self, params):
        # Two lines to set the parameters
        for key, default in DEFAULT.items():
             setattr(self, key, params.get(key, default))
        # Do other pre-processing stuff, like allocating arrays, generating matrices, etc ...
        # ...
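
To make the pattern concrete, here is a fully self-contained toy version (the parameter names "nvars" and "nu" are made up for illustration):

DEFAULT = {
    "nvars": 128,  # number of degrees of freedom (made-up example)
    "nu": 0.1,     # diffusion coefficient (made-up example)
}

class Problem(object):
    def __init__(self, params):
        # fall back to the DEFAULT value for every parameter not given in params
        for key, default in DEFAULT.items():
            setattr(self, key, params.get(key, default))

prob = Problem({"nvars": 256})
print(prob.nvars, prob.nu)  # -> 256 0.1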

Now a few additional points ...

Not putting every parameter in the same basket

In many classes, some parameters are mandatory (e.g. a time-step size), while others can have a default value. The mandatory parameters should not have a default value in DEFAULT, but they can have a None value:

DEFAULT = {
    "par01": None, # description
    "par02": None, # description
    "par1": [default],  # description
    "par2": [default],  # description
    ...
}  

class Problem(object):
    def __init__(self, params):
        for key, default in DEFAULT.items():
             setattr(self, key, params.get(key, default))  # if default is None, raise an error if not given in params
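
A minimal, self-contained sketch of the mandatory check hinted at in the comment, assuming None is reserved as the marker for mandatory parameters ("dt" and "nvars" are made-up names):

DEFAULT = {
    "dt": None,     # mandatory
    "nvars": 128,   # optional, with default
}

class Problem(object):
    def __init__(self, params):
        for key, default in DEFAULT.items():
            if default is None and key not in params:
                raise TypeError(f'mandatory parameter "{key}" is missing in params')
            setattr(self, key, params.get(key, default))

prob = Problem({"dt": 0.1})   # fine, nvars falls back to 128
# Problem({"nvars": 64})      # would raise TypeError: mandatory parameter "dt" is missing in params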

Genericity VS clarity

Here every parameter is centralized in one specific place, so one just has to look at the content of DEFAULT to see all of them. The use of the params argument (a dictionary) allows passing additional parameters that could be used by another class through the attributes (e.g. the controller passing arguments to a level class, etc ...). But the main drawback is that the constructor signature does not show the information given in DEFAULT. That's why in Python one tends to rely more on positional/named arguments, plus the **kwargs argument to allow supplementary parameters not associated with the class, e.g.:

class Problem(object):
    def __init__(self, par01, par02, par1=[default], par2=[default], **kwargs):
        """ Some doc describing the parameters ..."""
        params = locals(); params.pop('self'), params.pop('kwargs')
        for key, val in params.items():
             setattr(self, key, val)
        # Eventually pass kwargs to other classes ...

Some advantages of this approach :

  1. the signature is easily accessible and provides the default values + which parameters are mandatory
  2. description is written nearby and automatically accessible in many IDEs
  3. only need to change the signature to change parameter list, default value, etc ...

:bell: the params dictionary can eventually be registered as an attribute of the class, if one does not want to have each parameter as a separate attribute (it could be a frozen dictionary too, if one looks for privacy, even if this is not really the main philosophy in Python, and I do believe it would make pySDC more complex ... :wink:)
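
As a sketch of that "frozen dictionary" idea, the standard library offers MappingProxyType, which gives a read-only view rather than a true frozen dict (parameter names below are placeholders):

from types import MappingProxyType

class Problem(object):
    def __init__(self, par01, par02, par1=1, par2=2, **kwargs):
        params = locals(); params.pop('self'); params.pop('kwargs')
        # store everything in a single read-only mapping instead of one attribute per parameter
        self.params = MappingProxyType(dict(params))

prob = Problem(par01='a', par02='b')
print(prob.params['par1'])  # -> 1
# prob.params['par1'] = 5   # would raise TypeError: item assignment not supported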

And what about the children

The DEFAULT approach is also compatible with inheritance, using the update method of dictionaries, e.g. in a module implementing an inherited class:

from pySDC.core.Problem import Problem, DEFAULT as DEFAULT_BASE

DEFAULT = {
    # Parameters definition, some could also be the same as in DEFAULT_BASE with different default value
}
DEFAULT = {**DEFAULT_BASE, **DEFAULT}  # overwrite base defaults with local ones, eventually add child-specific parameters

# Define the children class below ...

This can also be done using the named/keyword argument approach:

from pySDC.core.Problem import Problem

class MyCoolProblem(Problem):
    def __init__(self, par11, par12, par21=[default], par22=[default], **kwargs):
        """ Some doc describing the parameters ..."""
        params = locals(); params.pop('self'), params.pop('kwargs')
        super().__init__(**kwargs, **params)
        for key, val in params.items():
             setattr(self, key, val)
        # Eventually pass kwargs to other classes ...

In this example, parameters specific to MyCoolProblem must be given in kwargs and are not specified in the signature of the constructor. But if some are common (e.g. par22 is actually the same as par2 in Problem, but with a different default value), then since params is also passed to the parent constructor, the child's default value will be used. And if a mandatory parameter required by the base Problem class is not given in kwargs, the base class will raise the error. Also, the parameters in the child constructor are those of the child class only, so we don't get a long trail of mandatory arguments from the base class that one would have to rewrite every time a child class is defined.
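
A tiny self-contained example of that override behaviour (parameter names made up):

class Problem(object):
    def __init__(self, par01, par1=1, **kwargs):
        self.par01, self.par1 = par01, par1

class MyCoolProblem(Problem):
    # par1 is shared with the base class, but gets a different default here
    def __init__(self, par11=10, par1=42, **kwargs):
        super().__init__(par1=par1, **kwargs)
        self.par11 = par11

p = MyCoolProblem(par01='mandatory value')
print(p.par01, p.par1, p.par11)  # -> mandatory value 42 10
# MyCoolProblem()  # would raise TypeError: the base class complains about the missing par01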

brownbaerchen commented 1 year ago

I kind of like the approach of putting the problem parameters as arguments in the problem's __init__ function. After all, the only extra modification you have to make is in the level class where the problem is instantiated: you have to put ** before the argument problem_params, right? Enabling IDEs to show the possible arguments is definitely quite useful! And with this cool locals() thing that you are doing, I don't think this would lead to too much mess...
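
In other words, roughly this kind of change at the spot where the level builds the problem (a sketch, not the actual pySDC source):

# before: the problem receives the whole dictionary as a single argument
prob = problem_class(problem_params)
# after: the dictionary is unpacked into keyword arguments matching __init__'s signature
prob = problem_class(**problem_params)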

The only annoyance I see is that you can't quite use lists and stuff like that in the function definition, so you have to do that below and you end up not seeing the default in the IDE, but whatever. It's better than nothing.

tlunet commented 1 year ago

The only annoyance I see is that you can't quite use lists and stuff like that in the function definition, so you have to do that below and you end up not seeing the default in the IDE, but whatever. It's better than nothing.

What do you mean by "can't quite use lists and stuff"? Providing a list for one of the parameters, or providing a list of arguments of variable size?

brownbaerchen commented 1 year ago

I mean using a list as default parameter in the function definition, i.e.

def function(a = [1, 2]):

is forbidden I think. So you have to do something like

def function(a=None):
    a = [1, 2] if a is None else a

or whatever.

tlunet commented 1 year ago

you have to put ** before the argument problem_params, right?

Probably. This ** operator is used to unpack a dictionary. For instance, if you have a function of the form

def func(a,b,c):
    pass

Then you can pass the parameters like this :

param = {'a': 1, 'b': 2, 'c': 3}  # equivalent to : param = dict(a=1, b=2, c=3)
func(**param)  # => func(a=1,b=2,c=3)

This works the same way for lists or tuples, which can be given as positional arguments using the * operator:

args = (1,2,3)  # or args = [1,2,3]
func(*args)  # func(1,2,3)

You can also combine both operators, with the only rule that positional arguments are always placed before named arguments, for instance:

args = (1,2) 
kwargs = {'c': 3}
func(*args, **kwargs)  # func(1,2,c=3)

As a matter of fact, the most generic signature of a function in Python is something like this:

def function(a,b,c,*args, d=1, e=3, **kwargs):
    print(locals())

In this example, the function takes at least 3 positional arguments and 2 named arguments, both of which have default values.
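
For example, the following call shows where each value ends up (the extra positional values land in args, the unknown keyword argument in kwargs):

function(1, 2, 3, 4, 5, e=7, f=8)
# prints the locals: a=1, b=2, c=3, d=1 (default kept), e=7, args=(4, 5), kwargs={'f': 8}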

tlunet commented 1 year ago

I mean using a list as default parameter in the function definition, i.e.

def function(a = [1, 2]):

is forbidden I think. So you have to do something like

def function(a=None):
    a = [1, 2] if a is None else a

or whatever.

Why forbidden? The following definition works:

def function(a=[1, 2]):
    pass

You can even have dictionaries, tuples, objects, classes, combinations of those, ... The catch is whether you use a mutable default (e.g. a list) or an immutable default (e.g. a tuple). If you use a list as default value, then you can modify it within the function call, for instance:

def f1(a=[1,2]):
    a += [0]
    print(a)

f1()  # => [1, 2, 0]
f1()  # => [1, 2, 0, 0]
f1()  # => [1, 2, 0, 0, 0]

because the default value is just registered somewhere in the function's hidden attributes, and since it is a mutable object it can be modified, similarly to using a reference or pointer in C/C++.
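
That hiding place can actually be inspected: the defaults live in the function's __defaults__ attribute, so after the three calls above one finds the mutated list there:

print(f1.__defaults__)  # -> ([1, 2, 0, 0, 0],) : the very same list object, grown by every call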

But if you consider a tuple as default value:

def f1(a=(1,2)):
    a += (0,)
    print(a)

f1()  # => (1, 2, 0)
f1()  # => (1, 2, 0)
f1()  # => (1, 2, 0)

since a is an immutable object, the a += operation just assigns a new object to a variable local to f1, and the default value is not modified. This would work similarly if we were using a list as default value like this:

def f1(a=[1,2]):
    a = a.copy()
    a += [0,]
    print(a)

f1()  # => [1, 2, 0]
f1()  # => [1, 2, 0]
f1()  # => [1, 2, 0]

:scroll: Same idea with dictionaries, sets, classes and objects (mutable) vs. frozen sets, frozen dictionaries, floats, ints, ... (immutable)

tlunet commented 1 year ago

PS : some more details on this argument thing in Python

brownbaerchen commented 1 year ago

I don't know about black and flakeheaven, but flake8 used to prohibit me from having mutable default arguments in functions to protect me from unknowingly running into this behaviour. And since it really is somewhat unintuitive, it should be avoided in my opinion.

tlunet commented 1 year ago

Then using a tuple should work. The use of None is also sometimes a recommended approach, but it has the drawback of hiding the default value within the function implementation. And a mutable default can also be of use, for instance when storing timing results without using a wrapper ... but it should indeed be used carefully and knowingly :sweat_smile:

tlunet commented 1 year ago

Btw, I would suggest writing somewhere a generic template for the base class and the inherited ones, which includes how the defaults are defined, how the parameters are stored, etc ... This could be a markdown file with some source code in it, and we could iterate on it to simplify it as much as possible / assert its genericity / improve its development potential ... and check whether we can apply it to the current version without too much trouble.

Would you try a starting PR for that @brownbaerchen ?

pancetta commented 1 year ago

In case you wonder, I appreciate your initiative and would be glad to see this in action. The dictionary things were there from the very start and were easier for me to get going, but I agree that they make life harder than needed. The thing is: I just don't have enough time to rework this...

tlunet commented 1 year ago

Maybe you can leave it to the younger (and eager) generation :wink:. I personally would not mind working on this from time to time with @brownbaerchen. But maybe this would be easier if we got developer access (without write rights to master) and worked on a v6 branch? Or do you prefer that we stay on a fork of one of us for this, @pancetta?

pancetta commented 1 year ago

The team now has write access to the repository. :scream:

tlunet commented 1 year ago

Closed as solved by #285