xarray-contrib / xarray-simlab

Xarray extension and framework for computer model simulations
http://xarray-simlab.readthedocs.io
BSD 3-Clause "New" or "Revised" License

WIP: Callback system and progress bar #93

Closed rlange2 closed 4 years ago

rlange2 commented 4 years ago

The idea behind this PR is to establish a callback system that supports callback mechanisms, e.g. for a progress bar, runtime diagnostics... For now, this is a work in progress and it is not functioning yet. I would rather use this PR to implement the feature step by step. However, I hope the code gives a good impression of what it is trying to accomplish. The basic concept is implemented in drivers.BaseSimulationDriver. self._start, self._prestep and self._stop (_finish might be more fitting) should later be used to get information about the current run (e.g. in the class ProgressBar, but generally in any callback mechanism that will be implemented). I think my biggest issue here is how to 'link' these instance variables to any simulation that is running. This becomes clear when looking at drivers.ProgressBar. Currently, they don't serve any real purpose and I'm not even sure they can get the needed information, since all of that is set up in drivers.XarraySimulationDriver. I'm also not sure how finely the individual stages should be defined, since any run follows this concept:

I don't intend to change the overall modelling logic, so instead of tracking which stage is running, finished or waiting to run (as I was trying to do with the steps dictionary in _update_bar()), we might also use step data from ds_gby_steps (returned by _get_runtime_datasets()), though these might not be helpful regarding initialize and finalize. To summarise, I think what I'm missing is a data type that I can use to retrieve information. I can imagine that it is already there and I'm simply not aware of it.

Documentation is missing for now since I expect that this routine will change quite a lot in the future.

Lastly, some callables might be better suited to xsimlab.drivers but I'm afraid that would lead to circular imports.

This might take me some time, so I welcome any proposals and criticisms :)

benbovy commented 4 years ago

Thanks @rlange for opening this.

Some comments:

def event(runtime_context, store):
    # ...

runtime_context and store should both be instances of some FrozenMapping class that needs to be implemented. Here, "frozen" basically means "read-only", as changing runtime data or simulation data from the callbacks is a bad idea. This link is a good basis for implementing a FrozenMapping wrapper around a mutable mapping.
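A read-only wrapper of this kind could be sketched as follows. This is only a minimal illustration, not an existing xarray-simlab class: the name FrozenMapping comes from the comment above, while the constructor argument and the example keys (step, step_delta) are assumptions.

```python
from collections.abc import Mapping


class FrozenMapping(Mapping):
    """Read-only view over a mutable mapping (hypothetical sketch)."""

    def __init__(self, wrapped):
        # Keep a reference, not a copy, so later updates stay visible.
        self._wrapped = wrapped

    def __getitem__(self, key):
        return self._wrapped[key]

    def __iter__(self):
        return iter(self._wrapped)

    def __len__(self):
        return len(self._wrapped)

    def __repr__(self):
        return f"{type(self).__name__}({self._wrapped!r})"


# Hypothetical runtime data; the keys are illustrative only.
runtime_data = {"step": 0, "step_delta": 0.5}
runtime_context = FrozenMapping(runtime_data)

# The driver can still update the underlying dict; callbacks see the change.
runtime_data["step"] = 1
current_step = runtime_context["step"]

# Mapping defines no __setitem__, so writes through the wrapper fail.
try:
    runtime_context["step"] = 99
    write_rejected = False
except TypeError:
    write_rejected = True
```

The standard library's types.MappingProxyType offers similar read-only behaviour for cases where a custom class is not needed.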

benbovy commented 4 years ago

Compared to Dask's tasks, in xarray-simlab we might end up with many possible events resulting from a combination of

There are thus 16 possible events. With some imagination I'm sure we can find relevant use cases for every one of them. That's certainly too many to hard-code as individual events in the class Callback.

Alternatively, the signature for directly creating a Callback instance from callables may look like:

def func1(runtime, store):
    # ...

def func2(runtime, store):
    # ...

clb_funcs = {
    ('initialize', 'model', 'pre'): func1,
    ('run_step', 'process', 'post'): func2
}

clb = Callback(funcs=clb_funcs)

So events are specified using a ('stage', 'model_vs_process', 'pre_vs_post') tuple.
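To make the count of 16 concrete, the tuples can be enumerated as in the sketch below. The component values are assumptions: 'initialize', 'run_step', 'model', 'process', 'pre' and 'post' appear in the examples above, while 'finalize_step' and 'finalize' are assumed here as the remaining stages. The helper validate_event is hypothetical.

```python
from itertools import product

# Assumed values for each component of the event tuple.
STAGES = ("initialize", "run_step", "finalize_step", "finalize")
LEVELS = ("model", "process")
TRIGGERS = ("pre", "post")

# Every (stage, level, trigger) combination: 4 * 2 * 2 = 16 events.
ALL_EVENTS = list(product(STAGES, LEVELS, TRIGGERS))


def validate_event(event):
    """Return the event tuple unchanged, or raise if it is not a known one."""
    if tuple(event) not in ALL_EVENTS:
        raise ValueError(f"unknown event: {event!r}")
    return tuple(event)


n_events = len(ALL_EVENTS)
```

Validating keys up front would let a Callback reject a misspelled stage name when it is created instead of silently never firing.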

Subclassing Callback may look like (using a decorator):

class MyCallback(Callback):

    @event('initialize', 'model', 'pre')
    def meth1(self, runtime, store):
        # ...

    @event('run_step', 'process', 'post')
    def meth2(self, runtime, store):
        # ...

benbovy commented 4 years ago

If we use a decorator to specify events, we could even support:

@event('initialize', 'model', 'pre')
def func1(runtime, store):
    # ...

@event('run_step', 'process', 'post')
def func2(runtime, store):
    # ...

clb = Callback(funcs=[func1, func2])
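One way the decorator and the Callback class could fit together is sketched below. It is only an illustration of the idea discussed here, not the xarray-simlab implementation: the _event_key attribute, the fire method and the StepCounter subclass are all hypothetical names introduced for this example.

```python
def event(stage, level, trigger):
    """Tag a function with the event tuple it should respond to."""
    def decorator(func):
        func._event_key = (stage, level, trigger)  # hypothetical attribute
        return func
    return decorator


class Callback:
    def __init__(self, funcs=()):
        self._registry = {}
        # Plain functions passed in explicitly.
        for func in funcs:
            self._register(func)
        # Decorated methods defined on subclasses (looked up as bound methods).
        for name in dir(self):
            attr = getattr(self, name)
            if callable(attr) and hasattr(attr, "_event_key"):
                self._register(attr)

    def _register(self, func):
        self._registry.setdefault(func._event_key, []).append(func)

    def fire(self, stage, level, trigger, runtime, store):
        """Call every function registered for this event (hypothetical API)."""
        for func in self._registry.get((stage, level, trigger), []):
            func(runtime, store)


log = []

@event("initialize", "model", "pre")
def func1(runtime, store):
    log.append("func1")

@event("run_step", "process", "post")
def func2(runtime, store):
    log.append(f"func2 at step {runtime['step']}")

clb = Callback(funcs=[func1, func2])
clb.fire("initialize", "model", "pre", {"step": 0}, {})
clb.fire("run_step", "process", "post", {"step": 3}, {})


class StepCounter(Callback):
    def __init__(self):
        self.count = 0
        super().__init__()

    @event("run_step", "model", "post")
    def count_step(self, runtime, store):
        self.count += 1


counter = StepCounter()
for step in range(2):
    counter.fire("run_step", "model", "post", {"step": step}, {})
```

In this sketch the simulation driver would be responsible for calling fire at each stage; events with no registered function are simply skipped.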