Closed rlange2 closed 4 years ago
Actually, I haven't thought about implementing some progress diagnostics simply as process classes. This is a great idea! It would be nice to show a simple example in the documentation.
That said, I think that it would be nice if we could provide another mechanism alongside for this purpose. Progress bars (and other runtime diagnostics such as profilers, logging, etc.) are rather independent of models, so it would make sense not to have to include them explicitly in models.
A clean and common approach would be to first implement a callback system and plug it into the driver classes (in drivers.py). A second step would then be to implement a progress bar.
We might get inspiration from those different sources:
I think we can reuse much of Dask's implementation of callbacks here (see https://github.com/dask/dask/blob/7b4f90f0a708cbb97ebb498c78f7931d82ec99db/dask/callbacks.py), e.g.,
def progress(runtime_context, store):
    print(runtime_context['step'])

my_callback = xsimlab.Callback(pre_step=progress, post_step=None)

# option 1
dataset.xsimlab.run(model=model, callbacks=[my_callback])

# option 2
with xsimlab.add_callbacks(my_callback):
    out_ds1 = dataset.xsimlab.run(model=model)
    out_ds2 = dataset.xsimlab.run(model=model2)

# option 3
my_callback.register()
out_ds1 = dataset.xsimlab.run(model=model)
out_ds2 = dataset.xsimlab.run(model=model2)
my_callback.unregister()
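To make the three options more concrete, here is a rough, standalone sketch of how such a callback mechanism could work, loosely following the pattern in Dask's callbacks module. Nothing here is xsimlab code: the Callback class, add_callbacks and the toy run function are all hypothetical stand-ins, and the loop in run only imitates what a driver would do.

```python
from contextlib import contextmanager


class Callback:
    """Hypothetical hook container, loosely modelled on Dask's Callback."""

    # Globally registered callbacks, picked up by every run (option 3).
    active = set()

    def __init__(self, pre_step=None, post_step=None):
        self.pre_step = pre_step
        self.post_step = post_step

    def register(self):
        Callback.active.add(self)

    def unregister(self):
        Callback.active.discard(self)


@contextmanager
def add_callbacks(*callbacks):
    """Register callbacks for the duration of a with block only (option 2)."""
    # Only undo registrations made here, so earlier ones stay in place.
    added = [cb for cb in callbacks if cb not in Callback.active]
    for cb in added:
        cb.register()
    try:
        yield
    finally:
        for cb in added:
            cb.unregister()


def run(nsteps, callbacks=()):
    """Toy stand-in for a driver loop; a real driver would run the model."""
    # Combine global callbacks with those passed explicitly (option 1).
    hooks = list(Callback.active.union(callbacks))
    store = {}
    for step in range(nsteps):
        runtime_context = {"step": step, "nsteps": nsteps}
        for cb in hooks:
            if cb.pre_step is not None:
                cb.pre_step(runtime_context, store)
        store["last_step"] = step  # placeholder for the actual model step
        for cb in hooks:
            if cb.post_step is not None:
                cb.post_step(runtime_context, store)
    return store
```

With this sketch, run(3, callbacks=[cb]) corresponds to option 1, wrapping calls to run in add_callbacks(cb) to option 2, and cb.register() / cb.unregister() around the calls to option 3. Keeping the hooks outside the model is the point: the model itself never needs to know that a progress bar exists.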
I wrote a process that indicates the progress of the simulation. I figured that, by means of the runtime decorator, it is possible to extract information about the number of iterations and the end of the simulation. The output is a simple command-line version of a progress bar. Of course, that means the progress bar only covers the iterative portion of the simulation, namely run_step and finalize_step, and includes neither initialize nor finalize. While it seems acceptable to neglect the former (the progress bar won't be displayed before run_step), the very last stage of the model may need additional time that is not considered here. I haven't noticed this in any of the given tutorials; however, I'm currently running a rather large model over a period of one billion years in steps of 20k years (whether that makes sense remains to be seen for the moment). This already takes quite long on my machine (~50 minutes), while finalize takes ~160 seconds. I feel that gives the wrong impression of the actual progress. It might be worth adding a separate output for finalize to make this clearer.
Since this is a process, it's up to the user to include it in their model. Unfortunately, I don't understand the internals of xsimlab well enough (yet) to come up with a proper implementation idea. My educated guess would be to write a separate progress bar class and pass it to the Model class, or make it a property of the Model class.
During all my tests, I haven't noticed the process contributing to overhead.
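To illustrate the idea of giving finalize its own output, here is a minimal sketch written without xsimlab so that it runs on its own. ProgressTracker and render_bar are hypothetical names; in an actual process, step_done would presumably be called from finalize_step and finalize_done from finalize, but that wiring is an assumption and is not shown.

```python
import sys


def render_bar(done, total, width=30):
    """Return a text bar for done out of total units (total must be > 0)."""
    filled = width * done // total
    percent = 100 * done // total
    return "[{}{}] {:3d}%".format("#" * filled, "." * (width - filled), percent)


class ProgressTracker:
    """Hypothetical tracker that labels the finalize stage separately."""

    def __init__(self, nsteps, stream=None):
        self.nsteps = nsteps
        self.done = 0
        self.stream = stream if stream is not None else sys.stdout

    def _draw(self, label, end=""):
        # "\r" returns the cursor to the start of the line so the bar is
        # redrawn in place; the label is padded so that a shorter label
        # fully overwrites a longer one.
        bar = render_bar(self.done, self.nsteps)
        self.stream.write("\r{} {:<13}{}".format(bar, label, end))
        self.stream.flush()

    def step_done(self):
        """Call once after each completed step."""
        self.done += 1
        # After the last step the bar is full, but the run is not over yet.
        label = "finalizing..." if self.done == self.nsteps else "running"
        self._draw(label)

    def finalize_done(self):
        """Call once the finalize stage has completed."""
        self._draw("done", end="\n")
```

The bar itself still only measures the iterative part, since the duration of finalize is not known in advance. The difference is that a full bar reads "finalizing..." rather than suggesting that the run is over.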
Edit: For the process, something close to the following might do the trick:
Also, I'm curious whether sys.stdout.write() would be preferred over print() in this context.
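As far as the output itself goes, the two look interchangeable: print() writes to sys.stdout by default, and its end and flush arguments cover what a manual write() followed by flush() does. A small sketch comparing them; the helper names draw_with_write, draw_with_print and captured are made up for this example.

```python
import io
import sys
from contextlib import redirect_stdout


def draw_with_write(text):
    # Explicit write: no newline is appended, so "\r" can redraw the line.
    sys.stdout.write("\r" + text)
    sys.stdout.flush()


def draw_with_print(text):
    # end="" suppresses the default newline; flush=True forces the output
    # to appear immediately instead of waiting in the buffer.
    print("\r" + text, end="", flush=True)


def captured(draw, text):
    """Return exactly what draw(text) sends to standard output."""
    buf = io.StringIO()
    with redirect_stdout(buf):
        draw(text)
    return buf.getvalue()
```

Here captured(draw_with_write, "50%") and captured(draw_with_print, "50%") return the same string, so the choice seems to be mostly a matter of style.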