parallel progress bars - Githubissues

goldingn commented 6 years ago

Getting progress bars and other information in the console for big jobs running in parallel is something I've wanted for a looong time. It is possible to get GUI progress bars on Windows (using Tk), but this method apparently doesn't work on macOS/Linux, and doesn't print to the console.

It would be awesome if this functionality could be integrated with the future package, so that it can be used on any parallel backend the future API supports. It would be super awesome if we could enable export of the widely used progress bars in utils, and the swanky progress bars in progress.

There are technical hurdles around communication between processes and differences between operating systems, but it's definitely achievable. I've put together a gist¹ with a prototype that does this the dumb (but generalisable) way: writing progress information to tempfiles which are read by the main process:

library(future)
source("https://gist.githubusercontent.com/goldingn/d5a3aebfbc63eaadd92f0ff5ca811a5d/raw/12b552722020626e3f7014e1d9314266287acee0/parallel_progress.R")

foo <- function (n) {
  for (i in seq_len(n)) {
    update_parallel_progress(i, n)
    Sys.sleep(runif(1))
  }
  "success!"
}

plan(multiprocess)
future_replicate(4, foo(30))

[image: parallel_progress]
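For readers who don't want to open the gist, here is a minimal base-R sketch of the tempfile idea described above. It is not the gist's actual code: `write_progress()`, `read_progress()` and the `worker_<id>.txt` naming are hypothetical names chosen for illustration, and it assumes all workers share the main process's file system.

```r
# Hypothetical helpers illustrating the tempfile approach (not the gist's code).
progress_dir <- file.path(tempdir(), "progress_sketch")
dir.create(progress_dir, showWarnings = FALSE)

# Called inside a worker: record "i n" in a file owned by that worker.
write_progress <- function(worker_id, i, n, dir = progress_dir) {
  path <- file.path(dir, sprintf("worker_%s.txt", worker_id))
  tmp <- paste0(path, ".tmp")
  writeLines(paste(i, n), tmp)
  # Write to a temporary name first, then rename, so the reader is less
  # likely to see a half-written file.
  file.rename(tmp, path)
  invisible(path)
}

# Called in the main process: return the fraction complete for each worker.
read_progress <- function(dir = progress_dir) {
  files <- list.files(dir, pattern = "^worker_.*\\.txt$", full.names = TRUE)
  vapply(files, function(f) {
    vals <- scan(f, quiet = TRUE)
    vals[1] / vals[2]
  }, numeric(1), USE.NAMES = FALSE)
}

write_progress(1, 3, 10)
write_progress(2, 5, 10)
fractions <- read_progress()
```

The main process would call `read_progress()` in a polling loop while the futures are running and print whatever it finds.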

There are various ways this could be improved:

Printing progress bars rather than just a percentage (preferably just embedding bars from the utils and progress packages).
Sending progress information from processes running on another file system (e.g. remote servers²)
Handling more processes than threads
Handling sequential execution
Proper integration with the future package
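As a rough illustration of the first item, the polled fractions could drive a standard utils progress bar in the main process. This is only a sketch: `fractions_seen` is made-up data standing in for successive polls of two workers, and showing the mean across workers is one assumed way of summarising them.

```r
# Made-up polling results: each element is one poll of two workers.
fractions_seen <- list(c(0.1, 0.0), c(0.5, 0.4), c(1, 1))

# A standard utils text progress bar on a 0..1 scale.
pb <- utils::txtProgressBar(min = 0, max = 1, style = 3)
for (fr in fractions_seen) {
  # Summarise all workers as their mean completion.
  utils::setTxtProgressBar(pb, mean(fr))
}
final_value <- utils::getTxtProgressBar(pb)
close(pb)
```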

Related discussions:

Re. progress information in future, in which @HenrikBengtsson says he'd rather it were a separate package, and suggests using processx.

Re. multiple progress bars in progress - having progress bars on separate lines isn't trivial since not all consoles allow overwriting of more than one line of output.

Heads up to @HenrikBengtsson and @gaborcsardi, in case they know of progress on this topic that I'm not aware of!

¹ https://gist.github.com/goldingn/d5a3aebfbc63eaadd92f0ff5ca811a5d
² my main motivation for this is getting live console progress bars for greta jobs running on Google CloudML

goldingn commented 6 years ago

See also the future progress bar in furrr https://twitter.com/dvaughan32/status/985471691852742656

mpadge commented 6 years ago

This is a really important issue for me as well, and I've implemented the basis of the RcppParallel necessities too. This is also by "dumb" file dump, after spending several days poring through loads of TBB code and finding no better idea. A merge of both R and C++ approaches would likely be really helpful. An R function for C++ code injection should also be possible because it's just appended to the end of an RcppParallel::Worker::operator() call.

goldingn commented 6 years ago

Awesome! Yeah, it would be great if we could handle both R & C++ in the same package.

goldingn commented 6 years ago

The greta.live reference above relates to producing live-streaming plots of an ongoing analysis (MCMC in this case, but it could be anything). That would require very similar functionality, in having a master process providing information to the users, from log files written by the subprocesses.

Writing this parallel progress bar code in a general way would make similar problems like that easy to solve.

goldingn commented 6 years ago

Also, we could think about supporting progress bars for jobs running on remote machines (e.g. using future's plan(remote)) with the awesome looking rOpenSci ssh package

jcheng5 commented 6 years ago

For same machine scenarios, the file dump works and may make the most sense for the prototyping phase. But in case you're not aware of these, here are the usual suspects for message passing between different threads/processes:

Threads: queue + mutex + condition variable (need to get the latter two from a cross platform threading library, or require C++11). This would work well for RcppParallel and shouldn't be much work.
Process on POSIX: pipe()
Process on Windows: anonymous pipe
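To give a feel for the pipe option from the R side, here is a small sketch in which the main process launches a child R process and reads its progress lines through a pipe connection. Note the assumptions: R's `pipe()` is a connection wrapped around a shell command, not the raw POSIX `pipe()` call, and the "i n" line format and the child code are invented for this example.

```r
# Locate the Rscript binary that belongs to the running R installation.
rscript <- file.path(R.home("bin"), "Rscript")

# Child code (illustrative): print "i n" on its own line for each step.
child_code <- "for (i in 1:3) cat(i, 3, fill = TRUE)"
cmd <- paste(shQuote(rscript), "-e", shQuote(child_code))

# Open a read-only pipe to the child's standard output.
con <- pipe(cmd, open = "r")
fractions <- numeric(0)
repeat {
  line <- readLines(con, n = 1)   # waits for the next line from the child
  if (length(line) == 0) break    # no more output: the child has finished
  parts <- as.numeric(strsplit(line, " ")[[1]])
  fractions <- c(fractions, parts[1] / parts[2])
}
close(con)
```

Each read here waits for the child, so a real implementation would need some non-blocking or polling scheme to follow several workers at once.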

If you're not familiar with these and want to learn more I'm happy to elaborate in person next week. This project would really help Shiny's new integration with future/promises (though for Shiny we'd want callbacks in the main process instead of console messages, so we can send the progress to the browser). So would communicating cancellation requests from the master to the workers.
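In the same file-dump spirit as the prototype, a cancellation request could be as simple as a flag file that workers check between steps. This is only a sketch under the assumption of a shared file system; `request_cancel()` and `cancellable_task()` are hypothetical names, not part of future or any other package.

```r
# Path of the flag file; it does not exist until cancellation is requested.
cancel_file <- tempfile("cancel_")

# Master side: ask workers to stop by creating the flag file.
request_cancel <- function(path = cancel_file) {
  invisible(file.create(path))
}

# Worker side: check for the flag before each unit of work.
cancellable_task <- function(n, path = cancel_file) {
  for (i in seq_len(n)) {
    if (file.exists(path)) {
      return(sprintf("cancelled at step %d", i))
    }
    # ... one unit of real work would go here ...
  }
  "success!"
}

before <- cancellable_task(3)   # no flag yet, so the task runs to completion
request_cancel()
after <- cancellable_task(3)    # flag present, so the task stops immediately
```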

HenrikBengtsson commented 6 years ago

This is an excellent and important proposal. Here are some of my thoughts:

I think of progress bars as a special type of non-critical information that is communicated between workers and master. When implementing a framework for this, it is important to consider a few things. For instance,

some workers run on (i) the same machine, some on (ii) machines on the local network, and some on (iii) external remote machines. It could even be that we run on a mix of these.
communication may be done directly via pipes/connections, shared file systems, via databases, distributed message-passing frameworks (e.g. ZeroMQ), third-party services (e.g. Pushbullet, ...), etc. Zero, one, or more of these communication strategies may be supported on the end user's system - as developers we should never make hard assumptions on which ones are supported.
for some workers, we may not be able to retrieve any information until they terminate.
communication may time out / fail, which is ok and should be accounted for. It is important that such failures neither block nor break the parallel processing.
file updates on shared file systems are often delayed, e.g. it can take 5-30 seconds for a file update to be seen by other machines.
the software may run in multi-user, multi-process environments, which may result in port clashes (a port is already occupied by another process).

Since progress information is non-critical, it should also be optional, which the progress API should reflect/support. In all cases, we have information about when a "task" begins (= 0% progress) and when it finishes (= 100% progress). Any additional updates on progress in-between 0% and 100% are optional and receiving them should be considered a "bonus". I think this is helpful to keep in mind when designing a progress framework. If we require progress information to be available everywhere, or to always be queryable, we will also limit what type of parallel/distributed backends we can use.
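One way to read "optional and non-critical" in code: the task accepts an optional reporter callback, and any failure inside the reporter is swallowed so it can never break the computation. The names `report_progress()` and `run_task()` are hypothetical and only illustrate the principle; handling reporters that hang (rather than fail) would need a timeout on top of this.

```r
# Call the reporter, but never let its failure propagate to the task.
# Returns TRUE if the update was delivered, FALSE otherwise.
report_progress <- function(reporter, fraction) {
  tryCatch({
    reporter(fraction)
    TRUE
  }, error = function(e) FALSE)
}

# A task whose result does not depend on whether progress was reported.
run_task <- function(n, reporter = NULL) {
  total <- 0
  for (i in seq_len(n)) {
    total <- total + i
    if (!is.null(reporter)) report_progress(reporter, i / n)
  }
  total
}

# A reporter that works, one that always fails, and no reporter at all.
seen <- numeric(0)
recording_reporter <- function(fraction) seen <<- c(seen, fraction)
broken_reporter <- function(fraction) stop("connection lost")

result_ok <- run_task(5, recording_reporter)
result_broken <- run_task(5, broken_reporter)
result_silent <- run_task(5)
```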

I believe that a progress API should be defined in a standalone package. This package could implement some basic/common communication strategies, whereas other strategies may be implemented on top of this in other packages/backends.
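A sketch of what such a layering might look like: the core only knows that a backend is something with `send()` and `poll()`, and individual communication strategies supply those two functions. The interface and the names `memory_backend()`, `file_backend()` and `overall_progress()` are assumptions made for this example, not an existing API.

```r
# Strategy 1 (illustrative): keep progress in memory, for sequential runs.
memory_backend <- function() {
  store <- new.env()
  list(
    send = function(id, fraction) assign(as.character(id), fraction, envir = store),
    poll = function() {
      ids <- sort(ls(store))
      vapply(ids, function(id) get(id, envir = store), numeric(1))
    }
  )
}

# Strategy 2 (illustrative): one small file per worker in a shared directory.
file_backend <- function(dir = tempfile("progress_")) {
  dir.create(dir)
  list(
    send = function(id, fraction) {
      writeLines(format(fraction), file.path(dir, as.character(id)))
    },
    poll = function() {
      files <- sort(list.files(dir))
      vapply(files, function(f) as.numeric(readLines(file.path(dir, f))), numeric(1))
    }
  )
}

# Core code depends only on the send()/poll() interface.
overall_progress <- function(backend) {
  fractions <- backend$poll()
  if (length(fractions) == 0) 0 else mean(fractions)
}

results <- vapply(list(memory_backend(), file_backend()), function(b) {
  b$send("a", 0.25)
  b$send("b", 0.75)
  overall_progress(b)
}, numeric(1))
```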

wlandau commented 6 years ago

I am interested in the message queue piece specifically, so I am eager to see how this project turns out. https://github.com/r-lib/liteq seems like a portable solution, and https://github.com/eddelbuettel/prrd already uses it, but I am having trouble applying it (ref: https://github.com/r-lib/liteq/issues/17). Maybe something like rzmq will turn out to be more thread safe.

goldingn commented 6 years ago

Awesome awesomeness.

Thanks @jcheng5, I know nothing about this stuff so that was really helpful. I'd definitely be keen to spend a couple of days digging into the details!

@HenrikBengtsson good point re getting the level of generalisation of the framework right.

I think we could get a pure-R prototype with file dumps working pretty quickly (I did some more fiddling on the plane over, just a little polish required). So starting to build up a solid and generalisable framework for the backend might be quite feasible.

goldingn commented 6 years ago

gist with prototype of displaying multiple (fully customisable) progress::progress_bar bars on the same line of the terminal: https://gist.github.com/goldingn/e6521f4ed0fa1b6566b754950caaf518
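The same-line trick can be shown without the progress package. The sketch below is not the gist's code: `render_bars()` is a hypothetical base-R helper that draws one small bar per worker, and the loop redraws the line using a carriage return.

```r
# Build one line of text containing a small bar for each worker.
render_bars <- function(fractions, width = 10) {
  bars <- vapply(fractions, function(f) {
    f <- min(max(f, 0), 1)                 # clamp to [0, 1]
    filled <- round(f * width)
    paste0("[", strrep("=", filled), strrep(" ", width - filled), "]")
  }, character(1))
  paste(bars, collapse = " ")
}

# Redraw the same console line: "\r" moves the cursor back to the line start.
for (step in 1:4) {
  cat("\r", render_bars(c(step / 4, step / 8)), sep = "")
  flush.console()
  Sys.sleep(0.1)
}
cat("\n")
```

Because only a single line is ever overwritten, this sidesteps the problem that not all consoles can rewrite more than one line of output.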

goldingn commented 6 years ago

For anyone keeping an eye on this topic, we were working in this repo

Rather than incrementally build on the simple examples in this discussion, we decided to go for a more ambitious general framework. We nearly got it working during the unconf, though it's currently a brick. The architecture and core functionality are in there, it just needs a little careful plumbing, and some debugging to get it working. I'll try to get it to that state in the next week or two.

From that working prototype phase there are loads of ways it could be tidied up, made more usable and adapted to use different backends. If anyone is keen to get involved, please let us know over on the project repo.

ropensci / unconf18

parallel progress bars #23