stan-dev / cmdstanr

CmdStanR: the R interface to CmdStan
https://mc-stan.org/cmdstanr/

Progress bar #18

Open jgabry opened 5 years ago

jgabry commented 5 years ago

CmdStanPy has a nice progress bar that we can implement in R too.

We are currently using processx::run(..., echo=TRUE) so that the R user sees the output directly from CmdStan, but processx provides a lot more control than that and it looks like it has what we’d need behind the scenes for a progress bar even for chains in parallel.

jgabry commented 5 years ago

After discussing with @mitzimorris I found out that there may be some problems with the progress bar implementation in CmdStanPy, so maybe we shouldn't try to copy it until that's sorted out there.

Also, an alternative to a progress bar would be to run background processes and provide a way for the user to check the status. A progress bar isn't necessary for models that run fast, and maybe for slow models it's better to let them just run in the background and allow the user to check the status when they want? This is possible with the processx package we're already using.

Anyway, I haven't thought too hard about this yet so I'm definitely open to other ideas!

rok-cesnovar commented 4 years ago

So a progress bar would be fairly simple to implement after #61 is merged, provided that the user doesn't specify refresh=0. @jgabry do we still want to do this?

jgabry commented 4 years ago

I could go either way on this. The current output from parallelization is fine, although a progress bar could be nicer. What do you think?

rok-cesnovar commented 4 years ago

Do you have an example for an R package that uses progress bars? I am not completely sure how it would look. Worth a try tho.

jgabry commented 4 years ago

Yeah it's worth a try. Things to check out whenever you want to dive into this (definitely not super urgent though):

There are other packages for making progress bars but I haven't used them. I've heard it can be difficult to do progress bars for parallelized code in R, but I haven't gone down that rabbit hole so I don't know any details about what the challenges are unfortunately.

mike-lawrence commented 4 years ago

What do folks think of the approach taken by my package ezStan where I simply watch the contents of the CSV files to establish progress (inc. reporting post-warmup divergences and permitting termination on detection of these)?

rok-cesnovar commented 4 years ago

@mike-lawrence that sounds like a great idea.

So you essentially count the number of lines in the CSV? Do you read the entire CSV in a separate thread constantly for checking divergences?

mike-lawrence commented 4 years ago

So you essentially count the number of lines in the CSV? Do you read the entire CSV in a separate thread constantly for checking divergences?

Precisely. My package is a bit of a monstrosity of poorly-commented/hack-y code, so if this kind of progress indicator is of interest for cmdstanr I'd probably start from scratch. But it would be neat to finally contribute to the project beyond my forum activity, so I'm keen to do it.
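For anyone following along, here is a minimal sketch of the CSV-watching idea. It is written in Python purely for illustration (an actual cmdstanr implementation would be in R) and is not how ezStan does it. It assumes the usual Stan CSV layout: comment lines starting with #, one header row, then one row per saved draw, with a divergent__ column when NUTS is used. The function name summarize_stan_csv is hypothetical.

```python
import csv
import io

def summarize_stan_csv(text):
    """Count saved draws and divergent transitions in Stan CSV text."""
    complete = text.splitlines()
    if text and not text.endswith("\n"):
        complete = complete[:-1]  # the last row is still being written
    # Skip blank lines and comment lines (metadata, adaptation info, timing).
    rows = [ln for ln in complete if ln and not ln.startswith("#")]
    if not rows:
        return {"draws": 0, "divergent": 0}
    reader = csv.reader(io.StringIO("\n".join(rows)))
    header = next(reader)
    col = header.index("divergent__") if "divergent__" in header else None
    draws = divergent = 0
    for row in reader:
        draws += 1
        if col is not None and float(row[col]) != 0:
            divergent += 1
    return {"draws": draws, "divergent": divergent}
```

Whether the draw count includes warmup iterations depends on whether warmup draws are being saved, so a real monitor would need to know that setting to turn the count into a percentage.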

rok-cesnovar commented 4 years ago

Cool! I am definitely in favor.

Another option that would only allow showing progress (not divergences and other diagnostic realtime stuff) is to rely on stdout. If the progress bar is enabled then set refresh = 1. That means that we get information for each iteration.

If a user wants realtime reports for divergence/treedepth then we need to read the CSV.
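As a rough illustration of the stdout route, the sketch below parses one progress line into a fraction and a phase. It is Python for illustration only, and the line shape it assumes ("Iteration: 100 / 2000 [ 5%] (Warmup)", optionally prefixed with "Chain N") should be checked against what CmdStan and cmdstanr actually print. parse_progress is a hypothetical name.

```python
import re

# Assumed shape of a progress line; the "Chain N" prefix is optional.
PROGRESS = re.compile(
    r"(?:Chain\s+(?P<chain>\d+)\s+)?"
    r"Iteration:\s+(?P<done>\d+)\s*/\s*(?P<total>\d+)"
    r"\s*\[\s*\d+%\]\s*\((?P<phase>Warmup|Sampling)\)"
)

def parse_progress(line):
    """Return chain, fraction done, and phase, or None for other output."""
    m = PROGRESS.search(line)
    if m is None:
        return None
    done, total = int(m["done"]), int(m["total"])
    return {
        "chain": int(m["chain"]) if m["chain"] else None,
        "fraction": done / total,
        "phase": m["phase"].lower(),
    }
```

Lines that are not progress reports (timing estimates, warnings, and so on) return None, so the caller can pass them through or ignore them.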

mike-lawrence commented 4 years ago

Good idea, not monitoring the csv if the user hasn't asked for any csv-content-dependent behaviour will make things more resource-friendly during sampling. I'll have to teach myself how to monitor the stdout instead, but I suspect that's within my capacity.

rok-cesnovar commented 4 years ago

For sampling we have a process function with a "state machine" that doesn't do much yet: it mostly just suppresses the initial CmdStan metadata dump and marks the transition from warmup to sampling and such: https://github.com/stan-dev/cmdstanr/blob/master/R/run.R#L650 I believe this is called every 200 ms if any stdout came in.

For all the other methods the function is here: https://github.com/stan-dev/cmdstanr/blob/master/R/run.R#L791 There we only make sure the CmdStan metadata dump is not printed.

jgabry commented 4 years ago

But it would be neat to finally contribute to the project beyond my forum activity, so I'm keen to do it.

Sounds great!

mike-lawrence commented 4 years ago

Just a heads up that I'll start working on this today. I'd somehow gotten it into my head that I was waiting on some info but see that all I need is in Rok's last comment. I'll start with the simple progress bar that watches stdout, then once that's working solidly dive into the csv-informed monitor.

Tangentially, now that I've seen that the ArviZ crew is using HDF5/NetCDF, I'm more confident that it's at least not a completely ridiculous thing to work on as a replacement for csv in cmdstan, so I may look at that in parallel to the monitor (which would benefit from something that's not csv).

rok-cesnovar commented 4 years ago

Great, let me know if you need any additional info or help. Thanks!

mike-lawrence commented 4 years ago

Just an update on this: I finally dove into it this weekend, and after getting up to speed on R6 I have something partially working here. I'll do a pull request when it's actually ready. I ended up not being able to use any of the existing progress bar packages because none permit multiple simultaneous progress bars, and after getting my own multi-line updates seemingly working I discovered why. If I don't find a workaround for that snag, I'll implement a minimal all-chains-on-one-line output that will be shown when multi-line isn't going to work.

mike-lawrence commented 4 years ago

Getting closer. Here's what I was hoping to get working in terms of multi-line indicators: [animated GIF of the multi-line progress bars] (n.b. the top bar indicates warmup vs sampling periods, and the ? during warmup indicates large uncertainty in the time remaining, since warmup times aren't very predictive of sampling times; the ETA computation during sampling uses only the times during sampling, so it has much less uncertainty and therefore no ?)

But it turns out that:

  1. Terminals (at least the ones I've tried) don't seem to respect the '\r' character when text has wrapped onto new lines, yielding '\n' behaviour instead. This leads to content filling up the window. A hack is to fill the screen with new lines before showing progress, so that to the user there's a blank screen with progress at the bottom, but I suspect this will not be acceptable to those that rely on scrolling up to see recent work.
  2. Terminals (at least the ones I've tried) don't seem to consistently set options('width'); workaround on unix-alikes is to use tput cols, and while this value updates if you resize the window, it doesn't update if you resize the window while mod$sample() is running. This latter isn't a big deal, but might cause confusion for users if they expect things to resize nicely at all times.
  3. RStudio's console does update options('width') even while mod$sample() is running, but seemingly with very fuzzy logic such that resizing the window just a little bit can change the output wrapping behaviour but not the value of options('width'). Again not a huge deal.
  4. RStudio's console's font rendering is pretty shoddy, making for less pretty output there than in the gif above.

While I've put a lot of work into the multi-line solution (even using the unicode partial-block characters for finer-than-single-character progress indication), given (1) above I should probably just implement a single-line progress along the lines of: [1: 85% 5m] [2: 87% 4m] [3: 83% 6m] [4: 85% 4m]

Maybe I can have a check for if there is enough room to add progress bars for each process.
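A minimal sketch of that single-line format with a room check, in Python for illustration only. The function name status_line and the fallback format are assumptions, not anything decided in this thread. The width comes from shutil.get_terminal_size unless one is passed in.

```python
import shutil

def status_line(chains, width=None):
    """chains: list of (percent_done, eta_minutes) tuples, one per chain."""
    if width is None:
        width = shutil.get_terminal_size(fallback=(80, 24)).columns
    full = " ".join(
        f"[{i}: {pct}% {eta}m]" for i, (pct, eta) in enumerate(chains, 1)
    )
    if len(full) <= width:
        return full
    # Not enough room: drop the ETAs, then truncate as a last resort so the
    # line never wraps.
    short = " ".join(f"{i}:{pct}%" for i, (pct, _) in enumerate(chains, 1))
    return short[:width]
```

Keeping the line shorter than the terminal width matters here because a wrapped line is exactly the case where '\r' stops returning to the start of the status.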

Another idea is to not use the console/terminal at all and instead pop out a tcltk progress window or something?

mike-lawrence commented 4 years ago

Actually, now that I think of it, I should check how cmdstanpy does their progress bars to see if they encountered and worked around the blocks I'm experiencing. @mitzimorris whereabouts would I look for the progress bar code in the cmdstanpy source?

Edit: never mind, I see that cmdstanpy uses tqdm. At first glance tqdm seems to have similar issues, but I'll dig a bit more.

mike-lawrence commented 4 years ago

Oh! I just discovered that the terminal emulators that don't respect the \r character do respect the \033[F escape sequence (move the cursor to the start of the previous line)! I'll still add a one-line option, but I'm psyched that I don't have to abandon the nicer multi-line format.
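A minimal sketch of how a multi-line block can be repainted with \033[F, in Python for illustration only. The helper name redraw is hypothetical. It also uses \033[K (erase to end of line), which is not mentioned above, and it assumes no status line is wide enough to wrap.

```python
def redraw(lines, first=False):
    """Return the string that repaints a block of status lines in place."""
    out = ""
    if not first:
        # \033[F moves the cursor to the start of the previous line; repeat
        # once per line to get back to the top of the block.
        out += "\033[F" * len(lines)
    for line in lines:
        # \033[K erases from the cursor to the end of the line, so a shorter
        # new line does not leave remnants of the old one.
        out += line + "\033[K\n"
    return out
```

The first call prints the block normally; every later call first walks the cursor back up by the number of lines and then overwrites them, e.g. sys.stdout.write(redraw(bars)) followed by a flush.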

rok-cesnovar commented 4 years ago

Nice!

mike-lawrence commented 3 years ago

Belatedly getting back to this. The in-console progress is pretty much done, and I'll do a PR for that shortly, but I thought I'd get folks' thoughts here (I didn't feel it necessary to create a whole new issue for this) on an idea for a subsequent PR:

Given the likely substantial proportion of the user base that is using RStudio, I wonder if it makes sense to work on an option to use the RStudio "jobs" feature where each chain would go in its own job. Frankly, if I'd thought of this earlier, I wouldn't have bothered with the console progress bars as the jobs system has all sorts of nifty progress stuff built in (bars, status indicators, etc.), making for lighter code on cmdstanr's side. Using the jobs system also has the side-benefit of not blocking the rsession, permitting users to do other stuff while the chains are churning.

jgabry commented 3 years ago

I like the idea of using the jobs system, but I'm not super familiar with it. From what you describe it sounds great. Any downsides you can think of?

mike-lawrence commented 3 years ago

Really simply that it's a feature that is only useful to those users using RStudio. I'll go ahead and finish the in-console progress indicators, then tackle the jobs idea separately.

mike-lawrence commented 3 years ago

Oh! I just realized a pretty important downside to using RStudio jobs: there would either be only one progress bar or potentially lots of overhead associated with multiple rsessions each watching for output from independent runs. Do we have a sense as to how much compute time mod$sample() consumes from watching the output of processx::run()? Is it an "output triggers computation" thing, or does it have to constantly check in for new output?

jgabry commented 3 years ago

Yeah we do need to check regularly for new output. Inside a while loop we have

https://github.com/stan-dev/cmdstanr/blob/962e7c1dcd2b779f75d36ea2ed35f5e2fcc639b5/R/run.R#L274-L275

where procs$poll calls processx::poll (https://processx.r-lib.org/reference/poll.html)
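To illustrate the general pattern (not cmdstanr's actual code), here is a small Python sketch of a poll-with-timeout loop on Unix-alikes. It uses the standard selectors module rather than processx. The name watch and the 200 ms default are assumptions for the example.

```python
import os
import selectors
import subprocess
import sys

def watch(cmd, on_line, poll_ms=200):
    """Run cmd, call on_line for each stdout line, return the exit code."""
    proc = subprocess.Popen(cmd, stdout=subprocess.PIPE)
    fd = proc.stdout.fileno()
    sel = selectors.DefaultSelector()
    sel.register(fd, selectors.EVENT_READ)
    pending = b""
    while True:
        # Sleeps until output arrives or the timeout elapses; no busy loop.
        if not sel.select(timeout=poll_ms / 1000):
            continue  # timeout with nothing new: a place for periodic work
        chunk = os.read(fd, 4096)
        if not chunk:  # end of file: the process closed its stdout
            break
        pending += chunk
        # Everything before the last newline is complete; keep the remainder.
        *lines, pending = pending.split(b"\n")
        for raw in lines:
            on_line(raw.decode())
    if pending:
        on_line(pending.decode())
    sel.close()
    proc.stdout.close()
    return proc.wait()
```

With this pattern the watcher does wake up regularly, but between wake-ups it is blocked in the operating system rather than spinning, so the cost is mostly the per-line work done in the callback.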

mike-lawrence commented 3 years ago

So I'm reconsidering this consequent to @mitzimorris' reasonable reminders that it is valuable to keep cmdstanr light. That is, since the approach to progress bars I've been implementing simply looks at output that is otherwise already printed, it would be possible for me to move said bars into an entirely different package. A con to doing the bars in a separate package is that then there would be some extra overhead from two levels of checking the cmdstan outputs rather than one. Thoughts?

jgabry commented 3 years ago

Yeah good point, I guess that could certainly go in a separate package that cmdstanr can suggest. That would be fine by me.

A con to doing the bars in a separate package is that then there would be some extra overhead from two levels of checking the cmdstan outputs rather than one.

If it's possible to evaluate and document what the cost roughly is (which should be doable) then I think it's OK. For most users it would probably still be worth using the progress bar even with a bit of extra overhead, but if the extra little bit of speed really matters for someone then they can just not use the progress bar.

mike-lawrence commented 3 years ago

After thinking on this, it'll also be frankly easier on my end in a few respects too. What I think I'll do is revamp my ezStan package to use cmdstanr rather than rstan. I'll post back here when the conversion is done and folks can start using it.

jgabry commented 3 years ago

Cool, thanks Mike!