Open jgabry opened 5 years ago
After discussing with @mitzimorris I found out that there may be some problems with the progress bar implementation in CmdStanPy, so maybe we shouldn't try to copy that until it's sorted out there.
Also, an alternative to a progress bar would be to run background processes and provide a way for the user to check the status. A progress bar isn't necessary for models that run fast, and maybe for slow models it's better to let them just run in the background and allow the user to check the status when they want? This is possible with the processx package we're already using.
Anyway, I haven't thought too hard about this yet so I'm definitely open to other ideas!
So a progress bar would be fairly simple to implement after #61 is merged, provided that a user doesn't specify refresh=0. @jgabry do we still want to do this?
I could go either way on this. The current output from parallelization is fine, although a progress bar could be nicer. What do you think?
Do you have an example for an R package that uses progress bars? I am not completely sure how it would look. Worth a try tho.
Yeah it's worth a try. Things to check out whenever you want to dive into this (definitely not super urgent though):
- progress package (https://github.com/r-lib/progress, https://cran.r-project.org/package=progress). Has some of the same very capable authors as the processx package that we're using.
- revdepcheck package (https://github.com/r-lib/revdepcheck, for testing reverse dependencies), which has a really nice progress bar. Looks like it uses the progress package.
- Other packages that use progress can be found by looking at its CRAN page: https://cran.r-project.org/package=progress (scroll down to the reverse dependencies section). There are a bunch so you may be able to look at those packages for implementation details.
There are other packages for making progress bars but I haven't used them. I've heard it can be difficult to do progress bars for parallelized code in R, but I haven't gone down that rabbit hole so I don't know any details about what the challenges are unfortunately.
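For anyone who hasn't used it, here is a minimal sketch of a single bar built with the progress package. The loop and the `Sys.sleep()` call are stand-ins for real work, and the format string is only an illustration; nothing here is taken from cmdstanr.

```r
library(progress)

n_iter <- 50

# One bar with a percentage and an ETA; `clear = FALSE` leaves it on screen
# once it completes.
pb <- progress_bar$new(
  format = "Chain 1 [:bar] :percent eta: :eta",
  total = n_iter,
  clear = FALSE
)

for (i in seq_len(n_iter)) {
  Sys.sleep(0.01)  # stand-in for one unit of work
  pb$tick()        # advance the bar by one step
}
```

Note that this draws one bar at a time, which is the simple sequential case; several chains updating at once is the harder problem.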
What do folks think of the approach taken by my package ezStan where I simply watch the contents of the CSV files to establish progress (inc. reporting post-warmup divergences and permitting termination on detection of these)?
@mike-lawrence that sounds like a great idea.
So you essentially count the number of lines in the CSV? Do you read the entire CSV in a separate thread constantly for checking divergences?
Precisely. My package is a bit of a monstrosity of poorly-commented/hack-y code, so if this kind of progress indicator is of interest for cmdstanr I'd probably start from scratch. But it would be neat to finally contribute to the project beyond my forum activity, so I'm keen to do it.
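A rough sketch of the CSV-watching idea, not ezStan's actual code. It assumes the usual Stan CSV layout ('#' comment lines, one header row, then one row per saved draw) and a `divergent__` column; the function name `csv_progress` is made up for illustration.

```r
# Summarise progress from a (possibly still growing) Stan CSV file.
csv_progress <- function(path) {
  lines <- readLines(path, warn = FALSE)
  # Comment lines can appear anywhere in the file, so filter them all out.
  lines <- lines[nzchar(lines) & !startsWith(lines, "#")]
  if (length(lines) == 0) {
    return(list(draws = 0L, divergences = 0L))
  }
  header <- strsplit(lines[1], ",", fixed = TRUE)[[1]]
  fields <- strsplit(lines[-1], ",", fixed = TRUE)
  # Heuristic: drop rows with too few fields, e.g. a row still being written.
  fields <- fields[lengths(fields) == length(header)]
  div_col <- match("divergent__", header)
  divergences <- if (is.na(div_col)) {
    NA_integer_
  } else {
    sum(vapply(fields, function(f) f[div_col] == "1", logical(1)))
  }
  list(draws = length(fields), divergences = as.integer(divergences))
}

# Small fake file to exercise the function.
tmp <- tempfile(fileext = ".csv")
writeLines(c(
  "# model = example",
  "lp__,accept_stat__,divergent__,theta",
  "# Adaptation terminated",
  "-7.1,0.98,0,0.25",
  "-6.9,0.91,1,0.31",
  "-7.4,0.8"
), tmp)
res <- csv_progress(tmp)
```

Re-reading the whole file on every check is the simplest approach but not the cheapest; remembering the last read position would avoid rescanning.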
Cool! I am definitely in favor.
Another option that would only allow showing progress (not divergences and other diagnostic realtime stuff) is to rely on stdout. If the progress bar is enabled then set refresh = 1. That means that we get information for each iteration.
If a user wants realtime reports for divergence/treedepth then we need to read the CSV.
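A sketch of the stdout route, assuming CmdStan's progress lines look like `Iteration: 1001 / 2000 [ 50%]  (Sampling)`, optionally prefixed with a chain label. `parse_iteration` is a hypothetical helper, not an existing cmdstanr function.

```r
# Extract iteration, total, percent and phase from one line of sampler output.
# Returns NULL for lines that are not progress lines.
parse_iteration <- function(line) {
  pattern <- paste0(
    "Iteration:\\s*(\\d+)\\s*/\\s*(\\d+)\\s*",
    "\\[\\s*(\\d+)%\\]\\s*\\((Warmup|Sampling)\\)"
  )
  m <- regmatches(line, regexec(pattern, line, perl = TRUE))[[1]]
  if (length(m) == 0) {
    return(NULL)
  }
  list(
    iter = as.integer(m[2]),
    total = as.integer(m[3]),
    percent = as.integer(m[4]),
    phase = m[5]
  )
}

x <- parse_iteration("Chain 1 Iteration: 1001 / 2000 [ 50%]  (Sampling)")
y <- parse_iteration("Elapsed Time: 0.1 seconds (Warm-up)")
```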
Good idea, not monitoring the csv if the user hasn't asked for any csv-content-dependent behaviour will make things more resource-friendly during sampling. I'll have to teach myself how to monitor the stdout instead, but I suspect that's within my capacity.
For sampling we have a process function with a "state machine" that doesn't do much yet: mostly it just makes sure the initial CmdStan metadata dump is not printed and marks the transition from warmup to sampling and such: https://github.com/stan-dev/cmdstanr/blob/master/R/run.R#L650 This is called every 200ms, I believe, if any stdout came in.
For all the other methods the function is here: https://github.com/stan-dev/cmdstanr/blob/master/R/run.R#L791 Here we only make sure the CmdStan metadata dump is not printed.
> But it would be neat to finally contribute to the project beyond my forum activity, so I'm keen to do it.
Sounds great!
Just a heads up that I'll start working on this today. I'd somehow gotten it into my head that I was waiting on some info but see that all I need is in Rok's last comment. I'll start with the simple progress bar that watches stdout, then once that's working solidly dive into the csv-informed monitor.
Tangentially, now that I've seen that the ArviZ crew is using HDF5/NetCDF, I'm more confident that that's at least not a completely ridiculous thing to work on as a replacement for csv in cmdstan, so I may look at that in parallel to the monitor (which would benefit from something that's not csv).
Great, let me know if you need any additional info or help. Thanks!
Just an update on this: I finally dove into it this weekend, and after getting up to speed on R6 I have something partially working here. I'll do a pull request when it's actually ready. I ended up not being able to use any of the existing progress bar packages because none permit multiple simultaneous progress bars, and after getting my own multi-line updates seemingly working I discovered why. If I don't find a workaround for that snag, I'll implement a minimal all-chains-on-one-line output that will be shown when multi-line isn't going to work.
Getting closer. Here's what I was hoping to get working in terms of multi-line indicators:
(n.b. the top bar indicates warmup vs sampling periods and the ? during warmup indicates large uncertainty in the time remaining thanks to warmup times not really being very predictive of sampling times; the eta computation during sampling uses only the times during sampling, so has much less uncertainty and therefore no ?)
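The "sampling times only" ETA could be sketched roughly as below. This is an assumed linear extrapolation, not the code from the branch; `eta_sampling` and its arguments are hypothetical names.

```r
# Estimate seconds remaining using only updates received during sampling.
# `times`: arrival times (seconds) of progress updates in the sampling phase.
# `iters`: sampling iteration counts reported at those times.
# `n_sampling`: total number of sampling iterations.
eta_sampling <- function(times, iters, n_sampling) {
  if (length(times) < 2) {
    return(NA_real_)  # not enough information yet
  }
  elapsed <- times[length(times)] - times[1]
  done <- iters[length(iters)] - iters[1]
  if (done <= 0) {
    return(NA_real_)
  }
  secs_per_iter <- elapsed / done
  secs_per_iter * (n_sampling - iters[length(iters)])
}

# 200 iterations took 4 seconds, 700 remain.
eta <- eta_sampling(times = c(0, 2, 4), iters = c(100, 200, 300), n_sampling = 1000)
```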
But it turns out that:
- [...] options('width'); workaround on unix-alikes is to use tput cols, and while this value updates if you resize the window, it doesn't update if you resize the window while mod$sample() is running. This latter isn't a big deal, but might cause confusion for users if they expect things to resize nicely at all times.
- [...] options('width') even while mod$sample() is running, but seemingly with very fuzzy logic such that resizing the window just a little bit can change the output wrapping behaviour but not the value of options('width'). Again not a huge deal.
While I've put a lot of work into the multi-line solution (even using the unicode partial-block characters for finer-than-single-character progress indication), given (1) above I should probably just implement a single-line progress along the lines of:
[1: 85% 5m] [2: 87% 4m] [3: 83% 6m] [4: 85% 4m]
Maybe I can have a check for if there is enough room to add progress bars for each process.
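A small sketch of the one-line format plus the "is there enough room" check. Both helper names are made up, and the width check simply trusts `getOption("width")`, which the thread already notes can be stale.

```r
# Build "[1: 85% 5m] [2: 87% 4m] ..." from per-chain percent and ETA (minutes).
format_chain_status <- function(percent, eta_minutes) {
  cells <- sprintf(
    "[%d: %d%% %dm]",
    seq_along(percent),
    as.integer(round(percent)),
    as.integer(round(eta_minutes))
  )
  paste(cells, collapse = " ")
}

# TRUE if the line fits within the reported console width.
fits_console <- function(line, width = getOption("width", 80L)) {
  nchar(line) <= width
}

status <- format_chain_status(c(85, 87, 83, 85), c(5, 4, 6, 4))
```

If `fits_console(status)` is `FALSE`, a narrower fallback (for example dropping the ETA) could be used instead.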
Another idea is to not use the console/terminal at all and instead pop out a tcltk progress window or something?
Actually, now that I think of it, I should check how cmdstanpy does their progress bars to see if they encountered and worked around the blocks I'm experiencing. @mitzimorris whereabouts would I look for the progress bar code in the cmdstanpy source?
Edit: never mind, I see that cmdstanpy uses tqdm. At first glance tqdm seems to have similar issues, but I'll dig a bit more.
Oh! I just discovered that the terminal emulators that don't respect the \r character do respect the \033[F escape sequence! I'll still add a one-line option, but psyched that I don't have to abandon the nicer multi-line format.
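To illustrate the trick: `\033[F` moves the cursor to the start of the previous line, so emitting it once per line lets a block of lines be redrawn in place. The sketch below also uses `\033[K` (erase to end of line), which is an addition not mentioned in the thread. `redraw_block` is a hypothetical helper, and how a given console handles these sequences varies.

```r
# Build the string that redraws `lines` over a previously printed block.
# On the first draw there is nothing to move back over, so skip the cursor-up.
redraw_block <- function(lines, first = FALSE) {
  up <- if (first) "" else strrep("\033[F", length(lines))
  paste0(up, paste0("\033[K", lines, "\n", collapse = ""))
}

# Two fake chains advancing at different speeds.
for (step in 0:10) {
  bars <- sprintf("Chain %d [%-10s]", 1:2, strrep("=", pmin(10, step * 1:2)))
  cat(redraw_block(bars, first = step == 0))
  Sys.sleep(0.05)
}

s <- redraw_block(c("a", "b"))
```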
Nice!
Belatedly getting back to this. The in-console progress is pretty much done, and I'll do a PR for that shortly, but I thought I'd get folks' thoughts here (I didn't feel it necessary to create a whole new issue for this) on an idea for a subsequent PR:
Given the likely substantial proportion of the user base that is using RStudio, I wonder if it makes sense to work on an option to use the RStudio "jobs" feature where each chain would go in its own job. Frankly, if I'd thought of this earlier, I wouldn't have bothered with the console progress bars as the jobs system has all sorts of nifty progress stuff built in (bars, status indicators, etc.), making for lighter code on cmdstanr's side. Using the jobs system also has the side-benefit of not blocking the rsession, permitting users to do other stuff while the chains are churning.
I like the idea of using the jobs system, but I'm not super familiar with it. From what you describe it sounds great. Any downsides you can think of?
Really simply that it's a feature that is only useful to those users using RStudio. I'll go ahead and finish the in-console progress indicators, then tackle the jobs idea separately.
Oh! I just realized a pretty important downside to using RStudio jobs: there would either be only one progress bar or potentially lots of overhead associated with multiple R sessions each watching for output from independent runs. Do we have a sense as to how much compute time mod$sample() consumes from watching the output of processx::run()? Is it an "output triggers computation" thing, or does it have to constantly check in for new output?
Yeah we do need to check regularly for new output. Inside a while loop we have https://github.com/stan-dev/cmdstanr/blob/962e7c1dcd2b779f75d36ea2ed35f5e2fcc639b5/R/run.R#L274-L275 where procs$poll calls processx::poll (https://processx.r-lib.org/reference/poll.html)
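For a feel of what that kind of loop looks like, here is a stripped-down sketch for a single process. It is not cmdstanr's code: a child R process stands in for a CmdStan run, the 200 ms timeout is just an example value, and the `Rscript` path lookup is an assumption about the local setup.

```r
library(processx)

# A child R process that prints three lines, pausing between them.
rscript <- file.path(R.home("bin"), "Rscript")
code <- "for (i in 1:3) { cat('Iteration:', i, '/ 3\\n'); Sys.sleep(0.2) }"
p <- process$new(rscript, c("-e", code), stdout = "|")

seen <- character()
# Keep going until the process has exited and its stdout pipe is drained.
while (p$is_alive() || p$is_incomplete_output()) {
  p$poll_io(200)                          # wait up to 200 ms for new output
  seen <- c(seen, p$read_output_lines())  # collect whatever lines arrived
}
```

So the loop does wake up on a timer, but `poll_io()` blocks until output arrives or the timeout passes rather than spinning, which keeps the cost of watching low.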
So I'm reconsidering this consequent to @mitzimorris' reasonable reminders that it is valuable to keep cmdstanr light. That is, since the approach to progress bars I've been implementing simply looks at output that is otherwise already printed, it would be possible for me to move said bars into an entirely different package. A con to doing the bars in a separate package is that then there would be some extra overhead from two levels of checking the cmdstan outputs rather than one. Thoughts?
Yeah good point, I guess that could certainly go in a separate package that cmdstanr can suggest. That would be fine by me.
> A con to doing the bars in a separate package is that then there would be some extra overhead from two levels of checking the cmdstan outputs rather than one.
If it's possible to evaluate and document what the cost roughly is (which should be doable) then I think it's OK. For most users it would probably still be worth using the progress bar even with a bit of extra overhead, but if the extra little bit of speed really matters for someone then they can just not use the progress bar.
After thinking on this, it'll also be frankly easier on my end in a few respects too. What I think I'll do is revamp my ezStan package to use cmdstanr rather than rstan. I'll post back here when the conversion is done and folks can start using it.
Cool, thanks Mike!
CmdStanPy has a nice progress bar that we can implement in R too.
We are currently using processx::run(..., echo=TRUE) so that the R user sees the output directly from CmdStan, but processx provides a lot more control than that and it looks like it has what we’d need behind the scenes for a progress bar even for chains in parallel.
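As one example of that extra control, `processx::run()` accepts a `stdout_line_callback` that is called for each complete line of output. The sketch below uses a child R process as a stand-in for CmdStan and a deliberately crude counter as the "progress bar"; it only covers a single process, not chains in parallel.

```r
library(processx)

rscript <- file.path(R.home("bin"), "Rscript")
code <- "for (i in 1:5) cat('Iteration:', i, '/ 5\\n')"

n_seen <- 0L
res <- run(
  rscript, c("-e", code),
  stdout_line_callback = function(line, proc) {
    # Called once per complete line of stdout while the process runs.
    n_seen <<- n_seen + 1L
    cat(sprintf("\rprogress: %d / 5", n_seen))
  }
)
cat("\n")
```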