ropensci / drake

An R-focused pipeline toolkit for reproducibility and high-performance computing
https://docs.ropensci.org/drake
GNU General Public License v3.0

Simplify verbosity #787

Closed wlandau closed 5 years ago

wlandau commented 5 years ago

drake_config() still has a lot of overhead for workflows with several thousand targets. I intend to keep improving the speed, but in the short term, progress bars would also be nice. I think we can get away with utils::txtProgressBar(char = ".", style = 3) and activate it for the highest verbosity setting.
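
For reference, a minimal sketch of what the txtProgressBar() idea looks like in isolation. The process_target() function and the target count are hypothetical stand-ins, not drake internals; only the utils calls are real.

```r
# Hypothetical stand-in for the per-target work done during configuration.
process_target <- function(i) sqrt(i)

n_targets <- 50
pb <- utils::txtProgressBar(min = 0, max = n_targets, char = ".", style = 3)
results <- numeric(n_targets)
for (i in seq_len(n_targets)) {
  results[i] <- process_target(i)
  utils::setTxtProgressBar(pb, i)  # redraw the bar after each target
}
close(pb)
```

By default the bar is written to stdout, so it would presumably need to be switched off below the highest verbosity level.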

wlandau commented 5 years ago

We should probably use https://github.com/r-lib/progress since it uses messages instead of plain stdout(). Keeping this package optional will be a little tricky, but I think it is doable.
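
One way to keep the package optional might look like the sketch below. The make_ticker() helper is hypothetical and not part of drake; it assumes only requireNamespace() from base R and the progress_bar R6 class from progress.

```r
# Hypothetical helper: returns a function that advances a progress bar by one
# tick, or a no-op if the optional 'progress' package is not installed.
make_ticker <- function(total) {
  if (requireNamespace("progress", quietly = TRUE)) {
    pb <- progress::progress_bar$new(total = total)
    function() pb$tick()
  } else {
    function() invisible(NULL)
  }
}

tick <- make_ticker(total = 20)
done <- 0
for (i in seq_len(20)) {
  done <- done + 1  # placeholder for real work
  tick()
}
```

With this shape, the calling code never has to check whether progress is available; it just calls tick().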

wlandau commented 5 years ago

The whole drake console infrastructure probably needs refactoring too. If we can combine progress bars with parallel computing and make it all amenable to saved log files, it would be a worthwhile improvement.

wlandau commented 5 years ago

Question for onlookers: how would you feel about progress bars as a replacement for most of the messages from make()? With https://github.com/r-lib/progress, I think we could do away with the current console logging system and its cumbersome wall of target-level messages. This could really clean up the internals too. The current system scales poorly for lots of targets.

[Screenshot: Screenshot_20190324_230219]

bpbond commented 5 years ago

I think for lots of cases a progress bar would be much nicer than a long list. For a short set of targets, though, it could obscure more than help; it would be nice to always have the option of a printed target list.

wlandau commented 5 years ago

Hmm... retaining the current list of targets would make it easier to show stuff in presentations too, and it would allow us to more easily make https://github.com/r-lib/progress optional rather than a strict dependency. We could either allow different verbosity levels like we do now or we could make a decision based on window height. Either way, I am starting to think that the list of targets should not be more complicated than above.

wlandau commented 5 years ago

What about a simple blinker? Much of the preprocessing happens in parallel, and parallel progress bars are hard. Without a progress bar, we can still show evidence that drake is working and show how fast it is iterating over the thousands of code fragments in the plan.

```r
# Unicode braille patterns would probably be nicer
random_chars <- function(n) {
  vapply(
    as.raw(floor(runif(n, min = 33.01, max = 126.99))),
    rawToChar,
    FUN.VALUE = character(1)
  )
}

# Evaluate expr, flashing a random character with probability `rate`.
blink <- function(expr, rate = 0.1) {
  if (runif(1) < rate) {
    chars <- random_chars(2)
    message("\r[", chars[1], "]", appendLF = FALSE)
    on.exit(message("\r[", chars[2], "]", appendLF = FALSE))
  }
  force(expr)
}

tmp <- parallel::mclapply(seq_len(1e3), function(x) {
  blink(Sys.sleep(0.01))
})
```

wlandau commented 5 years ago

...which leads pretty quickly to transient messages that display targets and imports...

wlandau commented 5 years ago

...and that makes me want to go super low tech and simple

| verbosity | messages |
|---|---|
| 0 | nothing |
| 1 | targets, retries, and failures |
| 2 | + storage and times |
| 3 | + as much preprocessing detail as possible |

wlandau commented 5 years ago

Better idea: dump everything to the console log file regardless of verbosity and simplify the verbose argument even more.

| verbosity | messages |
|---|---|
| 0 | nothing |
| 1 | targets, retries, and failures |
| 2 | + spinner from the cli package |
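
A rough sketch of the "log everything, print selectively" idea. The log_msg() function, its arguments, and the example messages are hypothetical illustrations, not drake's actual logging code.

```r
# Hypothetical logger: every message goes to the log file, while the console
# only shows messages at or below the chosen verbosity level.
log_msg <- function(text, level, verbose = 1L, log_file = NULL) {
  if (!is.null(log_file)) {
    cat(text, "\n", file = log_file, append = TRUE, sep = "")
  }
  if (level <= verbose) {
    message(text)
  }
  invisible(level <= verbose)
}

log_file <- tempfile(fileext = ".log")
shown_1 <- log_msg("target a", level = 1L, verbose = 1L, log_file = log_file)
shown_2 <- log_msg("store a (0.2s)", level = 2L, verbose = 1L, log_file = log_file)
logged <- readLines(log_file)
```

Here the second message is hidden from the console at verbose = 1 but still lands in the log file, so the verbose argument only has to control what the user sees.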

wlandau commented 5 years ago

Fixed in #808