wlandau / drake.hasty

Hasty mode for the drake R package.

Direction for development #1

Open wlandau opened 5 years ago

wlandau commented 5 years ago

Right now, drake.hasty is an experiment in minimalism. What should be the driving purpose of development? Some possible directions:

A sandbox

Breeding/testing ground for new drake backends.

An independent front-end scheduler for users

As a scheduler, drake.hasty would support production-ready workflows that either

  1. Do not need drake's reproducibility features, or
  2. Would suffer egregious overhead with the current version of drake.

Related:

A backend for drake's official persistent workers

It has been suggested that drake could call drake.hasty directly for its scheduling needs. I think the idea was to lighten the code base in the same way devtools offloaded functionality to usethis, remotes, etc. However, the more I downsize and reorganize drake, the less this shift seems worth it.

  1. drake's code for scheduling is actually very light and simple when compared with the rest of the internals. Offloading the scheduling may not accomplish much.
  2. It is difficult to disentangle drake's internals from its scheduling operations. drake makes decisions about checking, building, memory management, etc. using data structures not available to drake.hasty. Many of these build operations and decisions happen outside the customizable config$hasty_build function.

These options are not mutually exclusive, and my assessment may change as drake gets smaller and simpler. Definitely a question to keep revisiting long-term.

cc @krlmlr

wlandau commented 5 years ago

I have been spending a lot of time profiling and speeding up drake (see this test case), and I am more convinced that drake.hasty could have a serious role of its own beyond serving as a sandbox. Unsurprisingly, the bottlenecks in drake itself are

  1. Storing outputs.
  2. Analyzing code.
  3. Checking the status of dependencies.

drake.hasty does none of those things on its own.
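
To make the contrast concrete, here is a minimal base R sketch of a scheduler that skips all three bottlenecks. It is not drake.hasty's actual code or API: the `plan` structure, its `deps` and `command` fields, and the `hasty_sketch()` function are all hypothetical names invented for illustration. Dependencies are declared by hand (no code analysis), results stay in memory (no storage), and every target is rebuilt on every run (no dependency status checks).

```r
# Hypothetical plan format (an assumption, not drake's): each target lists
# its dependencies explicitly, so no static code analysis is needed.
plan <- list(
  x = list(deps = character(0), command = quote(c(1, 2, 3, 4))),
  y = list(deps = "x", command = quote(mean(x))),
  z = list(deps = c("x", "y"), command = quote(sum(x) - y))
)

# Hypothetical scheduler: runs every target in dependency order on every call.
# Nothing is cached on disk and no target is ever skipped as up to date.
hasty_sketch <- function(plan) {
  results <- new.env(parent = globalenv())  # in-memory store for built targets
  remaining <- names(plan)
  while (length(remaining) > 0) {
    # A target is ready once all of its declared dependencies are built.
    ready <- Filter(
      function(target) all(plan[[target]]$deps %in% ls(results)),
      remaining
    )
    if (length(ready) == 0) {
      stop("circular or missing dependency among: ",
           paste(remaining, collapse = ", "))
    }
    for (target in ready) {
      # Evaluate the command where previously built targets are visible.
      value <- eval(plan[[target]]$command, envir = results)
      assign(target, value, envir = results)
    }
    remaining <- setdiff(remaining, ready)
  }
  as.list(results)
}

out <- hasty_sketch(plan)
```

The trade-off is the one described above: this kind of scheduler has almost no overhead per target, but it gives up the reproducibility guarantees that drake's storage and dependency checks provide.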

[Image: bottlenecks]