richfitz / remake

Make-like declarative workflows in R

flag to always rebuild (or update) a target #95

Open RemkoDuursma opened 8 years ago

RemkoDuursma commented 8 years ago

For datasets from streaming services, it would be useful to have a flag that makes the target's command always run, or that lets the user trigger an update on demand, like:

  somedata:
    command: retrieve_data_online()
    check: update

This would allow a call such as remake::update(), which would run all targets carrying the update flag. I'm not sure what flag to suggest for the case where the target should always be rebuilt; perhaps check: none?

Finally, a wish-list item would be to rebuild a target only every X days (useful for retrieving data from streaming services that update only every X days).
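
The "every X days" wish can be approximated outside remake with a small staleness test on a cached file. This is a minimal sketch, not a remake feature: `is_stale`, its `max_age_days` and `now` arguments, and the cache path are all hypothetical names chosen for illustration.

```r
# Hypothetical helper, not part of remake: TRUE when `path` is missing
# or was last modified more than `max_age_days` days ago.
is_stale <- function(path, max_age_days, now = Sys.time()) {
  if (!file.exists(path)) {
    return(TRUE)  # nothing cached yet, so the data must be fetched
  }
  age_days <- as.numeric(difftime(now, file.mtime(path), units = "days"))
  age_days > max_age_days
}

# Usage: refresh a cached download at most once a week.
cache <- file.path(tempdir(), "somedata.rds")
saveRDS(data.frame(x = 1:3), cache)
is_stale(cache, max_age_days = 7)                                # FALSE: just written
is_stale(cache, max_age_days = 7, now = Sys.time() + 8 * 86400)  # TRUE: eight days later
```

The `now` argument exists only so the behaviour can be checked without waiting; in normal use it would be left at its default.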

wlandau commented 8 years ago

Neat idea. This behavior actually happened to me by accident once when I was playing with two different uncollated remake/YAML files, where remake1.yml created external .rds files as input for remake2.yml. I'm glad it was just a toy project, though, because we need to be careful about reproducibility. It is possible that a single instance of remake disallows this by design. The solution may not be an "always run" flag, but instead a way to trigger rebuilds based on changes to remote files on those online services.

RemkoDuursma commented 8 years ago

> The solution may not be an "always run" flag, but instead a way to trigger rebuilds based on changes to remote files on those online services.

That sounds reasonable, but we need to first find out whether the remote files have changed. So maybe a more general solution would be to rebuild if some checking function returns TRUE, e.g.

somedata:
  command: retrieve_data_online()
  check: has_remote_changed()

where has_remote_changed is a user-defined function that does the checking. This might be useful because it also makes my second wish-list item possible, for example rebuilding only on Tuesdays:

somedata:
  command: retrieve_data_online()
  check: is_it_tuesday()
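
The two checking functions named above could look roughly like this. It is a sketch under assumptions: a `check:` field that accepts a function is only the proposal in this thread, and the arguments `fingerprint`, `state_file`, and `date` are hypothetical additions that keep the example self-contained (the YAML above calls both functions with no arguments).

```r
# Hypothetical sketch: `fingerprint` is any string that changes when the
# remote data changes (an ETag header, a checksum, a "last updated" field).
# How to obtain it depends on the service and is left to the caller.
has_remote_changed <- function(fingerprint, state_file) {
  previous <- if (file.exists(state_file)) {
    readLines(state_file, warn = FALSE)
  } else {
    character(0)
  }
  changed <- !identical(previous, fingerprint)
  if (changed) {
    writeLines(fingerprint, state_file)  # remember what was seen this time
  }
  changed
}

# "%u" is the ISO weekday number (Monday = 1), so it does not depend on locale.
is_it_tuesday <- function(date = Sys.Date()) {
  format(date, "%u") == "2"
}

state <- tempfile()
has_remote_changed("etag-1", state)   # TRUE: first sighting
has_remote_changed("etag-1", state)   # FALSE: unchanged
has_remote_changed("etag-2", state)   # TRUE: fingerprint differs
is_it_tuesday(as.Date("2024-01-02"))  # TRUE: 2 January 2024 was a Tuesday
```

One design caveat: this version records the new fingerprint as soon as the check runs, so a download that fails afterwards would not be retried. Recording it only after a successful rebuild would be safer.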

egouldo commented 7 years ago

I need this feature too, but because I work with a package that wraps an API. Unfortunately, the links between the R object and the C pointer in the API are broken every time the R session is restarted. This means that I need to rebuild the target on every call to remake::make(), even if there are no file or code changes. I've been working around this by renaming targets to force them to be re-processed, but this is extremely irritating.
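
Until such a flag exists, one possible workaround in user code is to stamp the handle with the ID of the R process that created it and rebuild it whenever the stamp does not match the current process. This is a minimal sketch, not a remake feature: `stamp_handle`, `handle_is_live`, `get_handle`, and the `rebuild` callback are hypothetical names, and the string standing in for the handle is a placeholder for the real API object.

```r
# Hypothetical sketch: tag a handle with the ID of the R process that
# created it, and treat it as dead in any other process.
stamp_handle <- function(handle) {
  structure(list(handle = handle, pid = Sys.getpid()), class = "stamped_handle")
}

handle_is_live <- function(x) {
  inherits(x, "stamped_handle") && identical(x$pid, Sys.getpid())
}

# Return the cached handle if it belongs to this session, else rebuild it.
# `rebuild` is a zero-argument function that reopens the API connection.
get_handle <- function(cached, rebuild) {
  if (handle_is_live(cached)) cached else stamp_handle(rebuild())
}

# Simulate a handle restored from an earlier session by altering its stamp.
old <- stamp_handle("connection from last session")
old$pid <- old$pid + 1L
handle_is_live(old)                                          # FALSE
fresh <- get_handle(old, function() "connection reopened")
handle_is_live(fresh)                                        # TRUE
```

Process IDs can be reused by the operating system, so a stricter stamp might also record when the session started. This approach also does not make remake itself rebuild the target; it only guards the code that consumes the handle.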