ropensci / targets

Function-oriented Make-like declarative workflows for R
https://docs.ropensci.org/targets/

Enable/ease/encourage projects using targets to become packages #158

Closed Robinlovelace closed 4 years ago

Robinlovelace commented 4 years ago

Proposal

This is just a general idea that can be worked up into an example with reproducible code in a test repo if it looks promising. It arises because of the following observations:

Assuming that this issue hasn't already been addressed (I've not looked that hard into it), I can think of two mutually compatible ways that it could be addressed:

Just an idea, hope it's of use/interest and happy to work it up and contribute if you think it's a goer.

wlandau commented 4 years ago

It's an interesting and persistent topic. targets is certainly opinionated about a lot of programming practices, particularly the emphasis on functions. And with tar_option_set(envir = getNamespace("yourPackage")), you can set up a pipeline as a package in such a way that targets still watches all your R functions for changes (see also import). However, targets deliberately tries not to nudge you into any specific decisions about how to organize files. Yes, the examples do use an R/ folder for function scripts, but that's minor, and there's no built-in nudge in the actual software. Beyond that, I think it's best to decouple targets from that whole discussion because it gets pretty involved. Example: https://milesmcbain.xyz/posts/an-okay-idea/. The research compendium is a related and more flexible construct: https://github.com/benmarwick/rrtools/.
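
For readers who want to see the shape of this, here is a minimal sketch of a `_targets.R` that points the pipeline at a package namespace. Only `tar_option_set(envir = getNamespace("yourPackage"))` comes from the comment above. Everything else is an assumption for illustration: `yourPackage` is a placeholder for an installed package, `read_counts()` and `summarise_counts()` are hypothetical functions it would export, and `data/counts.csv` is a made-up path.

```r
# _targets.R (sketch only; "yourPackage" and its functions are placeholders)
library(targets)

tar_option_set(
  packages = "yourPackage",            # attach the package when targets run
  envir = getNamespace("yourPackage")  # resolve and watch functions in the package namespace
)

list(
  # Track the input file itself so that edits to it invalidate downstream targets.
  tar_target(counts_file, "data/counts.csv", format = "file"),
  # read_counts() and summarise_counts() are hypothetical package functions.
  tar_target(raw_counts, read_counts(counts_file)),
  tar_target(count_summary, summarise_counts(raw_counts))
)
```

With the package installed, running `targets::tar_make()` from the project root would build the three targets. The file layout stays up to you, which fits the point above that targets does not prescribe one.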

Robinlovelace commented 4 years ago

Thanks for the quick reply. I assumed it had been raised before, will take a look!

Robinlovelace commented 4 years ago

Heads-up @wlandau and anyone else interested in this topic, I've got a proof-of-concept package that uses targets: https://github.com/ITSLeeds/dftTrafficCounts

I'm setting this up because I want to make the functions available and I want to demonstrate how to process lots of data using the functions, but I don't want to process the data every time I rebuild the package. It's my first time using the targets package. First impressions: clean, simple, relatively intuitive.

If you have any feedback on this idea, @wlandau or anyone else, I'd be glad to hear it (heads-up @MilesMcBain and @benmarwick: food for thought, and apologies for tagging, but I thought this idea may be of interest given your article on the +s/-s of packaging and the rrtools package). I think targets and package development are compatible and could allow the best of both worlds: enabling others to use your code while building data-intensive workflows in the same project. That said, I can totally see @wlandau's point above about the importance of being 'workflow agnostic'. Thoughts welcome. I would be interested in organising the code in the dftTrafficCounts project in a different way, as long as it supports rebuilding outputs that require processing GB of data and keeps the amazing modular re-running-when-needed capabilities of targets.

wlandau commented 4 years ago

Interesting, thanks for sharing and for exploring the use case. I have a couple suggestions and will post them as issues.

wlandau commented 4 years ago

Just added new package-tracking functionality with tar_option_set(imports = your_packages) (#239, #241). See https://wlandau.github.io/targets-manual/practice.html#packages for details.
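
As a rough sketch of how the `imports` option might be used (see the linked manual section for the authoritative version), the `_targets.R` below names a package whose functions should be tracked for changes. The assumptions here are that `imports` takes a character vector of package names, that `yourPackage` is a placeholder for an installed package, and that `read_counts()` and `summarise_counts()` are hypothetical functions it exports.

```r
# _targets.R (sketch only; "yourPackage" and its functions are placeholders)
library(targets)

tar_option_set(
  packages = "yourPackage",  # attach the package when targets run
  imports = "yourPackage"    # track the package's functions for changes
)

list(
  # Hypothetical package functions and a made-up input path.
  tar_target(raw_counts, read_counts("data/counts.csv")),
  tar_target(count_summary, summarise_counts(raw_counts))
)
```

After editing a tracked function and reinstalling the package, `targets::tar_outdated()` should then list the targets that depend on it.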

Robinlovelace commented 4 years ago

Great work, many thanks @wlandau !