carpentries-incubator / targets-workshop

Pre-alpha {targets} workshop
https://carpentries-incubator.github.io/targets-workshop/
Other
33 stars 6 forks source link

Use format = "file" instead of tar_file #30

Open multimeric opened 2 months ago

multimeric commented 2 months ago

Happy to discuss the merits of this. Basically I've found myself preferring to use tar_target(format = "file") instead of tar_file. My motivations for this are:

github-actions[bot] commented 2 months ago

Thank you!

Thank you for your pull request :smiley:

:robot: This automated message can help you check the rendered files in your submission for clarity. If you have any questions, please feel free to open an issue in {sandpaper}.

If you have files that automatically render output (e.g. R Markdown), then you should check for the following:

Rendered Changes

:mag: Inspect the changes: https://github.com/carpentries-incubator/targets-workshop/compare/md-outputs..md-outputs-PR-30

The following changes were observed in the rendered markdown documents:

 config.yaml (new) |   87 +++
 files.md          |   29 +-
 md5sum.txt        |    4 +-
 organization.md   |    3 +-
 renv.lock (new)   | 1783 +++++++++++++++++++++++++++++++++++++++++++++++++++++
 5 files changed, 1888 insertions(+), 18 deletions(-)
What does this mean? If you have source files that require output and figures to be generated (e.g. R Markdown), then it is important to make sure the generated figures and output are reproducible. This output provides a way for you to inspect the output in a diff-friendly manner so that it's easy to see the changes that occur due to new software versions or randomisation.

:stopwatch: Updated at 2024-07-10 05:54:32 +0000

joelnitta commented 1 month ago

Thanks for the PR @multimeric !

I think this is related to a broader discussion of design philosophy of the lesson: should we use syntactic sugar to make the pipeline look nicer (and perhaps more like what a learner is likely to encounter in the wild), or stick to a limited number of more verbose functions to hopefully foster better learner comprehension? I think the best approach may be to use some of both. I freely admit my opinion on how much of each is biased by my personal preferences when writing {targets} pipelines.

For example, I agree that tar_file_read() should be removed (#42), particularly because of the NSE needed to make it work (!!.x).

However, I just taught the workshop myself again recently, and had no problem with tar_plan(). It probably took <10 min to explain, and I think its syntax does make the dependency relationship between targets clearer. IMHO it is worth emphasizing that a {targets} workflow (list() or tar_plan()) should be written as a high-level description of the workflow that can be understood even without knowing the details of {targets}. tar_plan() goes a pretty far way towards that.

tar_file() is somewhat in the middle. I don't think it is asking much of learners since it can be used as a drop-in for tar_target(), but on the other hand, it is also straightforward to add format = "file" to tar_target().

{tarchetypes} is also needed for tar_quarto(), which I definitely think should be kept, so we won't be dropping that dependency. Also, I think it's also good for learners to be aware of {tarchetypes} and what it is used for.