bcgov / bcgovr

An R package to automate set up and sharing of R projects in bcgov GitHub following bcgov guidelines
Apache License 2.0

sub-folder structure #10

Closed MonkmanMH closed 7 years ago

MonkmanMH commented 7 years ago

Add sub-folders, as per the guidelines in the recently published paper "Good Enough Practices in Scientific Computing" -- Summary of Practices, 4b - 4e

boshek commented 7 years ago

This is a tough one. I do like the idea of mirroring that paper. Just to clarify, are you saying we replace the files created here with the above structure?

The thing I struggle with here is a "one-size-fits-all" approach to file structure within an analysis directory. Would an argument with a few options of folder/file combinations be merited? Or is it better to keep things simple, so that users who don't like the default just delete the files and replace them with their own files/folders?

stephhazlitt commented 7 years ago

I lean towards one default & users can manually edit as they like -- I think multiple options might add complexity for a similar outcome (the options won't meet everyone's styles/needs)? For the EnvReportBC workflow, there is a preference for having the core analysis scripts in the root directory (see thread here for a similar discussion).

boshek commented 7 years ago

I think I will wait for @ateucher to comment on this, but I wonder if a blank slate might be best here. That is, no .R files are automatically produced. Then in the documentation we could discuss some analysis structure options? Or perhaps the files themselves are generated by another function?

ateucher commented 7 years ago

I feel like a good solution would be a simple default (as it is now), then allow a user to set an option for an alternate setup. I would see this being supplied as a character vector of files/paths, e.g.:

c("doc/", "data/", "bin/", "results/", "src/01_load.R", "src/02_clean.R", 
"src/03_analysis.R", "src/04_output.R", "src/runall.R")

This could be an additional argument (dir_struct or something) to analysis_skeleton, but there are already a lot of arguments to that function.

Alternatively (or additionally) a user could set a global option of:

options("bcgovr_dir_struct" = c("doc/", "data/", "bin/", "results/", 
                                "src/01_load.R", "src/02_clean.R", 
                                "src/03_analysis.R", "src/04_output.R", 
                                "src/runall.R"))

Internally the function would look for that option, and if it exists, use that structure instead of the default one. A user could set that option in their .Rprofile so it is available every time they open R.

Thoughts?

boshek commented 7 years ago

+1 for the .Rprofile option. That is a great idea. So nothing would be done internally, but it would be documented somewhere (likely here for now) so that folks know it is an option. This is a great idea @ateucher, as I agree there is a bit of argument overload in analysis_skeleton().

ateucher commented 7 years ago

On second thought I think only setting it via an option is a bad idea, as it's pretty opaque and probably an 'anti-pattern' or 'user-hostile' 😉. I think there should be an argument, which by default checks for that option. If the bcgovr_dir_struct option isn't set and the user doesn't supply a different structure to the argument, then it uses the default structure.
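The precedence described in this comment (explicit argument first, then the global option, then the built-in default) could look roughly like the sketch below. It is an illustration only, not the code in bcgovr: `resolve_dir_struct()` and `default_struct` are hypothetical names, and the default layout shown is a shortened stand-in.

```r
# Hypothetical stand-in for the package's built-in default layout.
default_struct <- c("data/", "out/", "01_load.R", "run_all.R")

# Sketch of the precedence: explicit argument > global option > default.
# The default argument is evaluated lazily, so the option is read at call time.
resolve_dir_struct <- function(dir_struct = getOption("bcgovr_dir_struct")) {
  if (is.null(dir_struct)) default_struct else dir_struct
}

# Clear the option for the demo, remembering the previous value.
old <- options(bcgovr_dir_struct = NULL)

# Nothing supplied and no option set: falls back to the default.
from_default <- resolve_dir_struct()

# Option set (e.g. in .Rprofile): the option is used.
options(bcgovr_dir_struct = c("src/", "src/runall.R"))
from_option <- resolve_dir_struct()

# Argument supplied: it wins over the option.
from_arg <- resolve_dir_struct(c("doc/", "analysis.R"))

# Restore whatever the option was before the demo.
options(old)
```

Using the option as the argument's default keeps the behaviour visible in the function signature, which addresses the concern that an option-only approach is opaque.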

ateucher commented 7 years ago

Ok, this is what I have (in the struct-options branch):

# devtools::install_github("bcgov/bcgovr", ref = "struct-options")
library(bcgovr)

## Default
analysis_skeleton(path = "c:/_dev/bcgovr_test")
#> Creating new analysis in c:/_dev/bcgovr_test
#> Adding folders and files to c:/_dev/bcgovr_test: R/, out/, graphics/, data/, 01_load.R, 02_clean.R, 03_analysis.R, 04_output.R, internal.R, run_all.R
#> Adding file c:/_dev/bcgovr_test/CONTRIBUTING.md
#> Adding file c:/_dev/bcgovr_test/CODE_OF_CONDUCT.md
#> Adding file c:/_dev/bcgovr_test/README.md
#> Adding file c:/_dev/bcgovr_test/README.rmd
#> Adding file c:/_dev/bcgovr_test/bcgovr_test.Rproj
#> Adding file c:/_dev/bcgovr_test/LICENSE
#> Adding Apache boilerplate header to the top of c:/_dev/bcgovr_test/01_load.R
#> Adding Apache boilerplate header to the top of c:/_dev/bcgovr_test/02_clean.R
#> Adding Apache boilerplate header to the top of c:/_dev/bcgovr_test/03_analysis.R
#> Adding Apache boilerplate header to the top of c:/_dev/bcgovr_test/04_output.R
#> Adding Apache boilerplate header to the top of c:/_dev/bcgovr_test/internal.R
#> Adding Apache boilerplate header to the top of c:/_dev/bcgovr_test/run_all.R

## Specify it by argument:
analysis_skeleton(path = "c:/_dev/bcgovr_test2", 
                  dir_struct = c("doc/", "data/", "bin/", "results/", 
                                 "src/01_load.R", "src/02_clean.R", 
                                 "src/03_analysis.R", "src/04_output.R", 
                                 "src/runall.R"))
#> Creating new analysis in c:/_dev/bcgovr_test2
#> Adding folders and files to c:/_dev/bcgovr_test2: doc/, data/, bin/, results/, src/01_load.R, src/02_clean.R, src/03_analysis.R, src/04_output.R, src/runall.R
#> Adding file c:/_dev/bcgovr_test2/CONTRIBUTING.md
#> Adding file c:/_dev/bcgovr_test2/CODE_OF_CONDUCT.md
#> Adding file c:/_dev/bcgovr_test2/README.md
#> Adding file c:/_dev/bcgovr_test2/README.rmd
#> Adding file c:/_dev/bcgovr_test2/bcgovr_test2.Rproj
#> Adding file c:/_dev/bcgovr_test2/LICENSE
#> Adding Apache boilerplate header to the top of c:/_dev/bcgovr_test2/src/01_load.R
#> Adding Apache boilerplate header to the top of c:/_dev/bcgovr_test2/src/02_clean.R
#> Adding Apache boilerplate header to the top of c:/_dev/bcgovr_test2/src/03_analysis.R
#> Adding Apache boilerplate header to the top of c:/_dev/bcgovr_test2/src/04_output.R
#> Adding Apache boilerplate header to the top of c:/_dev/bcgovr_test2/src/runall.R

## Set it via options (could put this in your .Rprofile)
options("bcgovr.dir.struct" = c("my_awesome_functions/", "do_it_all.R"))

analysis_skeleton(path = "c:/_dev/bcgovr_test3")
#> Creating new analysis in c:/_dev/bcgovr_test3
#> Adding folders and files to c:/_dev/bcgovr_test3: my_awesome_functions/, do_it_all.R
#> Adding file c:/_dev/bcgovr_test3/CONTRIBUTING.md
#> Adding file c:/_dev/bcgovr_test3/CODE_OF_CONDUCT.md
#> Adding file c:/_dev/bcgovr_test3/README.md
#> Adding file c:/_dev/bcgovr_test3/README.rmd
#> Adding file c:/_dev/bcgovr_test3/bcgovr_test3.Rproj
#> Adding file c:/_dev/bcgovr_test3/LICENSE
#> Adding Apache boilerplate header to the top of c:/_dev/bcgovr_test3/do_it_all.R
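
For anyone wondering how a character vector like the ones above might be turned into folders and files, here is a minimal sketch. It is not the implementation in the struct-options branch: `create_dir_struct()` is a hypothetical helper, and it assumes that an entry ending in "/" is a folder and anything else is a file.

```r
# Hypothetical helper, not part of bcgovr.
# Entries ending in "/" are treated as folders, everything else as files.
create_dir_struct <- function(path, dir_struct) {
  is_dir <- grepl("/$", dir_struct)
  dirs <- file.path(path, sub("/$", "", dir_struct[is_dir]))
  files <- file.path(path, dir_struct[!is_dir])

  # Parent folders of the files must exist before the files are created.
  all_dirs <- unique(c(dirs, dirname(files)))
  for (d in all_dirs) {
    dir.create(d, recursive = TRUE, showWarnings = FALSE)
  }

  # Only create files that are not already there, so nothing is overwritten.
  file.create(files[!file.exists(files)])
  invisible(c(dirs, files))
}

# Try it in a temporary directory.
demo_path <- file.path(tempdir(), "bcgovr_sketch")
create_dir_struct(demo_path, c("data/", "src/01_load.R", "run_all.R"))
list.files(demo_path, recursive = TRUE, include.dirs = TRUE)
```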