KentonWhite / ProjectTemplate

A template utility for R projects that provides a skeletal project.
http://projecttemplate.net
GNU General Public License v3.0

New feature: Custom templates #176

Closed connectedblue closed 7 years ago

connectedblue commented 7 years ago

This PR proposes a major new feature for ProjectTemplate: the ability for users to have their own custom templates applied on top of the standard structure during create.project(), in order to tailor the environment to their own needs.

If the first thing you do after create.project() is to add a .gitignore, twiddle with the library settings in config/global.dcf or add a knitr template with your favourite report layout, then this feature is for you. Perhaps your organisation or department has a standard set of analysis flows, reports or directory structures that you need to adhere to. They can be set up once, and everyone can use them consistently across projects.

First of all, you set up a directory on your local system containing just the things you want to add to a standard create.project() layout (maybe a README.md replacement, perhaps a new munge/01-A.R). Whatever you put in this folder will be merged into newly minted projects. You can choose to overwrite ProjectTemplate files, or add new ones alongside.

Then you simply do:

templates("add", "<template-name>", "local::/path/to/custom/template/dir")

Each time you now issue the command create.project("<project-name>", "<template-name>") a new ProjectTemplate project is created that also incorporates your template automatically.
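To make the merge step concrete, here is a minimal sketch of how a template directory could be overlaid onto a freshly created project. It is not the code in this PR: `apply_template()` and its arguments are hypothetical names, and it assumes a plain file-by-file copy where template files replace project files of the same relative path.

```r
# Hypothetical helper (not the PR's implementation): overlay every file
# found in template_dir onto project_dir, keeping relative paths.
apply_template <- function(template_dir, project_dir, overwrite = TRUE) {
  # Relative paths of all files in the template, including dotfiles
  # such as .gitignore
  files <- list.files(template_dir, recursive = TRUE,
                      all.files = TRUE, no.. = TRUE)
  for (f in files) {
    dest <- file.path(project_dir, f)
    # Create any missing sub-directories before copying
    dir.create(dirname(dest), recursive = TRUE, showWarnings = FALSE)
    # With overwrite = TRUE a template file replaces the standard one;
    # with overwrite = FALSE existing project files are left alone
    file.copy(file.path(template_dir, f), dest, overwrite = overwrite)
  }
  invisible(files)
}
```

With `overwrite = TRUE` a template README.md would replace the standard one, while a new file such as munge/01-A.R would simply be added alongside the existing files.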

You can have as many templates as you want. Just type templates() to see what's currently installed. If you have a particular template that you use a lot, you can set it to be the default:

templates("setdefault", <template-number-or-name>)

after which, every time you do create.project("new-proj") that template will be automatically applied, saving you some typing.

There is also the ability to use a template that someone has shared on github - maybe a scripted workflow for a particular problem domain. To do this, you first add it to your template collection:

templates("add", "tm", "github:connectedblue/templates:tm")

This is a template I have created to assist with text data mining and analysis. If you now do create.project("my_tm_project", "tm") then you will have a new project structure specifically tailored to get you started with text mining. If you do this yourself as an example, you'll find a README which explains how to use this particular template to help analyse a big corpus of text.
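For readers wondering how the two location strings used above ("local::/path/to/dir" and "github:user/repo:subdir") might be told apart, here is a rough sketch. `parse_template_location()` is a hypothetical helper written only for illustration, and the interpretation of the third github field as a sub-directory of the repository is an assumption based on the example.

```r
# Hypothetical parser for the two location formats shown in this PR:
#   "local::/path/to/dir"   and   "github:user/repo:subdir"
parse_template_location <- function(location) {
  if (startsWith(location, "local::")) {
    # Everything after the "local::" prefix is treated as a filesystem path
    return(list(type = "local", path = sub("^local::", "", location)))
  }
  if (startsWith(location, "github:")) {
    parts <- strsplit(sub("^github:", "", location), ":", fixed = TRUE)[[1]]
    # parts[1] is "user/repo"; parts[2], if present, is assumed to be a
    # sub-directory inside that repository
    return(list(type   = "github",
                repo   = parts[1],
                subdir = if (length(parts) > 1) parts[2] else ""))
  }
  stop("Unrecognised template location: ", location)
}

loc <- parse_template_location("github:connectedblue/templates:tm")
loc$repo    # "connectedblue/templates"
loc$subdir  # "tm"
```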

By including the github option, I hope to enable the ProjectTemplate community to easily share analysis workflows with each other. It also allows the functionality of ProjectTemplate to be extended and customised for lots of different situations without needing to have the core code updated.

If you would like to test drive this feature while this PR is under review, you can install it with:

library(devtools)
install_github("connectedblue/ProjectTemplate2@custom")

# restart R, then:
library(ProjectTemplate)

There's a lot more besides what I've described here. Type ?templates to get the full feature set, including special handling for global.dcf templates.

I look forward to seeing comments and feedback.

KentonWhite commented 7 years ago

This is going to take me a bit of time to review. My main computer's video card failed this weekend and is now in the shop for a few days. When I'm back up and running, I'm looking forward to digging into this!

connectedblue commented 7 years ago

Thanks @KentonWhite. I'm conscious this is a big feature and will take a while. Look forward to your comments.

KentonWhite commented 7 years ago

I love the idea of this feature. The problem is that it may violate the terms for hosting on CRAN:

We've gotten some leeway on the create.project() function but I'm not sure if the CRAN maintainers will look favourably on adding more features that write to disk.

Can this be modified so that the ability to create and store a template is separate from the package?

In the meantime I'll reach out to the CRAN maintainers and see how they might feel about the templates function.

connectedblue commented 7 years ago

Hi @KentonWhite

We've got some options. The reason for writing into R/etc was to preserve the configuration for as long as possible. It would be good to get the CRAN maintainers' view.

Is it allowed to store the config inside the ProjectTemplate package itself? That was my first choice, until I found that it quickly gets trashed by a new install - I install new versions several times a week, but realise I'm an edge case :). If we can put it there, we may be able to use a backup and restore function to somehow preserve it between installs - not ideal, but do-able.

Another option is to use an environment variable. The location of the root template file is then specified by the user. This has the advantage that the file is preserved between R upgrades, and it could also be backed up by the user if required. The downside is more complication for the user, who has to set up .Rprofile files etc.

I assume they don't have a problem with a user function writing to a file if the file is passed explicitly as a parameter to the function, rather than being written magically behind the scenes? Functions in the utils package like write.table() allow files to be written to disk.

A belt and braces approach would be to include an additional parameter root.template in the create.project() and templates() calls to allow the user to specify the location if they want. The default value would be file.path(Sys.getenv("R_USER"), ".ProjectTemplateRootTemplate.dcf"). That would work out of the box for most users, but gives them the option to change it if they want.
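As a rough illustration of how those options could combine, the sketch below resolves the root template file in three steps: explicit argument, then environment variable, then a default. Everything here is an assumption for discussion: `root_template_path()` is a hypothetical helper, `PROJECTTEMPLATE_ROOT` is an invented environment variable name, and the fallback uses `path.expand("~")` rather than `R_USER` simply because it resolves on every platform.

```r
# Hypothetical resolver for the location of the root template file.
root_template_path <- function(root.template = NULL) {
  # 1. An explicit root.template argument always wins
  if (!is.null(root.template)) return(root.template)
  # 2. Otherwise honour the (invented) environment variable, if it is set
  env <- Sys.getenv("PROJECTTEMPLATE_ROOT", unset = "")
  if (nzchar(env)) return(env)
  # 3. Fall back to a dotfile in the user's home directory
  file.path(path.expand("~"), ".ProjectTemplateRootTemplate.dcf")
}
```

Because the path is always either passed in or derived from something the user controls, nothing is written to a location the user has not had a chance to choose.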

What do you think?

connectedblue commented 7 years ago

@KentonWhite having thought some more, we could create a separate github hosted package containing the files customtemplate.R and template.R. We could then load it on the fly if the user accesses the template.name variable in create.project() (much in the same vein as .require.package does in the readers, although the destination will be hard-coded in this case).

It's not ideal because we'd have to do version checking to keep things in step.
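A minimal sketch of what that on-the-fly loading with a version check could look like is below. The names are placeholders: `use_template_plugin()` is a hypothetical helper, "ProjectTemplatePlugins" is an invented package name, and the minimum version is arbitrary. Only base R and utils functions are used.

```r
# Hypothetical guard: only touch the plugin package when a template
# is actually requested, and check it is recent enough.
use_template_plugin <- function(template.name = NULL,
                                plugin = "ProjectTemplatePlugins",
                                min.version = "0.0.1") {
  # No template requested: core behaviour, the plugin is not needed
  if (is.null(template.name)) return(FALSE)
  if (!requireNamespace(plugin, quietly = TRUE)) {
    stop("Templates need the '", plugin, "' package, which is not installed.")
  }
  # Keep core and plugin in step by requiring a minimum plugin version
  if (utils::packageVersion(plugin) < min.version) {
    stop("Package '", plugin, "' must be at least version ", min.version, ".")
  }
  TRUE
}
```

Calling `use_template_plugin()` with no template name returns FALSE without loading anything, so users who never use templates would not need the plugin installed.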

connectedblue commented 7 years ago

There's a related point on the location of the root template file which is worth debating now while we look at these other options.

One thing I thought was good about R/etc is that it is a site-wide configuration, so all users on a system get the same templates. This does, though, restrict installing templates to users who have admin rights.

If it is stored in the user's home directory then each user can create their own library of templates without reference to an admin. Of course, the templates themselves could still be shared on a local filesystem or github.

Regardless of how the CRAN issue gets resolved, I think I'm veering towards the home directory for the location so it persists beyond package/R updates and can also be backed up.

connectedblue commented 7 years ago

I'm going to close this PR for now.

I think the plugin package is going to work best to keep core ProjectTemplate clean and compliant.

I'll split it into two and re-raise a PR with a proposed generic hook architecture.