KentonWhite / ProjectTemplate

A template utility for R projects that provides a skeletal project.
http://projecttemplate.net
GNU General Public License v3.0
623 stars 159 forks source link

Review creating new directory structure at project creation and migration #179

Closed Hugovdberg closed 7 years ago

Hugovdberg commented 7 years ago

Issue #177 proposes a new name for the doc/ directory to docs/ to enable easy integration with GitHub Pages. A change was suggested to add renaming the folder to migrate.project() as well as the purpose of that function is to get the project up to the same level as after a call to create.project(). However, at this moment new directories added to the skeleton structure aren't created at all during migration. This is partially because it is possible to call create.project with the minimal = TRUE option, which does not create all directories from the skeleton. This option should be honoured during project migration, but at the moment it is hard to reliably detect what setting was used. It would be best I think to store the setting during project creation, and be able to overrule it in migrate.project so an existing minimal project can be upgraded to a full project.

Also, merge.strategy = allow.non.conflict could perhaps be made more lenient because even if only an empty directory exists with a conflicting name (which would not mean any data loss if overwritten) the create.project() functions raises an error. I think it would be nice if it would only fail on overwriting files.

Combining all these remarks I think we should refactor create.project and migrate.project to call the same functions in the background, exposing a similar signature. Consider the following pseudo code:

create.project <- function(project.name = '.', minimal = FALSE, dump = FALSE,
                           merge.strategy = c("require.empty", "allow.non.conflict")) {
  if (!dir.exists(project.name)) {
    dir.create(project.name)
    setwd(project.name)
    .update.project.structure(minimal, dump, is.update = FALSE)
  } else {
    setwd(project.name)
    if (allow.non.conflict) {
      compare.existing.files
    }
    .update.project.structure(minimal, dump, is.update = FALSE)
  }
}

migrate.project <- function(minimal = .is.minimal.project()) {
  .update.project.structure(minimal)
}

.update.project.structure <- function(minimal, dump, is.update = TRUE) {
  if (minimal) {
    exclude dirs from skeleton
  }  
  for (dir in skeleton) {
    if (not exists dir) {
      create dir
      if (is.update) {
        message dir created
      }
    }
  }
  for (file in skeleton) {
    if (not exists file) {
      copy file
      if (is.update) {
        message file created
      }
    }
  }
  if (dump) {
    dump files
  }
  for (setting in skeleton) {
    if (not exists setting) {
      add setting
      if (is.update) {
        message setting created
      }
    }
  }
  if (is.update) {
    for (setting in settings) {
      if (setting not in skeleton) {
        delete setting
        message setting deleted
      }
    }
  }
  update setting minimal
}
connectedblue commented 7 years ago

I'm not sure that the comments raised in #177 are sufficient reason to re-factor as you suggest. Is there a wider issue that would also be addressed? So far migrate.project only tinkers with the configuration settings to make sure they are up to date - the docs debate was about whether that specific folder needed special migration treatment.

Perhaps there needs to be a greater distinction between migrating a project (ie making old projects compatible with the latest ProjectTemplate) and upgrading a project to give the skeleton more features. Is that the issue that needs to be considered?

I must admit though, I do find merge.strategy a confusing parameter - my understanding of it fades in and out. I never use it in practice. Also I'm unsure of the purpose of dump.

Perhaps there is a need for a convert.project to go from minimal to full?

However, I would also throw into the mix #176 (closed temporarily in order to re-think how it can be done that doesn't violate CRAN requirements). The custom template feature would allow for many many variants of a ProjectTemplate structure that we couldn't hope to take account of. In fact, in a custom template world, the distinction between minimal and full diminishes further.

Hugovdberg commented 7 years ago

@KentonWhite mentioned the migrate.project function should bring a project up to date with the latest ProjectTemplate structure, as if it were based on a recent call to create.project. So that part of the discussion boils down to the question of the directory structure needing migration as well as the settings, which I think it should.

It is good you mention the custom templates here, I think it would be a very good reason for this refactoring. You could incorporate the templates in this structure as well, and select the skeleton to migrate to as your template. Minimal and full would then just be two templates installed by default, while other templates could be pointed to (isn't this as simple as creating a template structure somewhere on the (shared) filesystem and point to the root of that skeleton? If people want a shortcut to it they can add variable to their profile). This means migrate.project could also be used to transition between templates as well.

Personally I do use merge.strategy = "allow.non.conflict" a lot. Often I create a new project in a new directory from RStudio, with git enabled, and then in the new directory I call create.project('.', merge.strategy = "allow.non.conflict"). Perhaps it should be changed to a new argument merge = FALSE (don't allow directory to exist), and when set to true just create the new folders.

I included the dump variable as it is in the current version, but it says in the source that some magic happens which is seriously broken. The real purpose is unclear, and I'm not sure if people would miss the feature.

Below an updated proposal for the new structure.

create.project <- function(project.name = '.', template = 'full',
                           merge = FALSE) {
  if (!dir.exists(project.name)) {
    dir.create(project.name)
    setwd(project.name)
    .update.project.structure(template, is.update = FALSE)
  } else {
    setwd(project.name)
    if (merge) {
      compare.existing.files(template)
    }
    .update.project.structure(template, is.update = FALSE)
  }
}

migrate.project <- function(template = .current.template(), overwrite.settings = FALSE) {
  return(invisible(.update.project.structure(template, overwrite.settings)))
}

.update.project.structure <- function(template, overwrite.settings = FALSE, is.update = TRUE) {
  skeleton <- .get.template(template)
  for (dir in skeleton) {
    if (not exists dir) {
      create dir
      if (is.update) {
        message dir created
      }
    }
  }
  for (file in skeleton) {
    if (not exists file) {
      copy file
      if (is.update) {
        m  essage file created
      }
    }
  }
  for (setting in skeleton) {
    if (not exists setting) {
      add setting
      if (is.update) {
        message setting created
      }
    }
  }
  if (is.update) {
    for (setting in settings) {
      if (setting not in skeleton) {
        delete setting
        message setting deleted
      } else if (overwrite.settings) {
        update setting
        message setting updated
      }
    }
  }
  update setting template
  return updated files
}

.get.template <- function(template = 'full') {
  if (template in (full, minimal) {
    skeleton <- system.file(template, package = 'ProjectTemplate')
  } else {
    if (.is.skeleton(template)) {
      skeleton <- template
    } else {
      error template is not valid skeleton
    }
  }
  return skeleton
}
Hugovdberg commented 7 years ago

I think it would be very well possible to include a hook to your custom templates package you mention to create in #176 from within the .get.template function I propose above. That way your not-CRAN-compatible functions can help people setup and manage their custom templates if it is available on the system, while keeping almost the same functionality available without treading the CRAN rules more than currently done. Also, by adding the updates to the directory structure it becomes easier to keep custom templates up to date with the base project template. By adding the template to the settings it also nicely available during upgrades. Which solves the issue mentioned before that migrate.project should obey the previously used template.

connectedblue commented 7 years ago

Yes, I'm looking for a way to allow users to manage templates in the way you suggest.

I need to think about how this would practically work.

I'll post a new PR very soon with a proposal for comment.

Hugovdberg commented 7 years ago

Perhaps change my proposed .get.template to something like:

.get.template <- function(template = 'full') {
  if (template in (full, minimal) {
    skeleton <- system.file(template, package = 'ProjectTemplate')
  } elseif (!is.null(getOption("ProjectTemplate.templates.dir")) &&
            .is.skeleton(file.path(getOption("ProjectTemplate.templates.dir"), template)) {
    skeleton <- file.path(getOption("ProjectTemplate.templates.dir"), template)
  } elseif (.is.skeleton(template)) {
    skeleton <- template
  } else {
    error template is not valid skeleton
  }
  }
  return skeleton
}

.is.skeleton <- function (template) {
  check template is a directory containing at least minimal structure
}

That wouldn't need a new package for the template management, simply a global option that points to a directory containing the templates. To me it seems this refactored structure does indeed provide the custom template functionality without further violating CRAN requirements. Also create.template(template) would be more or less the same as create.project(file.path(getOption("ProjectTemplate.templates.dir"), template), template = 'minimal').

KentonWhite commented 7 years ago

@Hugovdberg @connectedblue Can we close this issue? Sounds like this refactoring will be addressed with the custom templates!

Hugovdberg commented 7 years ago

Let's wait for the new PR indeed.