broadinstitute / pooled-cell-painting-profiling-recipe

:woman_cook: Recipe repository for image-based profiling of Pooled Cell Painting experiments
BSD 3-Clause "New" or "Revised" License

Project scaffold vs workflow module #3

Closed shntnu closed 4 years ago

shntnu commented 4 years ago

We need two distinct functionalities:

  1. A workflow submodule, which contains all the logic, i.e., code + configurations + instructions for reproducing a workflow
  2. A project scaffold, which defines other aspects, excluding logic, that will help standardize projects. This could include license files, README, directory structures, etc.

Because a repository created from a GitHub template cannot, at present, pull later changes from that template, GitHub templates are fine for (2) but not for (1).

So instead we propose to create a regular repo to define the workflow submodule, which will then be a submodule of future project repositories.
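As a rough sketch of what the submodule proposal could look like in practice, the snippet below builds the git commands a project repository might run to add the workflow repo as a submodule pinned to a specific commit, and later to move that pin. The URL, the `workflow` path, the commit identifiers, and the function names are all hypothetical placeholders, not something this thread settled on. The commands are only built and printed, not executed.

```python
import shlex
from typing import List


def submodule_add_commands(workflow_url: str, path: str, commit: str) -> List[List[str]]:
    """Commands to add the workflow repo as a submodule pinned to `commit`."""
    return [
        # Register the workflow repo as a submodule at `path`.
        ["git", "submodule", "add", workflow_url, path],
        # Check out the exact commit the project should use.
        ["git", "-C", path, "checkout", commit],
        # Record the submodule config and the pinned commit in the project repo.
        ["git", "add", ".gitmodules", path],
        ["git", "commit", "-m", f"Add workflow submodule at {commit}"],
    ]


def submodule_update_commands(path: str, new_commit: str) -> List[List[str]]:
    """Commands to move an existing workflow submodule to `new_commit`."""
    return [
        ["git", "-C", path, "fetch", "origin"],
        ["git", "-C", path, "checkout", new_commit],
        # Staging the submodule path records the new pinned commit.
        ["git", "add", path],
        ["git", "commit", "-m", f"Update workflow submodule to {new_commit}"],
    ]


if __name__ == "__main__":
    # Hypothetical values, for illustration only.
    for cmd in submodule_add_commands("https://example.org/workflow.git", "workflow", "abc1234"):
        print(shlex.join(cmd))
```

Each project repository then records exactly one workflow commit, and changing workflow versions shows up as an ordinary commit in the project history.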

Project scaffolds, however, can be created using GitHub templates, which are very useful for this purpose because we don't need (or want) the ability to pull from the template.

@gwaygenomics Does this sound right?

shntnu commented 4 years ago

A project scaffold might actually be better defined via a package like usethis.

gwaybio commented 4 years ago

Yes, this is exactly as I envision it. The scaffold template will store instructions on how to initialize and update the workflow submodule.

I've never used usethis before so I don't know how the scaffold would work using it. Is it easy to update versions of the workflow module and is the commit hash explicit in usethis? This is the advantage of git submodules, but maybe usethis includes these advantages and more.
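To illustrate the "explicit commit hash" point on the submodule side: `git submodule status` prints one line per submodule, consisting of a one-character state prefix, the commit SHA-1, the path, and optionally a description in parentheses. The parser below is a minimal sketch of reading such a line. The sample hash and the `workflow` path are made up, and it assumes the submodule path contains no spaces.

```python
from typing import NamedTuple


class SubmoduleStatus(NamedTuple):
    state: str
    commit: str
    path: str


# Prefix characters used by `git submodule status`.
_STATES = {
    " ": "in sync",
    "-": "not initialized",
    "+": "checked-out commit differs from the pinned commit",
    "U": "merge conflict",
}


def parse_status_line(line: str) -> SubmoduleStatus:
    """Parse one line of `git submodule status` output."""
    prefix, rest = line[0], line[1:]
    # The remainder is "<sha1> <path> [(<describe>)]"; keep the first two fields.
    commit, path = rest.split()[:2]
    return SubmoduleStatus(_STATES.get(prefix, "unknown"), commit, path)


if __name__ == "__main__":
    # Hypothetical output line, for illustration only.
    sample = " 0123456789abcdef0123456789abcdef01234567 workflow (heads/main)"
    print(parse_status_line(sample))
```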

shntnu commented 4 years ago

> I've never used usethis before so I don't know how the scaffold would work using it. Is it easy to update versions of the workflow module and is the commit hash explicit in usethis? This is the advantage of git submodules, but maybe usethis includes these advantages and more.

A clarification to make sure we are on the same page:

We'd need to create a package like usethis; the current package serves a very specific need (creating R packages, although the scope has expanded now).

So what I was proposing is that instead of codifying the project structure via a template, we could instead do it via a package; this would increase flexibility (you can pick and choose what aspects of the template you want to use) and would not be GitHub / git dependent. However it is certainly a lot more work! Now that I write this out, I think we should stick with GitHub templates for implementing project scaffolds. We'd rather spend that energy on implementing workflow submodules efficiently. We certainly don't want to end up reinventing a complete workflow system :)

gwaybio commented 4 years ago

> So what I was proposing is that instead of codifying the project structure via a template, we could instead do it via a package; this would increase flexibility (you can pick and choose what aspects of the template you want to use) and would not be GitHub / git dependent. However it is certainly a lot more work! Now that I write this out, I think we should stick with GitHub templates for implementing project scaffolds. We'd rather spend that energy on implementing workflow submodules efficiently. We certainly don't want to end up reinventing a complete workflow system :)

Thanks for these thoughts @shntnu - I agree and am on board! The package idea is intriguing and could be something we keep in the back of our thoughts.

gwaybio commented 4 years ago

Closing in favor of the submodule approach in https://github.com/broadinstitute/pooled-cell-painting-profiling-template

Indeed a package would be super (and much more easily distributed) so it'll be great to refer back to this issue once we revisit.