PittGenomics / Swag

Scalable Workflows for Analyzing Genomes
Apache License 2.0

Make it easier to reproduce old workflows with the same package versions #1

Open annawoodard opened 6 years ago

annawoodard commented 6 years ago

We currently have a nice interface for automatically installing dependencies according to requirements (packages + versions) specified in the Swag code. We should solve this common issue:

1) You run a workflow with a certain set of packages and versions (say, according to the defaults in Swag).
2) A significant amount of time passes.
3) You want to exactly reproduce the workflow you ran in step 1) with a fresh install of Swag, but the default package versions in a fresh Swag install have since changed.

In other words, I think we should allow the user to decouple the bio package versions from the Swag version. The simplest possibility I can think of: when you run swag install-env, you get both the current executables.config (with paths to executables) and a requirements.txt or equivalent file containing the package names + versions. You can save that file and pass it to a later call of swag install-env to reproduce exactly the same package versions.
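To make the proposal concrete, here is a minimal sketch of the save-and-restore round trip for pinned versions. Everything in it is an assumption for illustration: the `name==version` line format, the helper names `write_requirements` and `read_requirements`, and the example package versions are hypothetical, and none of this is existing Swag code or the actual behavior of swag install-env.

```python
import tempfile
from pathlib import Path


def write_requirements(packages, path):
    """Write pinned versions, one `name==version` per line, sorted by name.

    Hypothetical helper: the file format is an assumption, not Swag's.
    """
    lines = [f"{name}=={version}" for name, version in sorted(packages.items())]
    Path(path).write_text("\n".join(lines) + "\n")


def read_requirements(path):
    """Parse a file written by write_requirements into a {name: version} dict."""
    packages = {}
    for raw in Path(path).read_text().splitlines():
        line = raw.split("#", 1)[0].strip()  # drop comments and whitespace
        if not line:
            continue
        name, sep, version = line.partition("==")
        if not sep or not name.strip() or not version.strip():
            raise ValueError(f"expected 'name==version', got: {raw!r}")
        packages[name.strip()] = version.strip()
    return packages


# Example versions are placeholders, not Swag's defaults.
installed = {"samtools": "1.9", "bwa": "0.7.17"}
with tempfile.TemporaryDirectory() as tmp:
    req_path = Path(tmp) / "requirements.txt"
    write_requirements(installed, req_path)   # what the first install would save
    restored = read_requirements(req_path)    # what a later install would load
```

Sorting the entries keeps the file stable across runs, so it diffs cleanly if it is checked in next to the workflow.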

annawoodard commented 6 years ago

Maybe my motivation is not well-formulated, because the only way to guarantee you reproduce the old workflow is to run with the same version of Swag as well. But if we give users the freedom to run with arbitrary package versions (by editing executables.config to point to whatever software they wish), it seems useful to give them a portable way to reproduce that setup.