Open U8NWXD opened 3 years ago
Well said, @U8NWXD!
The primary goal is to allow people to run the code to reproduce the published results.
The secondary goal is to allow them to read, understand, and tinker with the code.
Much has been written about program reproducibility, and it's far from a solved problem, especially with floating-point math. Frankly, Python and its libraries aren't built with this in mind. Do what you can to increase code reproducibility.
- [...] wcEcoli repo and committed those (not even a squash merge).
- [...] requirements.txt updates from GitHub's @dependabot to get library security patches.
- [...] jinja2==2.11.3) since that release is incompatible with Python 2.
- [...] PyYAML==5.4. We need to figure out if that library update won't disturb this code snapshot or just reject it.
- [...] v1.0 titled Science-2020-07-24, with this description:
  This release snapshot Science-2020-07-24 goes with the paper "Simultaneous cross-evaluation of heterogeneous E. coli datasets via mechanistic simulation" published in Science, 24 July 2020. See docs/README.md for info on setting up the Python 2.7 runtime environment to run this release. (The next release will contain lots of work done since this snapshot forked off, and it runs on Python 3.8.)
- I used the GitHub web UI to make the tag and the release. It takes care of PGP signing and saves the title somewhere outside the git tag object.
This seems like a good pattern for a new release for a new published paper:
- [...] v + semver major version number (next: v2.0),
- [...] v1.0) and the release name (Science-2020-07-24), and not by a git commit number.
[Any changes or additions to this procedure?]
- [...] master) in a single repo (wcEcoli) so a linear release history will work.
- [...] operon branch without first merging that into master, then we'll need to create either a branch in the WholeCellEcoliRelease repo or a separate release repo.
- [...] wcEcoli repo, so there's probably no reason to put alpha releases in the release repo. "alpha" and "beta" are QA terms. A lot of the industry is confused about this.
Changelogs are very useful, but when making a new snapshot associated with a new published article, maybe we can settle for a high-level summary.
Since I'm figuring out how to do releases of the whole-cell model, I took a pass at writing up a plan for future releases. Here's my proposal:
Releasing New Versions of the Model
We release new versions of the model to the WholeCellEcoliRelease repository whenever someone in the lab publishes a paper that requires unreleased model code.
Tracking Versions
Version Numbers
We use semantic versioning for our version numbers, except we drop the patch number. In broad strokes, this means that our version numbers take the form major.minor, for example 1.0. We can also specify pre-releases like 1.0-beta.1. For any versions that include breaking changes (i.e. if someone else wrote code that uses our public methods, that code might stop working), we increment the major version. For all other (i.e. backwards-compatible) changes, we increment the minor version. New major releases will generally go along with papers, while new minor releases will usually contain minor bug fixes.
Commits and Tags
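To make the Version Numbers rules above concrete, here is a minimal Python sketch of parsing and bumping major.minor versions, including the optional alpha/beta suffix. The names `Version`, `parse_version`, and `bump` are hypothetical and not part of the model code. Resetting the minor number to 0 on a major bump follows the semantic versioning convention and is an assumption here, since the proposal does not state it.

```python
import re
from typing import NamedTuple, Optional

# Matches "1.0", "v1.0", and pre-releases such as "v3.0-beta.1".
_VERSION_RE = re.compile(r"^v?(\d+)\.(\d+)(?:-(alpha|beta)\.(\d+))?$")


class Version(NamedTuple):
    major: int
    minor: int
    stage: Optional[str] = None  # "alpha", "beta", or None for a full release
    stage_num: int = 0

    def sort_key(self):
        # Pre-releases sort before the full release with the same major.minor.
        stage_rank = {"alpha": 0, "beta": 1, None: 2}[self.stage]
        return (self.major, self.minor, stage_rank, self.stage_num)

    def __str__(self):
        base = f"{self.major}.{self.minor}"
        if self.stage is None:
            return base
        return f"{base}-{self.stage}.{self.stage_num}"


def parse_version(text: str) -> Version:
    """Parse a version string or tag name into a Version."""
    match = _VERSION_RE.match(text)
    if match is None:
        raise ValueError(f"not a major.minor version: {text!r}")
    major, minor, stage, num = match.groups()
    return Version(int(major), int(minor), stage, int(num) if num else 0)


def bump(version: Version, breaking: bool) -> Version:
    """Return the next full release after `version`."""
    if breaking:
        # Breaking change: increment major, reset minor (semver convention).
        return Version(version.major + 1, 0)
    # Backwards-compatible change: increment minor.
    return Version(version.major, version.minor + 1)
```

Sorting tag names with `key=lambda t: parse_version(t).sort_key()` places beta pre-releases ahead of the release they lead up to.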
When we release the model, we usually squash all our commits into a single release commit. This keeps the commit messages in wcEcoli private. However, this does not mean we have one commit per release. For example, we might add commits to fix bugs or update documentation without doing a new release. You also might want to split the release for your paper across multiple commits. For example, if some of your data were generated using an earlier version of the model, you might want to include 2 commits: one that includes changes up to that earlier version and one for the rest of the changes. That way, you can refer to your versions by commit hashes in your paper.
Instead of tracking versions with commits, we track them with tags. These tags are named with the version number, e.g. v1.0, and associated with releases on GitHub. It's good to include in the tag message a description of what the release is for. Then, you can specify the tag in your paper. Tracking versions with tags has a number of benefits:
Pre-Releases
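As an illustration of the tagging convention described under Commits and Tags, the sketch below builds the git commands for an annotated tag named after the version and pushes that tag. Earlier in this thread the v1.0 tag was made through the GitHub web UI, so this command-line route is only one possible alternative. The function names and the default remote name `origin` are assumptions for the example.

```python
import subprocess
from typing import List


def release_tag_commands(version: str, message: str,
                         remote: str = "origin") -> List[List[str]]:
    """Build git commands that create and publish an annotated release tag."""
    tag = f"v{version}"
    return [
        # -a creates an annotated (unsigned) tag; -m stores the description
        # of what the release is for. Use `git tag -s` instead to sign it.
        ["git", "tag", "-a", tag, "-m", message],
        # Push only this tag to the remote.
        ["git", "push", remote, tag],
    ]


def run_all(commands: List[List[str]]) -> None:
    """Run each command in order; check=True stops at the first failure."""
    for command in commands:
        subprocess.run(command, check=True)
```

Calling `run_all(release_tag_commands("1.0", "Release for the Science 2020 paper"))` from inside a clone would create and push the tag. Building the command lists separately makes it easy to print and review them first.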
When submitting a paper for review, you might want to make code available to reviewers without making a new release. For example, you might want to address reviewer comments before you make a new release. To handle this, we create pre-releases. These are versions just like those described above, except they have alpha or beta added to the end to signal that they are not yet complete. For example, let's say you're making a big new release that will be v3.0. You could create v3.0-beta.1 and make that available to reviewers. Then, you could address their comments in v3.0-beta.2. Once the paper's accepted and you've made any last changes, you can release v3.0. When you create the releases for v3.0-beta.1 and v3.0-beta.2 on GitHub, you can specify them as pre-releases so that GitHub marks them as such. This will tell users you aren't ready for them to use it yet.
If you want to avoid putting your code into the WholeCellEcoliRelease repository until after review, you can create a new temporary repository just for reviewers. One easy way to do this is to clone the WholeCellEcoliRelease repository and add the temporary repository as another remote. Then you can set up your tags and push to the temporary repository. Once the paper is accepted, you can push to the WholeCellEcoliRelease repository to make your releases public.
Other Considerations
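The temporary reviewer repository workflow described under Pre-Releases could be sketched as follows. This only builds the git command lines. The function names, the working directory `release-clone`, and the remote name `reviewers` are hypothetical, and the repository URLs are left as parameters because the proposal does not give them.

```python
from typing import List


def setup_commands(release_url: str, reviewer_url: str,
                   workdir: str = "release-clone",
                   remote: str = "reviewers") -> List[List[str]]:
    """Clone the release repo and add the temporary repo as a second remote."""
    return [
        ["git", "clone", release_url, workdir],
        ["git", "-C", workdir, "remote", "add", remote, reviewer_url],
    ]


def publish_commands(tag: str, remote: str,
                     workdir: str = "release-clone") -> List[List[str]]:
    """Push the current branch and one tag to the given remote.

    Assumes the release commit and the tag already exist in workdir.
    """
    return [["git", "-C", workdir, "push", remote, "HEAD", tag]]
```

During review you would publish a tag such as v3.0-beta.1 to the `reviewers` remote. Once the paper is accepted, the same `publish_commands` call with `remote="origin"` and the final tag pushes to the release repository that was cloned.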