Al-Murphy / MungeSumstats

Rapid standardisation and quality control of GWAS or QTL summary statistics
https://doi.org/doi:10.18129/B9.bioc.MungeSumstats

Tag releases for improved reproducibility when installing MSS in Docker containers #141

Closed jksull closed 1 year ago

jksull commented 1 year ago

Hi MSS Devs! Thanks once more for the valuable tool that you are developing; it is making life a lot easier when working with GWAS summary statistics in many ways!

I am currently incorporating MSS into a Docker container. Rather than waiting for stable Bioc releases (at the moment 1.6.0, which is well behind the most recent development version, 1.7.16), the most effective way to do so is to install from the GitHub repo using the standard method:

install_github(
  repo,
  ref = "HEAD",  # NOTE: can be a commit, tag or branch
  ...
)

I would like to kindly request that MSS releases on GitHub be tagged with semantic versioning. This would be in line with current best practices as described in GitHub's documentation, and would help to ensure reproducibility. Currently we pin the commit hash, which does work, but it is vulnerable to rebasing and rewriting of history (although I very much doubt you have any intention of doing that!). Since each MSS version increment seems to correspond to a new commit, I hope that tagging releases going forward wouldn't be too much overhead.
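To make the request concrete, here is a minimal sketch of how a tag pins a commit. It runs in a throwaway repository so it touches nothing real. The tag name `v1.7.16`, the commit message and the git identity are illustrative assumptions; the MSS repo may prefer a different naming scheme.

```shell
set -eu

# Work in a temporary repository (stand-in for the real MSS repo).
repo_dir="$(mktemp -d)"
cd "$repo_dir"
git init -q
git config user.email "dev@example.com"   # placeholder identity
git config user.name "Example Dev"

# A commit that bumps the package version.
echo "Version: 1.7.16" > DESCRIPTION
git add DESCRIPTION
git commit -q -m "Bump version to 1.7.16"
commit="$(git rev-parse HEAD)"

# An annotated tag gives that commit a fixed, human-readable name.
git tag -a v1.7.16 -m "MungeSumstats 1.7.16"

# The tag resolves back to the commit it was created on.
resolved="$(git rev-parse 'v1.7.16^{commit}')"
echo "v1.7.16 -> $resolved"

# On the real repository the tag would then be published with:
#   git push origin v1.7.16
# and (assuming such a tag exists) installed from R with:
#   remotes::install_github("Al-Murphy/MungeSumstats", ref = "v1.7.16")
```

In a Dockerfile, the `ref` would then be a readable version rather than a bare hash, and the maintainers would only need one extra `git tag` and `git push` per version bump.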

I look forward to hearing your thoughts on this!

Thanks and best regards :)

Al-Murphy commented 1 year ago

Hey! This is not something I've done before, can you explain how it works?

We do provide a Docker image for MSS which you could use; if you use the latest tag there, you will get the most up-to-date dev version. This has the added bonus that the image is not pushed if the package is failing GitHub Actions checks (which cover multiple OSes). Note that the image is a little behind at the moment, which sometimes happens when GHA fails because newly added dependencies require package versions that haven't propagated yet. For example, it is happening now because MSS relies on the latest version of rtracklayer (>= 1.59.1); this should sort itself out in time.

Overall, I think it's a bad idea to just pull the latest from GitHub, as the code may not have been tested on multiple platforms and so could break. This also goes against Bioconductor's view of package development and usage. Another option is to install Bioconductor devel (they provide a Docker image for that too) and then install MSS in there. This gives you the latest built MSS dev version, corresponding to the master branch on GitHub, with the added protection of having passed Bioconductor's checks before being built. This is the best way in my opinion.
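As a rough sketch of that last option, the script below writes a two-line Dockerfile based on the Bioconductor devel image and prints it. The temporary build directory and the `mss-devel` image name are illustrative assumptions, and the build itself is left commented out because it needs Docker and network access.

```shell
set -eu

# Scratch directory to hold the Dockerfile (location is arbitrary).
build_dir="$(mktemp -d)"

cat > "$build_dir/Dockerfile" <<'EOF'
# Bioconductor's devel image, which ships with BiocManager set up for devel.
FROM bioconductor/bioconductor_docker:devel
# Install the MungeSumstats build currently available from Bioconductor devel.
RUN Rscript -e 'BiocManager::install("MungeSumstats", ask = FALSE, update = FALSE)'
EOF

cat "$build_dir/Dockerfile"

# To build the image (hypothetical image name):
#   docker build -t mss-devel "$build_dir"
```

Note that `devel` is a moving tag, so this gives you a checked build, not a pinned one. Two builds made on different days may contain different MSS versions.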

Al-Murphy commented 1 year ago

Closing because of inactivity, reopen if you want to discuss further.