luispedro closed 6 years ago
This is very promising, thanks! To get unbiased results, it would be important to consider only packages whose current version is not the first one in the repo. We want to measure how long it takes for a new upstream version to be picked up once the package is already in the repo. Otherwise, the delay between a package's initial inclusion and the software's last release introduces an arbitrary bias.
You are right that we need to be careful with older packages. I had set an arbitrary cut-off that excluded packages from before 2016, but considering only packages with >1 bioconda version is a better system. I will reimplement it like that.
(I will also fix the other issues).
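For reference, the filtering rule agreed on here could be sketched roughly as below. This is only an illustration and not the script in this PR: the `versions_by_package` mapping and the function name are assumptions made for the example.

```python
def packages_with_multiple_versions(versions_by_package):
    """Return the names of packages that have more than one distinct version.

    `versions_by_package` is a hypothetical mapping from a package name to
    the list of versions of that package available in the channel.
    """
    return sorted(
        name
        for name, versions in versions_by_package.items()
        if len(set(versions)) > 1  # count distinct versions, not builds
    )


# Made-up example data, for illustration only.
example = {
    "toolA": ["1.0"],         # first and only version: excluded
    "toolB": ["0.9", "1.0"],  # updated at least once: included
    "toolC": ["2.1", "2.1"],  # two builds of the same version: excluded
}
selected = packages_with_multiple_versions(example)
```

Counting distinct versions rather than list entries is a choice made for this sketch, so that rebuilds of the same version do not make a package look updated.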
Updated results (now only considering packages with >1 version on bioconda):
Mean (+/- std.dev.) number of days between upstream release and bioconda package: 83.1293103448 +/- 99.1218653899
Median number of days between upstream release and bioconda package: 37.0
Based on 696 packages.
On those packages where it could be heuristically determined, 160 of 217 are current with their upstream release.
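A minimal sketch of how summary numbers of this kind can be computed from a list of per-package delays, using only the standard library. The input values are made up, and whether the actual script reports the population or the sample standard deviation is not stated here, so the use of `pstdev` is an assumption.

```python
import statistics


def summarize_delays(delays_in_days):
    """Return (mean, standard deviation, median) for a list of delays in days."""
    mean = statistics.mean(delays_in_days)
    # Population standard deviation; use statistics.stdev for the sample version.
    std = statistics.pstdev(delays_in_days)
    median = statistics.median(delays_in_days)
    return mean, std, median


# Made-up delays, for illustration only.
mean, std, median = summarize_delays([10, 20, 30, 40, 100])
print(f"Mean (+/- std.dev.): {mean:.1f} +/- {std:.1f}")
print(f"Median: {median:.1f}")
```

A mean well above the median, as in the figures reported above, is what a right-skewed distribution looks like: a few packages with very long delays pull the mean up.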
Following up on this (and this recent piece of info: https://github.com/bioconda/bioconda-recipes/issues/6323#issuecomment-347994955):
Is this OK now? The big issue of which packages to count has been solved.
Yes, I think it is fine. I will merge it and we will keep it in mind for the first revision.
As discussed on Authorea, this looks at the delay between the upstream release and the bioconda package.
This does not work for all packages, but it works for >90% of them. Packages that refer to a particular git commit are ignored (I can also change the code to use the date of that commit).
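To make the measurement concrete, here is a rough sketch of the per-package computation described above. It is not the code from this PR: the record layout and the field names (`upstream_date`, `bioconda_date`, `git_commit`) are hypothetical and chosen only for the example.

```python
from datetime import date


def delays_in_days(records):
    """Map package name to the delay (in days) between the upstream release
    and the bioconda package, skipping packages pinned to a git commit."""
    delays = {}
    for rec in records:
        if rec.get("git_commit"):
            # Hypothetical marker for recipes that refer to a specific commit;
            # these are ignored, as described above.
            continue
        delays[rec["name"]] = (rec["bioconda_date"] - rec["upstream_date"]).days
    return delays


# Made-up records, for illustration only.
records = [
    {"name": "toolA", "upstream_date": date(2017, 1, 1),
     "bioconda_date": date(2017, 2, 7), "git_commit": None},
    {"name": "toolB", "upstream_date": date(2017, 3, 1),
     "bioconda_date": date(2017, 3, 1), "git_commit": "abc1234"},
]
delays = delays_in_days(records)
```

Using the commit date as the upstream release date, as suggested above, would amount to filling in `upstream_date` from the commit instead of skipping the record.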
I then summarized the results (while ignoring upstream packages from before 2016). These are the results I get:
(If this gets incorporated into the paper, then I would like to be an author [Luis Pedro Coelho, Structural and Computational Biology Unit, European Molecular Biology Laboratory, Heidelberg, Germany]. I have also added a recent package to bioconda, so I guess that would also qualify me, except I might have missed that deadline).
As discussed, I also got a (very partial) list of which packages are outdated wrt their upstream releases.
BTW, I wasn't sure at all where to put these scripts in this repo. This seemed the most logical place, but let me know if you prefer me to move them.