Different versions of GenomeInfoDbData 0.99.1 available to download from bioconductor.org #1

Closed by jdblischak 6 years ago

jdblischak commented 6 years ago

We are having trouble creating a conda package for GenomeInfoDbData because we download different tarballs for version 0.99.1 depending on where in the world the download happens.

From Dortmund, Germany, the tarball downloaded from http://bioconductor.org/packages/3.6/data/annotation/src/contrib/GenomeInfoDbData_0.99.1.tar.gz has md5 checksum 85741ea07f8661079dbd77a23aff3b6b. From Chicago, the tarball downloaded from the same URL has md5 checksum 47c68c7ba79de1c309a5b6009bbc0c71.
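For anyone who wants to check which of the two tarballs they received, here is a minimal Python sketch. It assumes the tarball has already been downloaded to a local path; the helper names `md5_of_file` and `identify_tarball` are made up for illustration, and the two checksums are the ones reported above.

```python
import hashlib

# Checksums reported in this issue for GenomeInfoDbData_0.99.1.tar.gz.
KNOWN_CHECKSUMS = {
    "85741ea07f8661079dbd77a23aff3b6b": "Dortmund",
    "47c68c7ba79de1c309a5b6009bbc0c71": "Chicago",
}


def md5_of_file(path, chunk_size=1 << 20):
    """Return the hex MD5 digest of the file at `path`, read in chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        # Read fixed-size chunks so large tarballs are not loaded into memory at once.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def identify_tarball(path):
    """Return which reported download the file matches, or None if neither."""
    return KNOWN_CHECKSUMS.get(md5_of_file(path))
```

Calling `identify_tarball("GenomeInfoDbData_0.99.1.tar.gz")` on a local copy returns `"Dortmund"`, `"Chicago"`, or `None` if the file matches neither reported checksum.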

Were two different tarballs uploaded at different times? Any insight/advice would be much appreciated. Thanks!

For more details, see our discussion starting at https://github.com/bioconda/bioconda-utils/pull/240#issuecomment-351146581.

cc @mbargull

mbargull commented 6 years ago

You might also want to consider lowering the Cache-Control: max-age parameter (currently 3 months) for CloudFront's caches for http://bioconductor.org; see https://github.com/bioconda/bioconda-utils/pull/240#issuecomment-351154631.

vobencha commented 6 years ago

Thanks for the message. I'm looking into it. Valerie

hpages commented 6 years ago

FWIW the Bioconductor package repositories originate on master.bioconductor.org. For http://bioconductor.org/packages/3.6/data/annotation/src/contrib/GenomeInfoDbData_0.99.1.tar.gz, the md5 checksum I get by logging in to master.bioconductor.org and running md5sum directly on the tarball there is:

webadmin@ip-172-30-4-20:/extra/www/bioc/packages/3.6/data/annotation/src/contrib$ md5sum GenomeInfoDbData_0.99.1.tar.gz 
85741ea07f8661079dbd77a23aff3b6b  GenomeInfoDbData_0.99.1.tar.gz

i.e. same as what you get from Dortmund, Germany.

mtmorgan commented 6 years ago

(Stealing some of Val's thunder...) I guess @hpages is pointing out the currently correct tarball. In terms of access, the problem at master is no better -- it has

$ curl -I http://master.bioconductor.org/packages/3.6/data/annotation/src/contrib/GenomeInfoDbData_0.99.1.tar.gz
HTTP/1.1 200 OK
Date: Thu, 14 Dec 2017 02:07:42 GMT
Server: Apache/2.4.18 (Ubuntu)
Last-Modified: Thu, 26 Oct 2017 14:20:33 GMT
ETag: "111d3be-55c73e00fe54c"
Accept-Ranges: bytes
Content-Length: 17945534
Cache-Control: max-age=15552000
Content-Type: application/x-gzip

So an earlier tarball with the same version number would still be interpreted as 'current' by some (apparently Chicago) clients. It seems we pushed two different tarballs with the same version number: probably one to 3.6 (when it was the devel branch) after the spring release, and a different one to 3.6 (as it was about to become the release branch) just before the fall release.
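To make the caching effect concrete, here is a rough Python sketch that reads the `Date` and `Cache-Control` values from the response above and computes how long a cache may keep serving that response without revalidating. The function names are hypothetical, and real caches (including CloudFront) apply further rules, so treat this as an approximation of the freshness arithmetic only.

```python
import re
from datetime import timedelta
from email.utils import parsedate_to_datetime


def max_age_seconds(cache_control):
    """Extract max-age (in seconds) from a Cache-Control value, or None if absent."""
    match = re.search(r"\bmax-age\s*=\s*(\d+)", cache_control, re.IGNORECASE)
    return int(match.group(1)) if match else None


def fresh_until(date_header, cache_control):
    """Return the time until which the response may be reused without revalidation."""
    max_age = max_age_seconds(cache_control)
    if max_age is None:
        return None
    # Freshness lifetime is counted from the time the origin generated the response.
    return parsedate_to_datetime(date_header) + timedelta(seconds=max_age)


expiry = fresh_until("Thu, 14 Dec 2017 02:07:42 GMT", "max-age=15552000")
print(expiry.isoformat())
```

Note that 15552000 seconds is 180 days, so a copy cached on the date shown could be served unchanged until mid-June 2018, even if the file on the origin had been replaced in the meantime.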

The long max-age seems to have been premised on the assumption that we'd never push different tarballs with the same version number; the max-age is set for .tar.gz, .zip, and .tgz files (i.e., package archives) but not PACKAGES (for discovering current versions in each release) and other web site files. For our 'software' and 'experiment data' packages there are features in the build system that make it less likely for us to push different tarballs with the same version; our annotation packages are handled differently and more subject to human error.
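As an illustration of the policy just described (not the actual server configuration, which is not shown in this thread), the rule can be sketched in Python as a function that assigns the long max-age only to package archives. The names `ARCHIVE_SUFFIXES`, `LONG_MAX_AGE`, and `cache_control_for` are hypothetical.

```python
# Suffixes treated as package archives, per the description above.
ARCHIVE_SUFFIXES = (".tar.gz", ".zip", ".tgz")

# The max-age value seen in the response headers above, in seconds.
LONG_MAX_AGE = 15552000


def cache_control_for(filename):
    """Return the Cache-Control value for a file, or None if no max-age is set."""
    # str.endswith accepts a tuple, so one call checks all archive suffixes.
    if filename.endswith(ARCHIVE_SUFFIXES):
        return f"max-age={LONG_MAX_AGE}"
    # PACKAGES and other web site files get no long-lived max-age.
    return None


print(cache_control_for("GenomeInfoDbData_0.99.1.tar.gz"))
print(cache_control_for("PACKAGES"))
```

The design only works if an archive's content never changes for a given file name, which is exactly the assumption that was violated here.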

vobencha commented 6 years ago

I was preparing an answer but all has been addressed above. We'll bump the package versions and the new versions will propagate to the edge nodes.

jdblischak commented 6 years ago

Thanks for the quick response!

> We'll bump the package versions and the new versions will propagate to the edge nodes.

@vobencha Will this new version be added to the 3.6 release? If yes, could you please post to this Issue when it is available for download?

> The long max-age seems to have been premised on the assumption that we'd never push different tarballs with the same version number

@mtmorgan Since it happened in at least this one case, I worry it may have happened before and could happen again. It was unclear to me from your response whether you are in favor of reducing the max-age setting.
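One way to catch this on the consumer side would be to record the checksum seen for each file name at each download location and flag any file name that shows up with more than one checksum. A minimal Python sketch follows; the function name `find_conflicts` and the input format are assumptions for illustration, not part of bioconda-utils.

```python
from collections import defaultdict


def find_conflicts(observations):
    """Return {filename: sorted checksums} for files seen with more than one checksum.

    `observations` is an iterable of (filename, checksum) pairs, e.g. one pair
    per download location.
    """
    seen = defaultdict(set)
    for filename, checksum in observations:
        seen[filename].add(checksum)
    return {name: sorted(sums) for name, sums in seen.items() if len(sums) > 1}


# The two downloads reported at the top of this issue.
observations = [
    ("GenomeInfoDbData_0.99.1.tar.gz", "85741ea07f8661079dbd77a23aff3b6b"),
    ("GenomeInfoDbData_0.99.1.tar.gz", "47c68c7ba79de1c309a5b6009bbc0c71"),
]
print(find_conflicts(observations))
```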

vobencha commented 6 years ago

Yes, the updated version will be in release 3.6. I'll let you know when we've bumped and what the version is.

mbargull commented 6 years ago

Thanks to everyone for looking into this and giving those explanations! (cc @bgruening: I believe this can be valuable information for you with regard to Cargo Port.)

jdblischak commented 6 years ago

The 1.0.0 version is now available on bioconductor.org:

https://www.bioconductor.org/packages/release/data/annotation/src/contrib/GenomeInfoDbData_1.0.0.tar.gz

Thanks for your help!