Closed snoyberg closed 8 years ago
CC'ing @davean who manages our Fastly CDN configuration
More generally, we may be able to reduce the time window in which a client can experience an inconsistent view of the Hackage repository, but we will never be able to fully eliminate that window unless the CDN or cloud storage provides stronger guarantees. In other words, there's no way around Hackage clients needing to cope with transient object access failures as long as the communication path is made up of potentially unreliable components. When cabal is used with hackage-security's object retrieval logic, this is already taken into account to some degree.
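To illustrate what "coping with transient object access failures" can look like on the client side, here is a minimal Python sketch of retry with exponential backoff. It is not hackage-security's actual logic; `TransientError`, `fetch_with_retry`, and the injected `fetch` callable are hypothetical names chosen for this example.

```python
import time


class TransientError(Exception):
    """Hypothetical error a fetcher raises when a later retry might succeed."""


def fetch_with_retry(fetch, url, attempts=4, base_delay=1.0, sleep=time.sleep):
    """Call fetch(url), retrying on TransientError with exponential backoff.

    `fetch` is any callable supplied by the caller (assumption: it raises
    TransientError for failures such as a stale 404 from a cache).
    """
    for attempt in range(attempts):
        try:
            return fetch(url)
        except TransientError:
            if attempt == attempts - 1:
                raise  # out of attempts: surface the failure to the caller
            sleep(base_delay * 2 ** attempt)  # wait 1s, 2s, 4s, ...
```

Passing `sleep` in as a parameter keeps the function easy to exercise without real delays.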
Yes, can't be eliminated completely but reduced caching of 404s sounds like a good idea.
Just to give a little more detail of what I'm doing in case it helps: instead of running through the full mirroring scripts every 1/5/10 minutes, I'm moving over to a watch script which respects the ETag header, so that less bandwidth/CPU/disk access is used. You can see this script at:
https://github.com/fpco/hackage-mirror/blob/383666ff71c1fcafecd2d0a1a72a47e75d5fcda3/main/Watcher.hs
Due to this issue, I've included an arbitrary 10-run forced synchronization in case the index.tar.gz file got out-of-sync with the sdist tarballs.
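The decision described above (synchronize when the index ETag changes, and force a synchronization every 10 runs regardless) can be sketched roughly as follows. This is an illustrative Python sketch, not a transcription of Watcher.hs; the function name `should_sync` and its parameters are assumptions.

```python
def should_sync(previous_etag, current_etag, runs_since_sync, force_every=10):
    """Decide whether to run the full tarball synchronization.

    previous_etag / current_etag: ETag values of the index file as seen on
    the last sync and on this run (hypothetical inputs).
    runs_since_sync: number of watch runs since the last full sync.
    """
    if current_etag != previous_etag:
        return True  # the index changed, so new tarballs may exist
    # Unchanged ETag: still sync periodically in case the index and the
    # tarballs were out of sync the last time we looked.
    return runs_since_sync >= force_every
```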
One last detail: it may seem like it would be reasonable to just confirm that all tarballs are available instead of using the arbitrary 10-run cutoff. Unfortunately, there's another issue that prevents that from being possible: #436. Since there are some tarballs which legitimately fail the download (because they have been deleted for copyright purposes, but not removed from the index), and other tarballs that fail due to CDN caching issues, I don't see a way to detect that we should ignore the ETag and try synchronizing tarballs again.
Re removed package tarballs, we try to return the appropriate HTTP code "410" (rather than a vague 404), e.g.:
$ curl -v http://hackage.haskell.org/package/tslib-0.1.4/tslib-0.1.4.tar.gz
* Hostname was NOT found in DNS cache
* Trying 151.101.16.68...
* Connected to hackage.haskell.org (151.101.16.68) port 80 (#0)
> GET /package/tslib-0.1.4/tslib-0.1.4.tar.gz HTTP/1.1
> User-Agent: curl/7.38.0
> Host: hackage.haskell.org
> Accept: */*
>
< HTTP/1.1 410 Gone
* Server nginx/1.8.1 is not blacklisted
< Server: nginx/1.8.1
< Content-Type: text/html
< Via: 1.1 varnish
< Fastly-Debug-Digest: 749f11a1c96d94f23d058f12dcb79a1371a81af9135ce285d3e7df73fba27495
< Content-Length: 158
< Accept-Ranges: bytes
< Date: Tue, 13 Sep 2016 08:54:25 GMT
< Via: 1.1 varnish
< Age: 16
< Connection: keep-alive
< X-Served-By: cache-dfw1832-DFW, cache-lcy1132-LCY
< X-Cache: HIT, HIT
< X-Cache-Hits: 1, 1
< X-Timer: S1473756865.631754,VS0,VE0
<
<html>
<head><title>410 Gone</title></head>
<body bgcolor="white">
<center><h1>410 Gone</h1></center>
<hr><center>nginx/1.8.1</center>
</body>
</html>
* Connection #0 to host hackage.haskell.org left intact
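A mirror script could use this distinction to tell deliberately removed tarballs apart from ones that are merely not visible yet. The following Python sketch shows one way to do that; `Outcome`, `classify`, and `needs_resync` are hypothetical names, and treating every other status as retryable is an assumption of this example.

```python
from enum import Enum


class Outcome(Enum):
    OK = "ok"        # tarball downloaded
    GONE = "gone"    # 410: removed on purpose, do not retry
    RETRY = "retry"  # e.g. 404: possibly a stale cache, try again later


def classify(status):
    """Map an HTTP status code to a mirroring outcome."""
    if status == 200:
        return Outcome.OK
    if status == 410:
        return Outcome.GONE
    return Outcome.RETRY


def needs_resync(statuses):
    """True if any tarball failed in a way that a later run might fix."""
    return any(classify(s) is Outcome.RETRY for s in statuses)
```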
I never noticed that. That could be very useful, thanks!
@snoyberg since you're running a mirror, it'd be perfectly reasonable to bypass the CDN entirely. Then you get to choose if/how to respect the cache-control hints etc. If you'd like to do that, let us know and we can give you the details (ie IP address etc).
Also, if you'd like to take part in the public mirroring of hackage (ie serving in the same original format) then you may like to use https://github.com/hvr/hackage-mirror-tool and optionally have your mirror added to the public mirror list http://hackage.haskell.org/mirrors.json . If so, just let us know.
Update: @snoyberg has set up a new mirror and it is now listed as an official public mirror in the upstream http://hackage.haskell.org/mirrors.json
Since the out-of-sync caching/proxying issue does not appear to be a problem for cabal clients at the moment, we'll close this for now. The hackage-security client code has logic to cope with caching proxies, but if this proves not enough for our CDN then we can switch things around so that we use our mirrors as primaries for clients rather than only as secondary / backups.
Just to give one last note on all of this: I put a new page on stackage.org to track the relative up-to-dateness of Hackage vs mirrors and Git repos, you can see it at:
https://ci.stackage.org/status/mirror
I've configured the page to return a status 500 if the lag time is ever more than an hour, so using normal HTTP monitoring tools can give an alert if the mirroring functionality ever stops working.
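The rule behind that status code is simple enough to sketch. This is a hypothetical Python illustration, not the code running on stackage.org; `status_for` and `MAX_LAG` are names invented for the example, and the two timestamps are assumed to be last-modified times of the same file upstream and on the mirror.

```python
from datetime import datetime, timedelta, timezone

MAX_LAG = timedelta(hours=1)  # threshold mentioned in the comment above


def status_for(upstream_modified, mirror_modified, max_lag=MAX_LAG):
    """Return 500 when the mirror lags upstream by more than max_lag, else 200."""
    lag = upstream_modified - mirror_modified
    return 500 if lag > max_lag else 200


# Usage: a mirror that is 90 minutes behind upstream reports 500.
upstream = datetime(2016, 9, 21, 12, 0, tzinfo=timezone.utc)
mirror = upstream - timedelta(minutes=90)
print(status_for(upstream, mirror))
```

Any monitoring tool that alerts on non-2xx responses can then watch the page without knowing anything about Hackage.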
@snoyberg that's coincidentally something similar to something half-finished (sans the Git repos status) that I've been hacking on as well, as we needed that for haskell.org as well... except less html'y, just a plain/text .cgi script which validates the TUF meta-data for freshness :-)
If it would be helpful to add a few more URLs to that table, just say so. It's no big deal for me to track the last-modified of a few more files.
@snoyberg it may be interesting to add "http://objects-us-west-1.dream.io/hackage-mirror/01-index.tar.gz" there, as well as the ../timestamp.json files (since that one's updated last by my tool)
Cool, commit pushed, should be live in a few minutes.
The mirror I've been running went down about 8 hours ago (see: https://github.com/commercialhaskell/all-cabal-hashes/issues/13). AFAICT, the problem is that the privately provided IP address for the upstream server (behind the CDN) changed. I've switched the mirror to use hackage-origin.haskell.org, is that correct?
That should be correct, yes.
I've experienced this personally in running the all-cabal-hashes mirror, and have received user reports. Relevant links:
The idea is: you download the 00-index.tar.gz file from Hackage (e.g., via cabal update), and it includes a .cabal file for a certain package/version combo (like yaml-0.8.18.6.cabal). But when you try to download yaml-0.8.18.6.tar.gz from Hackage, you get a 404 for a while, which eventually corrects itself. I've experienced situations where two different build servers - both in the US - returned a 404, while downloading from my house in Israel worked. This leads me to believe it's a regional caching issue with the CDN.
Just a complete guess here: perhaps it's worth disabling CDN caching for non-200 responses?