Closed (ahasna closed this 1 year ago)
Maybe related to cache filling on a miss in the CDN, the two simultaneous downloads from the same IP fetch a partially completed cached file?
That could be it. Since I am using GitHub-hosted runners, the public IP might be reused between jobs.
The weird thing is that for previous commits it works fine. I am not a cURL
expert, but I think the -H user-agent:tailscale-github-action
is causing this behaviour. Is that assumption valid?
We're debugging this now. As far as we can tell so far, the checksum error is because our CDN responded with a 404, and obviously the body of that 404 does not pass the checksum for the package.
The user agent setting doesn't cause this directly; it just signals to the package server that even though the client is curl, it will follow redirects, so it's safe to serve it a CDN redirect.
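To illustrate why a 404 shows up as a checksum error, here is a minimal Python sketch of a client that checks the HTTP status before hashing the body. It is not the action's actual download code: `verify_download` and `DownloadError` are hypothetical names, and the status code and body are assumed to come from whatever HTTP client is in use.

```python
import hashlib


class DownloadError(Exception):
    """Raised when a download cannot be trusted (hypothetical helper type)."""


def verify_download(status: int, body: bytes, expected_sha256: str) -> bytes:
    # Reject error responses first, so a 404 page is reported as an HTTP
    # failure instead of surfacing later as a confusing checksum mismatch.
    if status != 200:
        raise DownloadError(f"unexpected HTTP status {status}")

    # Only hash bodies that came from a successful response.
    actual = hashlib.sha256(body).hexdigest()
    if actual != expected_sha256.lower():
        raise DownloadError(f"checksum mismatch: got {actual}")
    return body
```

With a check like this, the failure would be reported as an unexpected HTTP status rather than a bad checksum, which points more directly at the server side.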
For now we've disabled CDN serving, so things should be reliable (but slower) while we debug.
The issue is fixed. There was an edge case where the CDN would try to serve from cache at the same time as the file got evicted from cache, and we'd end up serving a 404 rather than recovering gracefully. We now handle that properly. I've re-enabled CDN serving.
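The kind of graceful recovery described above could look roughly like the following Python sketch. This is an assumption about the general pattern, not the actual fix: `serve`, `cache_get`, and `origin_get` are hypothetical names, and a cache miss is modelled as `None`.

```python
from typing import Callable, Optional


def serve(
    path: str,
    cache_get: Callable[[str], Optional[bytes]],
    origin_get: Callable[[str], bytes],
) -> bytes:
    # Try the cache first. In this sketch, None means the entry is missing,
    # including the case where it was evicted just before we read it.
    body = cache_get(path)
    if body is None:
        # Recover by fetching from the origin instead of returning a 404.
        body = origin_get(path)
    return body
```

The point of the design is that an eviction race degrades to a slower origin fetch instead of an error response that clients then try to checksum.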
Great work! Thank you very much!
this is happening again today, fwiw:
https://github.com/tailscale/github-action/issues/89#issuecomment-1745088958
When running multiple instances of this action in one workflow, some are passing fine but others are failing with the following error.
Using a commit before https://github.com/tailscale/github-action/commit/4e4c49acaa9818630ce0bd7a564372c17e33fb4d works fine.
I tried to replicate this locally by running the following commands with no luck:
@creachadair I suspect the latest change has something to do with it. Any ideas on how to debug further? Happy to help fix it if you could point me in the right direction.