While we are updating our SHAs, we should probably just migrate to Gitlab, right?
anecdotal and i cannot point to anything specific currently, but i have seen some "compression changed on generated tarballs" on gitlab too (between some updates). i don't think any of these systems are designed with a perfect guarantee for this ("this" being hash-stable tarballs), we just got here by circumstance.
> While we are updating our SHAs, we should probably just migrate to Gitlab, right?

Is that the current recommendation for people who want more reliability?
@bk2204 https://support.github.com/ticket/personal/0/1485189 (not visible publicly) made a clear commitment that the /archive/refs/tags/$tag endpoint would provide archives with stable hashes and should be relied upon for that purpose. I specifically asked for confirmation of this twice and received it. Happy to share the full conversation I had with support.
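For anyone following along, this is the pattern that pinning relies on: fetch the tag archive once, record its digest, and fail the build if the digest ever changes. A minimal check, reusing one of the archives and the pre-change hash reported further down this thread:

$ curl -sL https://github.com/bazelbuild/rules_python/archive/refs/tags/0.8.0.tar.gz | sha256sum
9fcf91dbcc31fde6d1edb15f117246d912c33c36f44cf681976bd886538deba6 -

If that output ever differs, every consumer that pinned the old value breaks, which is exactly what happened here.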
Hey folks. I'm the product manager for Git at GitHub. We're sorry for the breakage, we're reverting the change, and we'll communicate better about such changes in the future (including timelines).
did GitHub remove the staff tag from profiles? there's allegedly like 4 staff members in this thread and no one has the badge
We updated our Git version which made this change for the reasons explained. At the time we didn't foresee the impact. We're quickly rolling back the change now, as it's clear we need to look at this more closely to see if we can make the changes in a less disruptive way. Thanks for letting us know.
Also re: Staff badge, here's what I see in this thread:
Meta, @vtbassmatt this is what we normal users see: [screenshot]
Huh! I don't work on frontend stuff, so that's a mystery to me 😅
@vtbassmatt awesome thank you kindly ❤️
Will GitHub provide stability guarantees around the non-release tarball/zip URLs going forward?
Thank you, @vtbassmatt. May I suggest a regression test for this?
@vtbassmatt For those of us who dutifully updated our checksums in response to the original change, can you give us a timeline for the rollback so we can try to time rolling back our updates? I totally understand we are in the minority and rolling back the change is the right move, but of course the new interface was live, Hyrum's law and all that.
@jerrymarino too soon to commit to anything in particular (except "we will DEFINITELY communicate better"). There are good reasons on both sides.
@jmacdonald-Ocient the rollback is imminent, just winding its way through automated testing. I don't know for sure how long it will take to show up, I'm sorry.
Thanks GitHub staff for the quick response, looking forward to the follow-up communication. Idea on better communication going forward - could you please add a hint in the UI right by the download links that links to docs on what is/isn't stable and possibly best practices on getting stable artifacts and checksumming them? e.g. a help icon you can hover over with a tooltip or that links to the docs.
IME this form of contextual help/communication is really beneficial for customers that may not follow the blog, think to search the docs, etc. as it's right in the point of use.
If the checksum isn't stable, after the community is migrated, I would recommend that a random value is injected every time to really drive this point home so that no one reacquires an incorrect dependency. Hyrum's Law shows that documenting an interface as unstable is insufficient if in practice it's largely stable.
> Hey folks. I'm the product manager for Git at GitHub. We're sorry for the breakage, we're reverting the change, and we'll communicate better about such changes in the future (including timelines).
Oh wow, what a wild ride, i'm so glad you are reverting it 🙏
Please note that not everything can be migrated to point at a release package easily; a lot of the checksum errors I've experienced were in third-party plugin/rule code we have no direct control over.
Some of it points at on-the-fly tar.gz source archives generated from specific historical revisions and such. Obviously we'd do our part in migrating away from and upgrading such dependencies, but this is still quite a genuine threat and one that is hard to validate.
Please take such concerns into consideration when rolling out a solution.
Those files are generated new each time (with some caching - an hour I think). We told Git to use the old settings instead of its new default, so they’ll start getting generated with the old hashes again. I’m told the roll-out is complete, modulo resetting those caches.
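For anyone wondering what "the old settings" means mechanically: git archive can either pipe through the external gzip binary or (in newer Git versions) use its internal gzip implementation, and the two produce byte-different .tar.gz files even though the tar contents are identical. A sketch of pinning the external-gzip behavior for archives you generate yourself — the tar.tar.gz.command config is documented for git-archive, though whether GitHub's fix uses exactly this knob is my assumption:

$ # Make git archive pipe through the system gzip (the historical behavior)
$ git config tar.tar.gz.command "gzip -cn"
$ # Example invocation; "myproject" and "v1.0" are placeholder names
$ git archive --format=tar.gz --prefix=myproject-1.0/ -o myproject-1.0.tar.gz v1.0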
FWIW, we're seeing more checksum mismatches in the last ~20 minutes than at any other time today.
Is it possible you invalidated a cache that was preventing some portion of artifacts from being regenerated, and now they are being regenerated before the rollback was complete?
@jfirebaugh that is indeed possible, we’ll look into it. Is it abating for you or still ongoing?
Ongoing
Here too -- all of our bazel builds died about four hours ago. I've been trying to band-aid it by updating hashes / moving repos to git_repository() from http_archive(), but we are still seeing lots of issues since this incorrect use of http_archive() is pervasive in bazel-land.
> FWIW, we're seeing more checksum mismatches in the last ~20 minutes than at any other time today.
> Is it possible you invalidated a cache that was preventing some portion of artifacts from being regenerated, and now they are being regenerated before the rollback was complete?
Echoing this, we also see things getting worse in the past ~20min.
Still having issues with vcpkg and installing libs such as boost and yaml-cpp. Believe this was a related issue.
Think we found a bug in the rollback. Being fixed now.
A status page would have been easier to follow than updates on a GitHub issue.
In our case, we had a rules_python mismatch which was fixed, and everything worked for a while. But now we are getting a rules_java mismatch and everything stops working.
/root/.cache/bazel/_bazel_root/2d3430b48bd77b69b91ab356ef9daf21/external/rules_java/temp5602810203644143040/981f06c3d2bd10225e85209904090eb7b5fb26bd.tar.gz: Checksum was 01581e873735f018321d84d2e03403f37f4b05cf19d9c8036f14466e584a46b9 but wanted f5a3e477e579231fca27bf202bb0e8fbe4fc6339d63b38ccb87c2760b533d1c3
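Not a fix, but a stopgap that has worked for us while hashes are in flux (a sketch; the repo, version, and paths below are just examples taken from archives mentioned earlier in this thread): vendor the pinned archive locally and point Bazel at it with --override_repository, which bypasses the fetch-and-checksum step entirely.

$ # Unpack the pinned release once, by hand
$ curl -sL https://github.com/bazelbuild/rules_python/archive/refs/tags/0.16.1.tar.gz | tar -xzf - -C /tmp
$ # Use the local copy instead of re-fetching from GitHub
$ bazel build //... --override_repository=rules_python=/tmp/rules_python-0.16.1

Obviously that trades the automated integrity check for a manual one, so it's only a bridge until the endpoint is stable again.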
> Think we found a bug in the rollback. Being fixed now.
Any update on this fix?
I have been thinking about this problem for a while, "safe and comfortable in the knowledge that it will never break". :rofl:
So one good output of this actually occurring and then being reverted: I have gone ahead and actually posted an email to the git mailing list about the possible solution I've been thinking of for a while now: https://public-inbox.org/git/a812a664-67ea-c0ba-599f-cb79e2d96694@gmail.com/T/
I live in hope that we'll eventually see a world where the manpage for git-archive says "git archive is reproducible and here is why", and then no one ever has to have this debate again.
> Any update on this fix?
Should be deployed now. EDIT: I spoke too soon. It's in progress but not fully out.
> Should be deployed now.
I'm still seeing the broken checksum values. Does this rollout also require waiting an hour for the caches to reset?
EDIT:
> I spoke too soon. It's in progress but not fully out.
Clicked respond right before I saw the edit. Thanks for the update!
I agree with the person above saying a status page would be better than comment updates, but I think it's important to note that it's still appreciated regardless - it's vastly more helpful than radio silence, which a lot of companies and teams would be giving in a similar position right now. Thanks for keeping us updated!
@eli-schwartz Posted to HN: https://news.ycombinator.com/item?id=34588880
Sorry for the false starts above, and I appreciate everyone’s patience with me. You should start seeing the old checksums now.
@vtbassmatt How does the rollback work? Do you need to literally re-generate all the affected releases, which would take a long time to finish?
> You should start seeing the old checksums now.
@vtbassmatt do we have to wait for a cache eviction? Still seeing bad hashes here
$ curl -sL https://github.com/madler/zlib/archive/v1.2.11.tar.gz | sha256sum
9917faf62bc29a578c79a05c07ca949cfda6e50a1c8a02db6ac30c5ea0aba7c0 -
(Bazel thinks this is supposed to be 629380c90a77b964d896ed37163f5c3a34f6e6d897311f1df2a7016355c45eff)
Doesn't look like the rollback is complete. For example, https://github.com/bazelbuild/rules_python/archive/refs/tags/0.16.1.tar.gz (https://github.com/bazelbuild/rules_python/releases/tag/0.16.1) still has the wrong (newer) checksum.
Thanks for the ping. This is unexpected and folks are looking at it immediately. I’ve got to step out of the thread now, but we do intend to revert to previous behavior.
@vtbassmatt I am having similar issues with GitHub Actions builds that are using npm to grab resources from GitHub. This is from about 1m ago
#14 7.808 npm WARN tarball tarball data for http2@https://github.com/node-apn/node-http2/archive/apn-2.1.4.tar.gz (sha512-ad4u4I88X9AcUgxCRW3RLnbh7xHWQ1f5HbrXa7gEy2x4Xgq+rq+auGx5I+nUDE2YYuqteGIlbxrwQXkIaYTfnQ==) seems to be corrupted. Trying again.
#14 7.913 npm ERR! code EINTEGRITY
#14 7.919 npm ERR! sha512-ad4u4I88X9AcUgxCRW3RLnbh7xHWQ1f5HbrXa7gEy2x4Xgq+rq+auGx5I+nUDE2YYuqteGIlbxrwQXkIaYTfnQ== integrity checksum failed when using sha512: wanted sha512-ad4u4I88X9AcUgxCRW3RLnbh7xHWQ1f5HbrXa7gEy2x4Xgq+rq+auGx5I+nUDE2YYuqteGIlbxrwQXkIaYTfnQ== but got sha512-GWBlkDNYgpkQElS+zGyIe1CN/XJxdEFuguLHOEGLZOIoDiH4cC9chggBwZsPK/Ls9nPikTzMuRDWfLzoGlKiRw==. (72989 bytes)
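If it helps with debugging, npm's integrity field is a Subresource Integrity string: "sha512-" plus the base64-encoded SHA-512 of the tarball bytes. You can recompute it by hand and compare against what the lockfile expects (command reuses the URL from the error above):

$ curl -sL https://github.com/node-apn/node-http2/archive/apn-2.1.4.tar.gz | openssl dgst -sha512 -binary | openssl base64 -A

Whichever value that prints right now tells you which side of the rollback the cache in front of you is serving.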
@mdouglass It's affecting anything that pins dependencies on Github by checksum
> @mdouglass It's affecting anything that pins dependencies on Github by checksum
Yep, my point was more that it was still happening after the supposed rollback
Yeah. https://github.com/bazelbuild/rules_foreign_cc/archive/0.8.0.tar.gz was 6041f1374ff32ba711564374ad8e007aef77f71561a7ce784123b9b4b88614fc but it's still generating an archive that matches the same changed hash as earlier today (2fe52e77e11dc51b26e0af5834ac490699cfe6654c7c22ded55e092f0dd5fe57).
Will this issue continue to be used for status updates on the rollback?
I still don't see these 2 examples (there are a lot more) going back to what bazel rules expect:
curl -L https://github.com/bazelbuild/rules_python/archive/refs/tags/0.8.0.tar.gz | sha256sum
curl -L https://github.com/google/go-containerregistry/archive/v0.5.1.tar.gz | sha256sum
bazel rules are expecting 9fcf91dbcc31fde6d1edb15f117246d912c33c36f44cf681976bd886538deba6 & c3e28d8820056e7cc870dbb5f18b4f7f7cbd4e1b14633a6317cef895fdb35203, but we are still getting 5c619c918959d209abd203a63e4b89d26dea8f75091e26f33f719ab52097ef68 & 3f56ff9d903d76e760620669949ddaee8760e51093f9c2913786c85242075fda.
Seeing at least one correct hash now
$ curl -sL https://github.com/madler/zlib/archive/v1.2.11.tar.gz | sha256sum
629380c90a77b964d896ed37163f5c3a34f6e6d897311f1df2a7016355c45eff -
@fishy yours seem correct now too
$ curl -sL https://github.com/bazelbuild/rules_python/archive/refs/tags/0.8.0.tar.gz | sha256sum; curl -sL https://github.com/google/go-containerregistry/archive/v0.5.1.tar.gz | sha256sum
9fcf91dbcc31fde6d1edb15f117246d912c33c36f44cf681976bd886538deba6 -
c3e28d8820056e7cc870dbb5f18b4f7f7cbd4e1b14633a6317cef895fdb35203 -
I think the rollback is live now; some of my conan recipes started to work again :partying_face:
[~/git/mesonbuild/wrapdb] $ wget https://github.com/abseil/abseil-cpp/archive/20220623.0.tar.gz
[~/git/mesonbuild/wrapdb] $ sha256sum abseil-cpp-20220623.0.tar.gz subprojects/packagecache/abseil-cpp-20220623.0.tar.gz
4208129b49006089ba1d6710845a45e31c59b0ab6bff9e5788a87f55c5abd602 abseil-cpp-20220623.0.tar.gz
4208129b49006089ba1d6710845a45e31c59b0ab6bff9e5788a87f55c5abd602 subprojects/packagecache/abseil-cpp-20220623.0.tar.gz
My original testcase started working too (the subprojects/packagecache/ directory is my local copy from August 2022 of the archive that a contributor posted a ticket about at https://github.com/mesonbuild/wrapdb/pull/884).
I'm seeing the expected hash on my rdkafka download:
curl -sL https://github.com/confluentinc/librdkafka/archive/v1.8.2.tar.gz | sha256sum
6a747d293a7a4613bd2897e28e8791476fbe1ae7361f2530a876e0fd483482a6 -
You need to upload the tarball during release creation: https://github.com/bazel-contrib/rules_cuda/issues/56#issuecomment-1367715605. We also experienced a checksum change somehow.
The template for rules has been changed to publish a tarball into the release instead of relying on GitHub to provide a stable interface. Ref https://github.com/bazel-contrib/rules-template/pull/44.
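For rule authors who want to do the same thing by hand before picking up the template change, the general shape is: produce the tarball yourself, record its checksum, and attach it to the GitHub release, so consumers download an immutable uploaded asset instead of an on-the-fly /archive/ tarball. A sketch with placeholder names (rules_foo and v1.0.0 are examples, not what the template generates):

$ # Build the release artifact locally and note its checksum for the release notes
$ git archive --format=tar.gz --prefix=rules_foo-1.0.0/ -o rules_foo-1.0.0.tar.gz v1.0.0
$ sha256sum rules_foo-1.0.0.tar.gz
$ # Attach it to the release as an uploaded asset (GitHub CLI)
$ gh release create v1.0.0 rules_foo-1.0.0.tar.gz --notes "release 1.0.0"

Uploaded assets aren't regenerated on request, so their hashes don't depend on whatever gzip GitHub happens to be running.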
FYI there's an independent motivation to upload artifacts: https://github.com/bazelbuild/bazel_metrics/issues/4
Rules ought to distribute an artifact that doesn't contain references to development-time dependencies, and omits testing code and examples.
This also means the distribution can be broken if files are accidentally left out of it.
In addition, rules ought to integration-test against all supported bazel versions. So there should be some bazel-in-bazel test that consumes the HEAD distribution artifact and tests that the examples work.
Right now there are a few ways to do this: rules_nodejs and rules_python have a built-in integration test runner, and rules_go has a special go_bazel_test rule.