bazel-contrib / SIG-rules-authors

Governance and admin for the rules authors Special Interest Group
https://bazel-contrib.github.io/SIG-rules-authors/
Apache License 2.0

Should rulesets distribute a pre-built artifact rather than rely on GitHub source/release archive #11

Closed: alexeagle closed this issue 2 years ago

alexeagle commented 2 years ago

Rules ought to distribute an artifact that doesn't contain references to development-time dependencies, and omits testing code and examples.

Because that artifact is a filtered copy of the repository, the distribution can be broken if required files are accidentally omitted.

In addition, rules ought to integration-test against all supported bazel versions. So there should be some bazel-in-bazel test that consumes the HEAD distribution artifact and tests that the examples work.

Right now there are a few ways to do this: rules_nodejs and rules_python have a built-in integration test runner, and rules_go has a special go_bazel_test rule.
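For illustration, here is a minimal sketch of the two ways a consumer can pin a ruleset; the ruleset name, version, and sha256 values are placeholders, not taken from this thread, and a real WORKSPACE would contain only one of the two definitions:

load("@bazel_tools//tools/build_defs/repo:http.bzl", "http_archive")

# Auto-generated source archive: the tarball is produced on the fly by GitHub,
# so its bytes (and therefore its sha256) are not guaranteed to stay stable.
http_archive(
    name = "my_ruleset",
    sha256 = "0000000000000000000000000000000000000000000000000000000000000000",  # placeholder
    strip_prefix = "my_ruleset-1.2.3",
    url = "https://github.com/example/my_ruleset/archive/refs/tags/v1.2.3.tar.gz",
)

# Pre-built distribution artifact uploaded to the release: the maintainer built
# and uploaded these exact bytes once, so the checksum cannot drift.
http_archive(
    name = "my_ruleset",
    sha256 = "1111111111111111111111111111111111111111111111111111111111111111",  # placeholder
    url = "https://github.com/example/my_ruleset/releases/download/v1.2.3/my_ruleset-v1.2.3.tar.gz",
)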

nekopsykose commented 1 year ago

While we are updating our SHAs, we should probably just migrate to Gitlab, right?

Anecdotal, and I can't point to anything specific right now, but I have seen some "compression changed on generated tarballs" cases on GitLab too (between some updates). I don't think any of these systems are designed with a perfect guarantee for this ("this" being hash-stable tarballs); we just got here by circumstance.

joer14 commented 1 year ago

While we are updating our SHAs, we should probably just migrate to Gitlab, right? Is that the current recommendation for people who want more reliability?

fmeum commented 1 year ago

@bk2204 https://support.github.com/ticket/personal/0/1485189 (not visible publicly) made a clear commitment that the /archive/refs/tags/$tag endpoint would provide archives with stable hashes and should be relied upon for that purpose. I specifically asked for confirmation of this twice and received it. Happy to share the full conversation I had with support.

vtbassmatt commented 1 year ago

Hey folks. I'm the product manager for Git at GitHub. We're sorry for the breakage, we're reverting the change, and we'll communicate better about such changes in the future (including timelines).

nektro commented 1 year ago

Did GitHub remove the staff tag from profiles? There are allegedly like 4 staff members in this thread and no one has the badge.

vtbassmatt commented 1 year ago

We updated our Git version which made this change for the reasons explained. At the time we didn't foresee the impact. We're quickly rolling back the change now, as it's clear we need to look at this more closely to see if we can make the changes in a less disruptive way. Thanks for letting us know.

Also re: Staff badge, here's what I see in this thread:

(screenshot)
archoversight commented 1 year ago

On the meta question, @vtbassmatt: this is what we normal users see.

(screenshot)

vtbassmatt commented 1 year ago

this is what normal users see

Huh! I don't work on frontend stuff, so that's a mystery to me 😅

jerrymarino commented 1 year ago

@vtbassmatt awesome thank you kindly ❤️

Will GitHub provide stability guarantees around the non-release tarball/zip URLs going forward?

f0rmiga commented 1 year ago

Thank you, @vtbassmatt. May I suggest a regression test for this?

jmacdonald-Ocient commented 1 year ago

@vtbassmatt For those of us who dutifully updated our checksums in response to the original change, can you give us a timeline for the rollback so we can try to time rolling back our updates? I totally understand we are in the minority and rolling back the change is the right move, but of course the new interface was live, Hyrum's law and all that.

vtbassmatt commented 1 year ago

@jerrymarino too soon to commit to anything in particular (except "we will DEFINITELY communicate better"). There are good reasons on both sides.

@jmacdonald-Ocient the rollback is imminent, just winding its way through automated testing. I don't know for sure how long it will take to show up, I'm sorry.

aneeshusa commented 1 year ago

Thanks, GitHub staff, for the quick response; looking forward to the follow-up communication. An idea for better communication going forward: could you add a hint in the UI right by the download links that points to docs on what is and isn't stable, and possibly best practices for getting stable artifacts and checksumming them? E.g. a help icon you can hover over for a tooltip, or that links to the docs.

IME this kind of contextual help is really beneficial for customers who may not follow the blog, think to search the docs, etc., since it's right at the point of use.

vlovich commented 1 year ago

If the checksum isn't stable, then once the community has migrated I would recommend injecting a random value every time, to really drive the point home so that no one reacquires an incorrect dependency. Hyrum's Law shows that documenting an interface as unstable is insufficient if in practice it's largely stable.

ilanKeshet commented 1 year ago

Hey folks. I'm the product manager for Git at GitHub. We're sorry for the breakage, we're reverting the change, and we'll communicate better about such changes in the future (including timelines).

Oh wow, what a wild ride; I'm so glad you are reverting it 🙏

Please note that not everything can easily be migrated to point at a release package; a lot of the checksum errors I've experienced were in third-party plugin/rule code we have no direct control over.

Some of it points to on-the-fly tar.gz source archives generated from specific historical revisions and the like. Obviously we'd do our part in migrating away from and upgrading such dependencies, but this is still quite a genuine threat, and one that is hard to validate.

Please take such concerns into consideration when rolling out a solution.
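A typical pin of that kind looks something like the sketch below (repository name, commit, and checksum are hypothetical); since the tar.gz is generated on demand from an arbitrary commit, there is often no tagged release to attach a pre-built artifact to:

load("@bazel_tools//tools/build_defs/repo:http.bzl", "http_archive")

# Hypothetical pin of an arbitrary historical commit: there is no release here,
# only GitHub's on-demand source archive for that commit, so a change in how
# the archive is generated breaks the build with nothing stable to migrate to.
http_archive(
    name = "some_vendored_dep",
    sha256 = "2222222222222222222222222222222222222222222222222222222222222222",  # placeholder
    strip_prefix = "some_vendored_dep-0123456789012345678901234567890123456789",
    url = "https://github.com/example/some_vendored_dep/archive/0123456789012345678901234567890123456789.tar.gz",
)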

vtbassmatt commented 1 year ago

Those files are generated new each time (with some caching - an hour I think). We told Git to use the old settings instead of its new default, so they’ll start getting generated with the old hashes again. I’m told the roll-out is complete, modulo resetting those caches.

jfirebaugh commented 1 year ago

FWIW, we're seeing more checksum mismatches in the last ~20 minutes than at any other time today.

Is it possible you invalidated a cache that was preventing some portion of artifacts from being regenerated, and now they are being regenerated before the rollback was complete?

vtbassmatt commented 1 year ago

@jfirebaugh that is indeed possible, we’ll look into it. Is it abating for you or still ongoing?

jfirebaugh commented 1 year ago

Ongoing

gregorycollins commented 1 year ago

Here too -- all of our Bazel builds died about four hours ago. I've been trying to band-aid it by updating hashes and moving repos from http_archive() to git_repository(), but we are still seeing lots of issues, since this incorrect use of http_archive() is pervasive in Bazel-land.
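As a rough illustration of that band-aid (repository name and commit are placeholders, and this trades checksum verification for independence from archive generation):

load("@bazel_tools//tools/build_defs/repo:git.bzl", "git_repository")

# Fetch over git at a pinned commit instead of downloading the generated
# tarball; the result no longer depends on how GitHub compresses archives,
# at the cost of slower fetches and no sha256 check on the download.
git_repository(
    name = "some_ruleset",
    commit = "0123456789012345678901234567890123456789",  # placeholder commit
    remote = "https://github.com/example/some_ruleset.git",
)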

fishy commented 1 year ago

FWIW, we're seeing more checksum mismatches in the last ~20 minutes than at any other time today.

Is it possible you invalidated a cache that was preventing some portion of artifacts from being regenerated, and now they are being regenerated before the rollback was complete?

Echoing this, we also see things getting worse in the past ~20min.

MkJAS commented 1 year ago

Still having issues with vcpkg and installing libs such as boost and yaml-cpp. I believe this is a related issue.

vtbassmatt commented 1 year ago

Think we found a bug in the rollback. Being fixed now.

smocherla-brex commented 1 year ago

A status page would have been easier to follow than updates on a GitHub issue.

duke-cliff commented 1 year ago

In our case, we had a rules_python mismatch, which was fixed, and everything worked for a while. But now we are getting a rules_java mismatch and everything stops working.

/root/.cache/bazel/_bazel_root/2d3430b48bd77b69b91ab356ef9daf21/external/rules_java/temp5602810203644143040/981f06c3d2bd10225e85209904090eb7b5fb26bd.tar.gz: Checksum was 01581e873735f018321d84d2e03403f37f4b05cf19d9c8036f14466e584a46b9 but wanted f5a3e477e579231fca27bf202bb0e8fbe4fc6339d63b38ccb87c2760b533d1c3

sebastianv1 commented 1 year ago

Think we found a bug in the rollback. Being fixed now.

Any update on this fix?

eli-schwartz commented 1 year ago

I have been thinking about this problem for a while, "safe and comfortable in the knowledge that it will never break". :rofl:

So, one good output of this actually occurring and then being reverted: I have gone ahead and posted an email to the git mailing list about the possible solution I've been thinking of for a while now: https://public-inbox.org/git/a812a664-67ea-c0ba-599f-cb79e2d96694@gmail.com/T/

I live in hope that we'll eventually see a world where the manpage for git-archive says "git archive is reproducible and here is why", and then no one ever has to have this debate again.

vtbassmatt commented 1 year ago

Any update on this fix?

Should be deployed now.

EDIT: I spoke too soon. It's in progress but not fully out.

sebastianv1 commented 1 year ago

Should be deployed now.

I'm still seeing the broken checksum values. Does this rollout also require waiting an hour for the caches to reset?

EDIT:

I spoke too soon. It’s in progress but not fully out.

Clicked respond right before I saw the edit. Thanks for the update!

ashleydavies commented 1 year ago

I agree with the person above saying a status page would be better than comment updates, but I think it's important to note that it's still appreciated regardless - it's vastly more helpful than radio silence, which a lot of companies and teams would be giving in a similar position right now. Thanks for keeping us updated!

pcj commented 1 year ago

@eli-schwartz Posted to HN: https://news.ycombinator.com/item?id=34588880

vtbassmatt commented 1 year ago

Sorry for the false starts above, and I appreciate everyone’s patience with me. You should start seeing the old checksums now.

duke-cliff commented 1 year ago

@vtbassmatt How does the rollback work? Do you need to literally re-generate all the affected releases, which would take a long time to finish?

gregorycollins commented 1 year ago

You should start seeing the old checksums now.

@vtbassmatt do we have to wait for a cache eviction? Still seeing bad hashes here

$ curl -sL https://github.com/madler/zlib/archive/v1.2.11.tar.gz | sha256sum
9917faf62bc29a578c79a05c07ca949cfda6e50a1c8a02db6ac30c5ea0aba7c0  -

(Bazel thinks this is supposed to be 629380c90a77b964d896ed37163f5c3a34f6e6d897311f1df2a7016355c45eff)

gumho commented 1 year ago

Doesn't look like the rollback is complete. For example, https://github.com/bazelbuild/rules_python/archive/refs/tags/0.16.1.tar.gz (https://github.com/bazelbuild/rules_python/releases/tag/0.16.1) still has the wrong (newer) checksum.

vtbassmatt commented 1 year ago

Thanks for the ping. This is unexpected and folks are looking at it immediately. I’ve got to step out of the thread now, but we do intend to revert to previous behavior.

mdouglass commented 1 year ago

@vtbassmatt I am having similar issues with GitHub Actions builds that are using npm to grab resources from GitHub. This is from about 1m ago

#14 7.808 npm WARN tarball tarball data for http2@https://github.com/node-apn/node-http2/archive/apn-2.1.4.tar.gz (sha512-ad4u4I88X9AcUgxCRW3RLnbh7xHWQ1f5HbrXa7gEy2x4Xgq+rq+auGx5I+nUDE2YYuqteGIlbxrwQXkIaYTfnQ==) seems to be corrupted. Trying again.
#14 7.913 npm ERR! code EINTEGRITY
#14 7.919 npm ERR! sha512-ad4u4I88X9AcUgxCRW3RLnbh7xHWQ1f5HbrXa7gEy2x4Xgq+rq+auGx5I+nUDE2YYuqteGIlbxrwQXkIaYTfnQ== integrity checksum failed when using sha512: wanted sha512-ad4u4I88X9AcUgxCRW3RLnbh7xHWQ1f5HbrXa7gEy2x4Xgq+rq+auGx5I+nUDE2YYuqteGIlbxrwQXkIaYTfnQ== but got sha512-GWBlkDNYgpkQElS+zGyIe1CN/XJxdEFuguLHOEGLZOIoDiH4cC9chggBwZsPK/Ls9nPikTzMuRDWfLzoGlKiRw==. (72989 bytes)

vardaro commented 1 year ago

@mdouglass It's affecting anything that pins dependencies on Github by checksum

mdouglass commented 1 year ago

@mdouglass It's affecting anything that pins dependencies on Github by checksum

Yep, my point was more that it was still happening after the supposed rollback

vlovich commented 1 year ago

Yeah. https://github.com/bazelbuild/rules_foreign_cc/archive/0.8.0.tar.gz was 6041f1374ff32ba711564374ad8e007aef77f71561a7ce784123b9b4b88614fc but it's still generating an archive that matches the same changed hash as earlier today (2fe52e77e11dc51b26e0af5834ac490699cfe6654c7c22ded55e092f0dd5fe57).

vardaro commented 1 year ago

Will this issue continue to be used for status updates on the rollback?

fishy commented 1 year ago

I still don't see these two examples (there are a lot more) going back to what the Bazel rules expect:

curl -L https://github.com/bazelbuild/rules_python/archive/refs/tags/0.8.0.tar.gz | sha256sum
curl -L https://github.com/google/go-containerregistry/archive/v0.5.1.tar.gz | sha256sum

The Bazel rules expect 9fcf91dbcc31fde6d1edb15f117246d912c33c36f44cf681976bd886538deba6 and c3e28d8820056e7cc870dbb5f18b4f7f7cbd4e1b14633a6317cef895fdb35203, but we are still getting 5c619c918959d209abd203a63e4b89d26dea8f75091e26f33f719ab52097ef68 and 3f56ff9d903d76e760620669949ddaee8760e51093f9c2913786c85242075fda.

gregorycollins commented 1 year ago

Seeing at least one correct hash now

$ curl -sL https://github.com/madler/zlib/archive/v1.2.11.tar.gz | sha256sum
629380c90a77b964d896ed37163f5c3a34f6e6d897311f1df2a7016355c45eff  -

@fishy yours seem correct now too

$ curl -sL https://github.com/bazelbuild/rules_python/archive/refs/tags/0.8.0.tar.gz | sha256sum; curl -sL https://github.com/google/go-containerregistry/archive/v0.5.1.tar.gz | sha256sum
9fcf91dbcc31fde6d1edb15f117246d912c33c36f44cf681976bd886538deba6  -
c3e28d8820056e7cc870dbb5f18b4f7f7cbd4e1b14633a6317cef895fdb35203  -

ericriff commented 1 year ago

I think the rollback is live now; some of my Conan recipes started to work again :partying_face:

eli-schwartz commented 1 year ago

[~/git/mesonbuild/wrapdb] $ wget https://github.com/abseil/abseil-cpp/archive/20220623.0.tar.gz
[~/git/mesonbuild/wrapdb] $ sha256sum abseil-cpp-20220623.0.tar.gz subprojects/packagecache/abseil-cpp-20220623.0.tar.gz
4208129b49006089ba1d6710845a45e31c59b0ab6bff9e5788a87f55c5abd602  abseil-cpp-20220623.0.tar.gz
4208129b49006089ba1d6710845a45e31c59b0ab6bff9e5788a87f55c5abd602  subprojects/packagecache/abseil-cpp-20220623.0.tar.gz

My original testcase started working too (the subprojects/packagecache/ directory is my local copy from August 2022 of the archive that a contributor posted a ticket about at https://github.com/mesonbuild/wrapdb/pull/884).

haikuginger commented 1 year ago

I'm seeing the expected hash on my rdkafka download:

curl -sL https://github.com/confluentinc/librdkafka/archive/v1.8.2.tar.gz | sha256sum
6a747d293a7a4613bd2897e28e8791476fbe1ae7361f2530a876e0fd483482a6  -

aherrmann commented 1 year ago

@vtbassmatt Thank you for handling this issue and reverting the change.

Could you clarify whether the commitment to stable release artifacts, as mentioned here and here, will be upheld going forward after this revert, or whether it may still change in the future?

cloudhan commented 1 year ago

You need to upload the tarball during release creation; see https://github.com/bazel-contrib/rules_cuda/issues/56#issuecomment-1367715605. We also experienced a checksum change somehow.

f0rmiga commented 1 year ago

The template for rules has been changed to publish a tarball into the release instead of relying on GitHub to provide a stable interface. Ref https://github.com/bazel-contrib/rules-template/pull/44.
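One way to produce such a tarball from within the ruleset itself (not necessarily how rules-template does it; target and file names below are hypothetical) is a rules_pkg target that maintainers build and upload as a release asset:

load("@rules_pkg//pkg:tar.bzl", "pkg_tar")

# Hypothetical distribution target: maintainers run `bazel build //:dist`,
# then upload the resulting tar.gz to the GitHub release, so consumers fetch
# fixed bytes from releases/download/... instead of a generated source archive.
pkg_tar(
    name = "dist",
    srcs = ["//:release_files"],  # placeholder filegroup of files to ship
    extension = "tar.gz",
    strip_prefix = ".",
)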

alexeagle commented 1 year ago

FYI there's an independent motivation to upload artifacts: https://github.com/bazelbuild/bazel_metrics/issues/4