cargo-bins / cargo-quickinstall

pre-compiled binary packages for `cargo install`
Apache License 2.0

402 Payment Required #268

Open paul-hansen opened 3 weeks ago

paul-hansen commented 3 weeks ago

Edit: This warning is just regarding telemetry and any longer build times are unrelated. For me it was building from source due to a new version of the crate just being released.

Getting this warning just now in our CI

Run cargo binstall -y --force cargo-leptos
 INFO resolve: Resolving package: 'cargo-leptos'
 WARN Failed to send quickinstall report for package cargo-leptos-0.2.19-x86_64-unknown-linux-gnu: Failed to download from remote: could not HEAD https://warehouse-clerk-tmp.vercel.app/api/crate/cargo-leptos-0.2.19-x86_64-unknown-linux-gnu.tar.gz: HTTP status client error (402 Payment Required) for url (https://warehouse-clerk-tmp.vercel.app/api/crate/cargo-leptos-0.2.19-x86_64-unknown-linux-gnu.tar.gz)
 WARN Failed to send quickinstall report for package cargo-leptos-0.2.19-x86_64-unknown-linux-musl: Failed to download from remote: could not HEAD https://warehouse-clerk-tmp.vercel.app/api/crate/cargo-leptos-0.2.19-x86_64-unknown-linux-musl.tar.gz: HTTP status client error (402 Payment Required) for url (https://warehouse-clerk-tmp.vercel.app/api/crate/cargo-leptos-0.2.19-x86_64-unknown-linux-musl.tar.gz)

It then builds it from source. It worked earlier today without this error.

NobodyXu commented 3 weeks ago

Thanks for reporting!

cc @alsuren Did we hit the rate limit of stats collection?

alsuren commented 3 weeks ago

Yeah, vercel said we hit our http handler time limit. We could probably talk to vercel and get that bumped.

NobodyXu commented 3 weeks ago

Thank you!

That's good to hear!

paul-hansen commented 3 weeks ago

Seems to be working again, closing. Thanks!

Feel free to reopen if you want to use this issue as a reminder to ask them to bump the limit if you haven't yet or anything.

jimeh commented 3 weeks ago

I just started getting the 402 Payment Required errors again.

(screenshot: CleanShot 2024-08-23 at 16 41 39@2x)

s373r commented 2 weeks ago

Also ran into this problem. A clipping of logs that you may find useful:

...
+ ./cargo-binstall -y --force cargo-binstall
 INFO resolve: Resolving package: 'cargo-binstall'
 WARN Failed to send quickinstall report for package cargo-binstall-1.10.3-x86_64-apple-darwin: Failed to download from remote: could not HEAD https://warehouse-clerk-tmp.vercel.app/api/crate/cargo-binstall-1.10.3-x86_64-apple-darwin.tar.gz: HTTP status client error (402 Payment Required) for url (https://warehouse-clerk-tmp.vercel.app/api/crate/cargo-binstall-1.10.3-x86_64-apple-darwin.tar.gz)
 INFO has_release_artifact{release=GhRelease { repo: GhRepo { owner: "cargo-bins", repo: "cargo-binstall" }, tag: "v1.10.3" } artifact_name="cargo-binstall-x86_64-apple-darwin.zip"}:do_send_request{request=Request { method: GET, url: Url { scheme: "https", cannot_be_a_base: false, username: "", password: None, host: Some(Domain("api.github.com")), port: None, path: "/repos/cargo-bins/cargo-binstall/releases/tags/v1.10.3", query: None, fragment: None }, headers: {"accept": "application/vnd.github+json", "x-github-api-version": "2022-11-28"} } url=https://api.github.com/repos/cargo-bins/cargo-binstall/releases/tags/v1.10.3}: Received status code 403 Forbidden, will wait for 120s and retry
 INFO has_release_artifact{release=GhRelease { repo: GhRepo { owner: "cargo-bins", repo: "cargo-binstall" }, tag: "v1.10.3" } artifact_name="cargo-binstall-x86_64-apple-darwin.zip"}:do_send_request{request=Request { method: GET, url: Url { scheme: "https", cannot_be_a_base: false, username: "", password: None, host: Some(Domain("api.github.com")), port: None, path: "/repos/cargo-bins/cargo-binstall/releases/tags/v1.10.3", query: None, fragment: None }, headers: {"accept": "application/vnd.github+json", "x-github-api-version": "2022-11-28"} } url=https://api.github.com/repos/cargo-bins/cargo-binstall/releases/tags/v1.10.3}: Received status code 403 Forbidden, will wait for 120s and retry
 WARN resolve: Timeout reached while checking fetcher invalid url: deadline has elapsed
 INFO has_release_artifact{release=GhRelease { repo: GhRepo { owner: "cargo-bins", repo: "cargo-binstall" }, tag: "v1.10.3" } artifact_name="cargo-binstall-universal-apple-darwin.tbz2"}:do_send_request{request=Request { method: GET, url: Url { scheme: "https", cannot_be_a_base: false, username: "", password: None, host: Some(Domain("api.github.com")), port: None, path: "/repos/cargo-bins/cargo-binstall/releases/tags/v1.10.3", query: None, fragment: None }, headers: {"accept": "application/vnd.github+json", "x-github-api-version": "2022-11-28"} } url=https://api.github.com/repos/cargo-bins/cargo-binstall/releases/tags/v1.10.3}: Received status code 403 Forbidden, will wait for 120s and retry
 WARN resolve: Timeout reached while checking fetcher invalid url: deadline has elapsed
 INFO has_release_artifact{release=GhRelease { repo: GhRepo { owner: "cargo-bins", repo: "cargo-binstall" }, tag: "v1.10.3" } artifact_name="cargo-binstall-universal2-apple-darwin.tbz2"}:do_send_request{request=Request { method: GET, url: Url { scheme: "https", cannot_be_a_base: false, username: "", password: None, host: Some(Domain("api.github.com")), port: None, path: "/repos/cargo-bins/cargo-binstall/releases/tags/v1.10.3", query: None, fragment: None }, headers: {"accept": "application/vnd.github+json", "x-github-api-version": "2022-11-28"} } url=https://api.github.com/repos/cargo-bins/cargo-binstall/releases/tags/v1.10.3}: Received status code 403 Forbidden, will wait for 120s and retry
 INFO has_release_artifact{release=GhRelease { repo: GhRepo { owner: "cargo-bins", repo: "cargo-binstall" }, tag: "v1.10.3" } artifact_name="cargo-binstall-universal2-apple-darwin.tbz2"}:do_send_request{request=Request { method: GET, url: Url { scheme: "https", cannot_be_a_base: false, username: "", password: None, host: Some(Domain("api.github.com")), port: None, path: "/repos/cargo-bins/cargo-binstall/releases/tags/v1.10.3", query: None, fragment: None }, headers: {"accept": "application/vnd.github+json", "x-github-api-version": "2022-11-28"} } url=https://api.github.com/repos/cargo-bins/cargo-binstall/releases/tags/v1.10.3}: Received status code 403 Forbidden, will wait for 120s and retry
 WARN resolve: Timeout reached while checking fetcher invalid url: deadline has elapsed
 INFO has_release_artifact{release=GhRelease { repo: GhRepo { owner: "cargo-bins", repo: "cargo-quickinstall" }, tag: "cargo-binstall-1.10.3" } artifact_name="cargo-binstall-1.10.3-x86_64-apple-darwin.tar.gz"}:do_send_request{request=Request { method: GET, url: Url { scheme: "https", cannot_be_a_base: false, username: "", password: None, host: Some(Domain("api.github.com")), port: None, path: "/repos/cargo-bins/cargo-quickinstall/releases/tags/cargo-binstall-1.10.3", query: None, fragment: None }, headers: {"accept": "application/vnd.github+json", "x-github-api-version": "2022-11-28"} } url=https://api.github.com/repos/cargo-bins/cargo-quickinstall/releases/tags/cargo-binstall-1.10.3}: Received status code 403 Forbidden, will wait for 120s and retry
 WARN resolve: Timeout reached while checking fetcher QuickInstall: deadline has elapsed
 WARN The package cargo-binstall v1.10.3 will be installed from source (with cargo)
...

(c) https://github.com/kamu-data/kamu-cli/actions/runs/10579077977/job/29310649600?pr=795#step:6:32

NobodyXu commented 2 weeks ago

Thanks, this is simply because we are hitting the rate limit of vercel

It's probably because cargo-binstall now reports for every available target on the machine.

paul-hansen commented 2 weeks ago

> Thanks, this is simply because we are hitting the rate limit of vercel
>
> It's probably because cargo-binstall now reports for every available target on the machine.

It seems like it's causing it to build from source instead of downloading a binary, is this expected? I haven't looked at the code yet, just you calling it "reporting" makes me wonder if it's something we could skip if it fails but still download a binary.

NobodyXu commented 2 weeks ago

> It seems like it's causing it to build from source instead of downloading a binary, is this expected?

No, it isn't. The message is a bit confusing, but failing telemetry does not have any influence over resolution.

In this case it's due to a timeout, because binstall reaches the GitHub API rate limit (the 403 Forbidden responses in the log).

Providing a github-token would fix it.
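
For readers wondering why a token helps: unauthenticated GitHub API requests share a much smaller per-IP quota than authenticated ones, and shared CI runners tend to exhaust it quickly. The sketch below is not binstall's implementation; it is a minimal illustration, assuming a hypothetical helper named `github_headers`, of attaching a token from the `GITHUB_TOKEN` environment variable to the same request headers that appear in the log lines above.

```python
import os

# Header values copied from the binstall log lines in this thread.
BASE_HEADERS = {
    "accept": "application/vnd.github+json",
    "x-github-api-version": "2022-11-28",
}


def github_headers(env=None):
    """Return request headers, adding a bearer token when GITHUB_TOKEN is set.

    `github_headers` is a hypothetical helper for illustration only.
    """
    env = os.environ if env is None else env
    headers = dict(BASE_HEADERS)
    token = env.get("GITHUB_TOKEN", "").strip()
    if token:
        # Authenticated requests count against the token's own quota
        # instead of the quota shared by everyone on the same IP.
        headers["authorization"] = f"Bearer {token}"
    return headers


if __name__ == "__main__":
    # Only report whether a token would be sent; never print the token itself.
    print("authorization" in github_headers())
```

In CI this usually means exporting `GITHUB_TOKEN` in the job environment before running cargo-binstall, as confirmed later in the thread.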

paul-hansen commented 2 weeks ago

Ah, so I'm guessing the long compile times I was seeing were just because there was a new version of cargo-leptos, so a binary hadn't been generated for it yet. I had assumed the warning was related, as if it couldn't let quickinstall know about the new version to build or something.

I'll add a note to the issue description to let users know it's just telemetry and any longer build times are unrelated.

Oliboy50 commented 2 weeks ago

What is worse than this issue is that even getting the binary from source does not work:

 INFO has_release_artifact{release=GhRelease { repo: GhRepo { owner: "cargo-lambda", repo: "cargo-lambda" }, tag: "v1.3.0" } artifact_name="cargo-lambda-v1.3.0.aarch64-apple-darwin.tar.gz"}:do_send_request{request=Request { method: GET, url: Url { scheme: "https", cannot_be_a_base: false, username: "", password: None, host: Some(Domain("api.github.com")), port: None, path: "/repos/cargo-lambda/cargo-lambda/releases/tags/v1.3.0", query: None, fragment: None }, headers: {"accept": "application/vnd.github+json", "x-github-api-version": "2022-11-28"} } url=https://api.github.com/repos/cargo-lambda/cargo-lambda/releases/tags/v1.3.0}: Received status code 403 Forbidden, will wait for 120s and retry

 # waiting 120s....

 INFO has_release_artifact{release=GhRelease { repo: GhRepo { owner: "cargo-lambda", repo: "cargo-lambda" }, tag: "v1.3.0" } artifact_name="cargo-lambda-v1.3.0.aarch64-apple-darwin.tar.gz"}:do_send_request{request=Request { method: GET, url: Url { scheme: "https", cannot_be_a_base: false, username: "", password: None, host: Some(Domain("api.github.com")), port: None, path: "/repos/cargo-lambda/cargo-lambda/releases/tags/v1.3.0", query: None, fragment: None }, headers: {"accept": "application/vnd.github+json", "x-github-api-version": "2022-11-28"} } url=https://api.github.com/repos/cargo-lambda/cargo-lambda/releases/tags/v1.3.0}: Received status code 403 Forbidden, will wait for 120s and retry

 # waiting a bit more...

 WARN resolve: Timeout reached while checking fetcher invalid url: deadline has elapsed

NobodyXu commented 2 weeks ago

In this case you could either

Our default timeout is probably a bit too large; we should set it to something smaller, e.g. 30s.

Oliboy50 commented 2 weeks ago

@NobodyXu thank you, it worked pretty well after setting a GITHUB_TOKEN env var 🙇

jimeh commented 2 weeks ago

Possibly a dumb question, as I'm not super familiar with exactly what kind of metrics are collected via the vercel app. But since all downloads happen via GitHub releases, download counts can at least be fetched via GitHub's API for each release. That could be done as a scheduled GitHub action that just iterates over all releases and gathers the artifact download counts. Admittedly, with the number of releases you have, it could be quite a slow job if it is to avoid hitting API rate limits.

¯\_(ツ)_/¯... Just throwing it out there in case it might be useful and/or spark some ideas.
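
A rough sketch of the counting step in that idea, assuming the releases have already been fetched and parsed from GitHub's "list releases" endpoint, whose JSON carries a `tag_name` per release and a `download_count` per asset. The function names and the sample numbers are made up for illustration; fetching and pagination are left out.

```python
def download_counts(releases):
    """Map each release tag to the summed download_count of its assets."""
    counts = {}
    for release in releases:
        assets = release.get("assets", [])
        counts[release["tag_name"]] = sum(
            asset.get("download_count", 0) for asset in assets
        )
    return counts


def most_downloaded(releases, n=3):
    """Return the n (tag, downloads) pairs with the most downloads."""
    counts = download_counts(releases)
    # Sort by downloads descending, then by tag so ties are deterministic.
    return sorted(counts.items(), key=lambda kv: (-kv[1], kv[0]))[:n]


# Made-up sample shaped like the GitHub releases payload.
sample = [
    {
        "tag_name": "cargo-binstall-1.10.3",
        "assets": [
            {"name": "cargo-binstall-1.10.3-x86_64-apple-darwin.tar.gz", "download_count": 40},
            {"name": "cargo-binstall-1.10.3-x86_64-unknown-linux-gnu.tar.gz", "download_count": 2},
        ],
    },
    {
        "tag_name": "cargo-leptos-0.2.19",
        "assets": [
            {"name": "cargo-leptos-0.2.19-x86_64-unknown-linux-gnu.tar.gz", "download_count": 100},
        ],
    },
]

if __name__ == "__main__":
    print(most_downloaded(sample, n=1))
```

Note that this only measures demand for packages that already have releases, which is the limitation raised in the next reply.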

NobodyXu commented 2 weeks ago

Hmmm that's an interesting idea.

What quickinstall needs, though, is to know which software people want that we aren't providing yet, so that we can build it and provide pre-built binaries for it.

We have a script for fetching popular crates from lib.rs, but it's not used in CI: https://github.com/cargo-bins/cargo-quickinstall/blob/main/get-popular-crates.sh

I suppose we could use that instead

NobodyXu commented 1 week ago

https://github.com/cargo-bins/cargo-quickinstall/blob/48af7959974023af14eddfe357f0ddaa7a952414/next-unbuilt-package.sh#L73

Turns out that we do fetch popular crates from https://lib.rs

alsuren commented 1 week ago

Sorry for the delay in looking at this.

Looks like they paused my account a couple of weeks ago for repeatedly going over the free tier allowance. I assume this is because of a change in how spammy cargo-binstall is with its stats?

(screenshot: Screenshot 2024-09-04 at 11 40 51)

I have sent $20 in vercel's direction to unblock things. The price appears to be per-user-per-month rather than usage-related. This means that if I want to let anyone else help with the ops side of a stats server on vercel, it would be another $20/month per user? This does not fill me with joy.

I will send a link to this thread to their support people and see whether they have an open source tier or something.

In the next month, I think we need to do at least one of the following:

1) get our stats volumes down to reasonable levels again
   i) could we delay reporting until we have a cache miss (and do it while we're compiling the crate)?
2) find some other hosting that allows multiple admins for debugging, or get vercel to fund us for that.
   i) if we're sticking with vercel, we should probably change the URL to something more on-brand?
   ii) I heard a rumour that cloudflare/fastly/... have open source hosting tiers, but I can't find the info right now.
3) make the stats reporting protocol and gathering code more maintainable
   i) I've been wanting to rewrite all of the cronjob code using #!/usr/bin/env -S cargo +nightly -Zscript or something for a while now.
   ii) We should really be using ?query=params or something rather than the tarball name for the http request, so we can encode more information (including sending multiple architectures in the same http request?)
   iii) We probably shouldn't be using redis for stats storage because encoding multiple facets in it is a right pain. The influxdb setup that I have been playing about with is pretty good for this (screenshot: Screenshot 2024-09-04 at 12 15 43). For maintainers: https://eu-central-1-1.aws.cloud2.influxdata.com/orgs/69235d4f38c3e042/dashboards/0d9b2d5b2b13c000?lower=now%28%29+-+30d (reply here with your email address if you want access).
   iv) Should the stats server and stats reporting code live in the same place? Should we move it all to the cargo-binstall repo or something?
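
To make the ?query=params idea concrete, here is a minimal sketch of how one report could carry several targets in a single request instead of one HEAD request per tarball name. Everything here is an assumption for illustration: the endpoint URL, the parameter names (`crate`, `version`, `target`, `status`) and both helper functions are hypothetical and not part of any existing protocol.

```python
from urllib.parse import parse_qs, urlencode, urlsplit

# Hypothetical endpoint; the real stats server URL is not implied here.
STATS_ENDPOINT = "https://stats.example.invalid/api/report"


def build_report_url(crate, version, targets, status):
    """Encode one report for several targets as repeated `target` params."""
    query = urlencode(
        {"crate": crate, "version": version, "target": targets, "status": status},
        doseq=True,  # expand the list into target=...&target=...
    )
    return f"{STATS_ENDPOINT}?{query}"


def parse_report_url(url):
    """Server-side counterpart: recover the fields from the query string."""
    params = parse_qs(urlsplit(url).query)
    return {
        "crate": params["crate"][0],
        "version": params["version"][0],
        "targets": params["target"],
        "status": params["status"][0],
    }


if __name__ == "__main__":
    url = build_report_url(
        "cargo-leptos",
        "0.2.19",
        ["x86_64-unknown-linux-gnu", "x86_64-unknown-linux-musl"],
        "cache-miss",
    )
    print(url)
    print(parse_report_url(url))
```

With this shape, the two warnings in the original report (gnu and musl) would collapse into one request, which halves the handler invocations for that machine.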

NobodyXu commented 1 week ago

> i) could we delay reporting until we have a cache miss (and do it while we're compiling the crate)?

That's definitely doable in cargo-binstall

NobodyXu commented 1 week ago

For the stats reporting part, I'm also working on using the crates.io daily db snapshot.

I'm currently trying to put together a python script that uses polars to do it.

Since it's all csv and we only care about the top n popular binary crates, it should definitely be doable.

NobodyXu commented 1 week ago

Using data from https://static.crates.io/db-dump.tar.gz

I was able to write a python script for getting top 2000 popular binary crates:

# execute this in 20xx-xx-xx-xxxxxx/data/
import polars as pl

top_crates = (
    # per-crate download totals
    pl.scan_csv("crate_downloads.csv")
    # attach the crate name
    .join(pl.scan_csv("crates.csv").select("id", "name"), left_on="crate_id", right_on="id")
    # attach each crate's default version id
    .join(pl.scan_csv("default_versions.csv"), on="crate_id")
    # attach the yanked flag and binary names of that default version
    .join(
        pl.scan_csv("versions.csv").select("id", "crate_id", "yanked", "bin_names"),
        left_on=("crate_id", "version_id"),
        right_on=("crate_id", "id"),
    )
    # keep only non-yanked versions that ship at least one binary
    .filter(pl.col("bin_names") != "{}", pl.col("yanked") == "f")
    .sort(by="downloads", descending=True)
    .select("name")
    .head(2000)
    .collect(streaming=True)
)
print(top_crates)

It can be combined with a git pull --tags to check whether a crate is already built, the crate exclusion list for specific targets, and the randomised pick part.
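
A minimal sketch of that combination step, under stated assumptions: the popular list is available as (name, version) pairs, the set of existing tags comes from something like `git tag --list` after `git pull --tags`, and tags follow the `<crate>-<version>` form seen in the logs above (for example `cargo-binstall-1.10.3`). The function names are hypothetical, and this ignores per-target artifacts within a release, so it is a simplification of whatever next-unbuilt-package.sh actually does.

```python
import random


def release_tag(name, version):
    """Tag format seen in the logs above, e.g. cargo-binstall-1.10.3."""
    return f"{name}-{version}"


def unbuilt_candidates(popular, built_tags, excluded=frozenset()):
    """Filter popular (name, version) pairs down to ones still worth building.

    `excluded` is the exclusion list for the target currently being built.
    """
    built = set(built_tags)
    return [
        (name, version)
        for name, version in popular
        if name not in excluded and release_tag(name, version) not in built
    ]


def pick_next(popular, built_tags, excluded=frozenset(), rng=None):
    """Randomly pick one unbuilt crate, or None when nothing is left."""
    candidates = unbuilt_candidates(popular, built_tags, excluded)
    if not candidates:
        return None
    chooser = rng if rng is not None else random
    return chooser.choice(candidates)


if __name__ == "__main__":
    popular = [("crate-a", "1.0.0"), ("crate-b", "2.0.0"), ("crate-c", "3.0.0")]
    print(pick_next(popular, built_tags=["crate-a-1.0.0"], excluded={"crate-c"}))
```

Picking at random rather than always taking the most popular unbuilt crate avoids getting stuck retrying a crate that repeatedly fails to build.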

I think it'd be a pretty good replacement for existing telemetry?

NobodyXu commented 1 week ago

cc @alsuren how does that look to you?

It's true that crates.io does not collect target info, so perhaps we should just build it for every target we support.

alsuren commented 1 week ago

Looks good. I will try rewriting a bunch of this bash nonsense in python rather than rust. The crates.io dump is a bit huge, so I will make a weekly cronjob to dump the Sunday somewhere.

How do we feel about using uv for managing our python environment? I have had some success using it in another project of mine: https://github.com/alsuren/sixdofone/pull/8/files

NobodyXu commented 1 week ago

> The crates.io dump is a bit huge, so I will make a weekly cronjob to dump the Sunday somewhere.

I think a daily cronjob makes more sense?

The crates.io dump is updated every day, and with it we could avoid hitting the crates.io API so often.

> How do we feel about using uv for managing our python environment? I have had some success using it in another project of mine: https://github.com/alsuren/sixdofone/pull/8/files

Using uv makes sense to me, though I'd like dependabot to be enabled for it, and to have some CI to ensure it works.