Fail to send report - Githubissues

NobodyXu commented 1 year ago

 WARN Failed to send quickinstall report for package cargo-binstall-0.20.1-universal-apple-darwin: Failed 
to download from remote: could not HEAD https://warehouse-clerk-tmp.vercel.app/api/crate/cargo-binstall-0.
20.1-universal-apple-darwin.tar.gz: HTTP status client error (403 Forbidden) for url (https://warehouse-cl
erk-tmp.vercel.app/api/crate/cargo-binstall-0.20.1-universal-apple-darwin.tar.gz)

NobodyXu commented 1 year ago

I suppose that #164 might fix this.

NobodyXu commented 1 year ago

@alsuren I've been getting a lot of these errors recently on my MacBook. Can you check the stats server please?

Is it out of resource or down?

alsuren commented 1 year ago

Last rate limit email I got from upstash was the 10th.

I still haven't set up forwarding for them. Can't do it on my phone though.

The stats reporting should be forked off in the background. If reporting fails then you should continue on and ignore it. It should not be printing whether it passed out failed by default.

On Tue, 14 Mar 2023, 09:54 Jiahao XU, @.***> wrote:

@alsuren https://github.com/alsuren I've been getting a lot of these errors recently on my MacBook. Can you check the stats server please?

Is it out of resource or down?

— Reply to this email directly, view it on GitHub https://github.com/cargo-bins/cargo-quickinstall/issues/195#issuecomment-1467663419, or unsubscribe https://github.com/notifications/unsubscribe-auth/AAB6FN32AI37WPBKPXCDTN3W4AW3FANCNFSM6AAAAAAVYP6KIA . You are receiving this because you were mentioned.Message ID: @.***>

NobodyXu commented 1 year ago

The stats reporting should be forked off in the background. If reporting fails then you should continue on and ignore it.

That's exactly how cargo-binstall does the report.

It should not be printing whether it passed out failed by default.

It's just a warning from cargo-binstall on failure.

alsuren commented 1 year ago

universal-apple-darwin is not in the list in https://github.com/alsuren/warehouse-clerk-tmp/blob/master/pages/api/crate/%5Btarball%5D.ts

Is it a new thing?

You can also get the body of the response and print it if you want more info (I'm currently on my phone, but I can guess what it would say)

NobodyXu commented 1 year ago

Yes, it's a new target we add for cargo-binstall.

Currently the upstream has not added official support yet and it seems that they are not planning to add a new target for this, but I also don't think they will use universal-apple-darwin for anything else, so we use that name in cargo-binstall for universal MacOS binaries that includes both x86_64 and aarch64 binaries.

https://github.com/rust-lang/cargo/issues/8875

alsuren commented 1 year ago

Ah.

I think that binstall should probably not look in the quickinstall archives for this architecture then. If it is not in rustup target list (or however you spell it) then quickinstall will never have a built package for it.

Relatedly: I noticed in the stats that binstall often logs requests for a handful of architectures (normally glibc and musl). Do you try them all in parallel or each in series (once you failed to download a more preferred arch)?

On Wed, 15 Mar 2023, 05:47 Jiahao XU, @.***> wrote:

Yes, it's a new target we add for cargo-binstall.

Currently the upstream has not added official support yet and it seems that they are not planning to add a new target for this, but I also don't think they will use universal-apple-darwin for anything else, so we use that name in cargo-binstall for universal MacOS binaries that includes both x86_64 and aarch64 binaries.

rust-lang/cargo#8875 https://github.com/rust-lang/cargo/issues/8875

— Reply to this email directly, view it on GitHub https://github.com/cargo-bins/cargo-quickinstall/issues/195#issuecomment-1469324130, or unsubscribe https://github.com/notifications/unsubscribe-auth/AAB6FNY4RAPD57UNKWZUE6LW4FCXDANCNFSM6AAAAAAVYP6KIA . You are receiving this because you were mentioned.Message ID: @.***>

NobodyXu commented 1 year ago

Ah.

I think that binstall should probably not look in the quickinstall archives for this architecture then. If it is not in rustup target list (or however you spell it) then quickinstall will never have a built package for it.

Got it, I can add this as a special case to our QuickInstall fetcher since this is the only target we invented.

Relatedly: I noticed in the stats that binstall often logs requests for a handful of architectures (normally glibc and musl). Do you try them all in parallel or each in series (once you failed to download a more preferred arch)?

cargo-binstall's resolution process is actually made up of two parts for each crate:

Call Fetcher::find on each fetcher for each target, which returns an AutoAbortJoinHandle, a new type over tokio::task::JoinHandle that cancel the task on drop, collect the handles into a Vec
Iterate over the Vec, await on the handle. If it returns error, propagate. If it returns Ok(true), then call Fetcher::download and check whether the archive contains all binaries required (all binaries not gated behind any feature). If it contains all the bins required, resolution done. Otherwise, on Ok(false) or missing bins, proceed the iteration.

The first process is done in parallel and the second is done in serial.

The quickinstall report is done in ::find, which is why binstall logs for both glibc and musl.

I could add a new API Fetcher::report_failure that will be called if Fetcher::find returns Ok(false) on the second step.

alsuren commented 1 year ago

That makes sense. It also explains why you blast through your GitHub rate limit so quickly.

Brain dump of unrelated ideas follows:

A) When the list of requested packages*fetchers is long, it might be interesting to limit the concurrency a bit, by

making a list containing the preferred fetcher for all packages first, followed by the next best fetcher for each package etc
Pulling from it with bounded concurrency primitive like stream::buffered https://docs.rs/futures/latest/futures/stream/trait.StreamExt.html#method.buffered or a semaphore or workerpool
Picking the first result from the list for each package and dropping the rest (some of which wouldn't have even been spawned yet, of they are far enough down the list)

(You would also need a mechanism for a successful fetcher search step for $package to cancel any lower priority fetchers for that package)

B) if each quickinstall fetcher was run strictly in series then its search step could be a noop (always return "maybe") and the download step could do the fetch of the package tarball at that point only (and also report stats at this point). This would drastically reduce the number of http fetches you make to GitHub releases, because you don't do any of the HEAD requests, and in the happy case you only do one GET.

C) I would find it really interesting if we could report which fetcher+arch binstall used for each package. The new stats server is much easier to extend than the old one (just add arbitrary query params and they will be added to the influxdb stats). It might be better to delay stats reporting until after the find calls have resolved, and then send a single report. We can design this properly once I have finished building it (no promises about when that will happen though, because I'm a bit snowed under at the moment with work and life things).

On Wed, 15 Mar 2023, 09:00 Jiahao XU, @.***> wrote:

Ah.

I think that binstall should probably not look in the quickinstall archives for this architecture then. If it is not in rustup target list (or however you spell it) then quickinstall will never have a built package for it.

Got it, I can add this as a special case to our QuickInstall fetcher since this is the only target we invented.

Relatedly: I noticed in the stats that binstall often logs requests for a handful of architectures (normally glibc and musl). Do you try them all in parallel or each in series (once you failed to download a more preferred arch)?

cargo-binstall's resolution process is actually made up of two parts for each crate:

Call Fetcher::find on each fetcher for each target, which returns an AutoAbortJoinHandle, a new type over tokio::task::JoinHandle that cancel the task on drop, collect the handles into a Vec

Iterate over the Vec, await on the handle. If it returns error, propagate. If it returns Ok(true), then call Fetcher::download and check whether the archive contains all binaries required (all binaries not gated behind any feature). If it contains all the bins required, resolution done. Otherwise, on Ok(false) or missing bins, proceed the iteration.

The first process is done in parallel and the second is done in serial.

The quickinstall report is done in ::find, which is why binstall logs for both glibc and musl.

I could add a new API Fetcher::report_failure that will be called if Fetcher::find returns Ok(false) on the second step.

— Reply to this email directly, view it on GitHub https://github.com/cargo-bins/cargo-quickinstall/issues/195#issuecomment-1469524686, or unsubscribe https://github.com/notifications/unsubscribe-auth/AAB6FN54ACQYZ6UESWX6M4LW4FZIXANCNFSM6AAAAAAVYP6KIA . You are receiving this because you were mentioned.Message ID: @.***>

NobodyXu commented 1 year ago

making a list containing the preferred fetcher for all packages first, followed by the next best fetcher for each package etc

Pulling from it with bounded concurrency primitive like stream::buffered https://docs.rs/futures/latest/futures/stream/trait.StreamExt.html#method.buffered or a semaphore or workerpool

Thanks, that's a good advice though I'm not sure how to implement this. At the end of the day they are all just futures polled by tokio.

Our current solution of using GitHub Restful API already reduces the http requests created for GhCrateMeta to at most 2 per crate, due to trying different release tag in automatic discovery, but if the pkg-url is override, then it could be done using only 1 http request per crate.

For QuickInstall, it only creates one http request to GitHub per crate but would still hit the stats server with multiple http requests, which I will later submit a PR to delay report after Fetcher::find is called.

Picking the first result from the list for each package and dropping the rest (some of which wouldn't have even been spawned yet, of they are far enough down the list)

(You would also need a mechanism for a successful fetcher search step for $package to cancel any lower priority fetchers for that package)

We already have this mechanism in place using AutoAbortJoinHandle, which cancels the spawned task on drop.

B) if each quickinstall fetcher was run strictly in series then its search step could be a noop (always return "maybe") and the download step could do the fetch of the package tarball at that point only (and also report stats at this point). This would drastically reduce the number of http fetches you make to GitHub releases, because you don't do any of the HEAD requests, and in the happy case you only do one GET.

With GitHub Restful API client in place, we don't need to run it in serial anymore since one API call to GitHub would return all release artifacts in one http call for the entire crate.

It might be better to delay stats reporting until after the find calls have resolved, and then send a single report.

Yes this is exactly what I am going to do.

cargo-bins / cargo-quickinstall

Fail to send report #195