brioche-dev / brioche

A delicious package manager
https://brioche.dev
MIT License

Network errors when installing hello world #52

Closed: matklad closed this issue 5 months ago

matklad commented 5 months ago

If I try

cargo run -r -- install -r hello_world

I get a timeout:

        ERROR run_install:bake:bake_inner:run_bake: brioche_core::bake: error=Request error: error sending request for url (https://registry.brioche.dev/v0/blobs/85b898d9dda3158822f5a0c25e71bc5dceb83c06620b4a0f7bf40370a362556f.zst?brioche=0.1.0): error sending request for url (https://registry.brioche.dev/v0/blobs/85b898d9dda3158822f5a0c25e71bc5dceb83c06620b4a0f7bf40370a362556f.zst?brioche=0.1.0): operation timed out scope=Project { project_hash: ProjectHash(Hash("d0aeff98c15b839e2d99e04c023e84414c48e249eb6e2c695696685c42e3d355")), export: "default" } recipe_hash=39cf103288f270b8c774b4d9c5d5b4d10fff20c5da3faf8a0e2721257cc4773e recipe_kind=Process recipe_hash=39cf103288f270b8c774b4d9c5d5b4d10fff20c5da3faf8a0e2721257cc4773e recipe_kind=Process

If I manually bump the in-code timeouts from 10s to 60s, I then get an error about "temporary DNS failure" (sadly, I lost the exact text of the error somewhere in the git history).

kylewlacy commented 5 months ago

Huh, interesting... I hadn't seen that particular error for the handful of times I've tested from a fresh environment (although I've only really tested on Ubuntu and my good home internet connection). I remember you mentioned that you were using NixOS, so I might see if I could reproduce through that, but I have a few questions if you feel comfortable answering:

All of the downloads are handled with Reqwest, and the blob fetches specifically don't have a global timeout set (they do have a connect timeout and a read timeout at least). The registry is hosted on Fly.io, but the blobs get redirected and served from Cloudflare R2.
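To make the distinction between a connect timeout and a read timeout concrete, here is a minimal sketch using only the Rust standard library. It is not Brioche's code (which uses Reqwest), and the function name `read_with_timeouts` is hypothetical. It shows that each limit covers a different phase of the request, and that neither one bounds the total time a transfer can take.

```rust
use std::io::{self, Read};
use std::net::{SocketAddr, TcpStream};
use std::time::Duration;

/// Hypothetical helper: performs one read from `addr` with separate limits on
/// connecting and on the read itself. No overall deadline is applied.
fn read_with_timeouts(
    addr: SocketAddr,
    connect_timeout: Duration,
    read_timeout: Duration,
    buf: &mut [u8],
) -> io::Result<usize> {
    // Connect timeout: fails if the TCP handshake does not complete in time.
    let mut stream = TcpStream::connect_timeout(&addr, connect_timeout)?;

    // Read timeout: applies to each individual read() call, so a slow but
    // steady stream of bytes can still take arbitrarily long in total.
    stream.set_read_timeout(Some(read_timeout))?;

    // If no bytes arrive within `read_timeout`, this returns an error whose
    // kind is WouldBlock or TimedOut, depending on the platform.
    stream.read(buf)
}
```

With limits like these, a server that accepts the connection quickly but takes a long time to send the first byte (for example during a cold boot) would trip the read timeout rather than the connect timeout.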

kylewlacy commented 5 months ago

Actually, now that I think about it, I only have Fly.io instances near Seattle, maybe I just need to expand to more Fly.io regions...

matklad commented 5 months ago

I am in Lisbon. I can curl the file manually, but it feels very slow for 4 kilobytes (about 10 seconds actually).

kylewlacy commented 5 months ago

Okay, I've come up with a few possible explanations, and I've also made a few changes to try and address them. Let me try and summarize:

Could you try curling the URL from before again, ideally twice in a row? (the first time will likely be a cold boot, and the second should then hit the already-running instance)


If it seems like things have improved, then the final test would be trying the hello world installation from scratch again (also with some extra debugging for good measure):

  1. Run chmod -R +w ~/.local/share/brioche && rm -rf ~/.local/share/brioche to remove all the locally-stored files
  2. Re-run the install command with env vars: BRIOCHE_LOG_OUTPUT='./brioche.log' BRIOCHE_LOG_DEBUG='[]=debug' brioche install -r hello_world

If that still fails, then brioche.log should at least give some insights

matklad commented 5 months ago

> Could you try curling the URL from before again, ideally twice in a row? (the first time will likely be a cold boot, and the second should then hit the already-running instance)

Yup, the first time around it took a minute, the second one was fast. I guess it might make sense to bump the default timeouts to something like 120 seconds, rather than just 10? 10 is a reasonable number for the steady state, but with cold boots, network topology changes, and whatnot, I think P100 could go higher than that.
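One way to express "10 seconds for the steady state, more headroom for cold boots" is a timeout that grows on each retry up to a cap. This is a sketch of that idea only. It was not proposed in the thread and is not what Brioche implements, and the function name `timeout_for_attempt` and the doubling policy are assumptions made for illustration.

```rust
use std::time::Duration;

/// Hypothetical policy: the timeout for a zero-based `attempt` starts at
/// `base`, doubles on every retry, and never exceeds `cap`.
fn timeout_for_attempt(attempt: u32, base: Duration, cap: Duration) -> Duration {
    // 2^attempt, saturating instead of overflowing for large attempt counts.
    let factor = 2u32.checked_pow(attempt).unwrap_or(u32::MAX);
    // If the multiplication itself overflows, fall back to the cap.
    base.checked_mul(factor).unwrap_or(cap).min(cap)
}
```

With a base of 10 seconds and a cap of 120 seconds, the first attempt keeps the current 10-second limit, and later attempts allow progressively longer waits until they reach the 120-second ceiling suggested above.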

kylewlacy commented 5 months ago

Oh yeah, it sounds like this was resolved by #54