Closed matklad closed 5 months ago
Huh, interesting... I hadn't seen that particular error for the handful of times I've tested from a fresh environment (although I've only really tested on Ubuntu and my good home internet connection). I remember you mentioned that you were using NixOS, so I might see if I could reproduce through that, but I have a few questions if you feel comfortable answering:
All of the downloads are handled with Reqwest, and the blob fetches specifically don't have a global timeout set (they do have a connect timeout and a read timeout at least). The registry is hosted on Fly.io, but the blobs get redirected and served from Cloudflare R2.
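To make the difference between those timeouts concrete, here is a minimal sketch using only the Rust standard library. It is not Brioche's code and does not use Reqwest; `fetch_with_timeouts` and `spawn_slow_server` are hypothetical names made up for illustration. The connect timeout bounds only connection setup and the read timeout bounds each individual read, so nothing caps the total transfer time, while a server that is slow to send its first byte (a cold boot, say) can still trip the read timeout.

```rust
use std::io::{self, Read, Write};
use std::net::{SocketAddr, TcpListener, TcpStream};
use std::thread;
use std::time::Duration;

/// Hypothetical helper (an assumption for illustration, not Brioche's code):
/// read everything from `addr` with a connect timeout and a per-read timeout,
/// but with no overall deadline for the whole transfer.
fn fetch_with_timeouts(
    addr: SocketAddr,
    connect_timeout: Duration,
    read_timeout: Duration,
) -> io::Result<Vec<u8>> {
    // Bounds only the time spent establishing the TCP connection.
    let mut stream = TcpStream::connect_timeout(&addr, connect_timeout)?;
    // Bounds each individual read call; the clock restarts on every read,
    // so a slow but steady trickle of data is never cut off.
    stream.set_read_timeout(Some(read_timeout))?;

    let mut body = Vec::new();
    // Reads until the peer closes the connection; returns an error if any
    // single read stalls for longer than `read_timeout`.
    stream.read_to_end(&mut body)?;
    Ok(body)
}

/// Hypothetical stand-in for a cold-booting server: accepts one connection,
/// waits `delay`, then sends a short reply and closes the connection.
fn spawn_slow_server(delay: Duration) -> io::Result<SocketAddr> {
    // Port 0 asks the OS for any free port on the loopback interface.
    let listener = TcpListener::bind("127.0.0.1:0")?;
    let addr = listener.local_addr()?;
    thread::spawn(move || {
        if let Ok((mut conn, _)) = listener.accept() {
            thread::sleep(delay);
            // The client may already have given up, so ignore write errors.
            let _ = conn.write_all(b"hello");
        }
    });
    Ok(addr)
}
```

Calling `fetch_with_timeouts` against `spawn_slow_server(Duration::from_millis(50))` with a generous read timeout returns the reply, while a server delay longer than the read timeout yields an I/O error instead. The error kind is platform-dependent (typically `WouldBlock` on Unix and `TimedOut` on Windows).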
Actually, now that I think about it, I only have Fly.io instances near Seattle, maybe I just need to expand to more Fly.io regions...
I am in Lisbon. I can curl the file manually, but it feels very slow for 4 kilobytes (about 10 seconds actually).
Okay, I've come up with a few possible explanations, and I've also made a few changes to try and address them. Let me try and summarize:
Could you try curling the URL from before again, ideally twice in a row? (the first time will likely be a cold boot, and the second should then hit the already-running instance)
If it seems like things have improved, then the final test would be trying the hello world installation from scratch again (also with some extra debugging for good measure):
First, remove all the locally-stored files:
chmod -R +w ~/.local/share/brioche && rm -rf ~/.local/share/brioche
Then run the installation with debug logging enabled:
BRIOCHE_LOG_OUTPUT='./brioche.log' BRIOCHE_LOG_DEBUG='[]=debug' brioche install -r hello_world
If that still fails, then brioche.log should at least give some insights.
Could you try curling the URL from before again, ideally twice in a row? (the first time will likely be a cold boot, and the second should then hit the already-running instance)
Yup, the first time around it took a minute, the second one was fast. I guess it might make sense to bump default timeouts to something like 120 seconds, rather than just 10? 10 is a reasonable number for the steady state, but with cold boots, network topology changes and whatnot, I think P100 could go higher than that.
Oh yeah, it sounds like this was resolved by #54
If I try
I get a timeout:
If I manually bump the in-code timeouts from 10s to 60s, I then get an error about "temporary DNS failure" (sadly, I lost the exact text of the error somewhere in the git history).