blitz closed this 4 years ago
That's getAddrInfo from the network package failing while trying to resolve DNS.
It's strange to me that the system is preferring IPv6. What OS/distribution are you running the agent on?
I'm running NixOS 19.09. If it helps, I can also give you access to this box.
That would be great.
My ssh pub key: https://static.domenkozar.com/ielectric.pub
Send me an email with access details at domen@hercules-ci.com
Thanks.
I'm trying to replicate the same call but I get:
Prelude Network.Socket> defaultHints { addrFlags = [AI_ADDRCONFIG], addrSocketType = Stream }
AddrInfo {addrFlags = [AI_ADDRCONFIG], addrFamily = AF_UNSPEC, addrSocketType = Stream, addrProtocol = 0, addrAddress = *** Exception: Prelude.undefined
CallStack (from HasCallStack):
error, called at libraries/base/GHC/Err.hs:78:14 in base:GHC.Err
undefined, called at Network/Socket.hsc:1628:40 in network-2.8.0.1-Hmt657UE3v349uYmvUXEvW:Network.Socket
Seems like a bug in the Show instance.
Prelude Network.Socket> foo = defaultHints { addrFlags = [AI_ADDRCONFIG], addrSocketType = Stream }
Prelude Network.Socket> getAddrInfo (Just foo) (Just "codeload.github.com") (Just "443")
[AddrInfo {addrFlags = [AI_ADDRCONFIG], addrFamily = AF_INET, addrSocketType = Stream, addrProtocol = 6, addrAddress = 140.82.114.10:443, addrCanonName = Nothing}]
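A side note on the Show failure above: in network-2.8.0.1, defaultHints appears to leave addrAddress undefined (the later error text suggests the same for addrCanonName), so printing the whole record forces it. A minimal sketch of a workaround, assuming you only want to inspect the hint fields. showHints is a hypothetical helper, not something the network package provides:

```haskell
import Network.Socket

-- Hypothetical helper: render only the fields that are meaningful for hints.
-- It never touches addrAddress or addrCanonName, so it does not force the
-- undefined placeholders that defaultHints may carry.
showHints :: AddrInfo -> String
showHints h =
  "flags=" ++ show (addrFlags h)
    ++ ", family=" ++ show (addrFamily h)
    ++ ", socketType=" ++ show (addrSocketType h)
    ++ ", protocol=" ++ show (addrProtocol h)
```

In GHCi, putStrLn (showHints foo) can then stand in for evaluating foo directly.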
I believe everything is working now, so it looks like a race condition between network setup and agent start.
I can see that network got configured 6 seconds after the agent failed: Oct 19 01:09:38 nixos systemd[1]: Reached target Network is Online.
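For a startup race like this, one generic mitigation is to retry the failing IO action with a growing delay until the network is up. The sketch below is only an illustration and is not necessarily how hercules-ci-agent addresses it; retrying is a hypothetical helper, and the attempt count and delay are arbitrary assumptions:

```haskell
import Control.Concurrent (threadDelay)
import Control.Exception (IOException, try)

-- Hypothetical helper: run an IO action up to `attempts` times, sleeping
-- `delayMicros` microseconds after a failure and doubling the delay each time.
-- Only IOException is caught; the last failure is returned as Left.
retrying :: Int -> Int -> IO a -> IO (Either IOException a)
retrying attempts delayMicros action = do
  r <- try action
  case r of
    Right x -> pure (Right x)
    Left e
      | attempts <= 1 -> pure (Left e)
      | otherwise -> do
          threadDelay delayMicros
          retrying (attempts - 1) (delayMicros * 2) action
```

A call such as retrying 5 1000000 (getAddrInfo (Just foo) (Just "codeload.github.com") (Just "443")) would wait 1, 2, 4 and 8 seconds between its five attempts, which would have covered the 6 second gap seen in the journal.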
@blitz this should be fixed in the next hercules-ci-agent release. I've restarted your agent, so it should work now.
Thanks for looking into this! I definitely got further now, but it just fails later, when it actually starts building:
warning: unknown setting 'sandbox-fallback'
these paths will be fetched (32.06 MiB download, 32.06 MiB unpacked):
/nix/store/9ybz7r3i3i2cy6f3h6sm0psa2kzqdhz3-bootstrap-tools.tar.xz
warning: you did not specify '--add-root'; the result might be removed by the garbage collector
A technical error occurred: ConnectionError "HttpExceptionRequest Request {\n host = \"blitz.cachix.org\"\n port = 443\n secure = True\n requestHeaders = [(\"Content-Type\",\"application/x-nix-nar\")]\n path = \"/api/v1/cache/blitz/nar\"\n queryString = \"\"\n method = \"POST\"\n proxy = Nothing\n rawBody = False\n redirectCount = 10\n responseTimeout = ResponseTimeoutDefault\n requestVersion = HTTP/1.1\n}\n (ConnectionFailure Network.Socket.getAddrInfo (called with preferred socket type/protocol: AddrInfo {addrFlags = [AI_ADDRCONFIG], addrFamily = AF_UNSPEC, addrSocketType = Stream, addrProtocol = 6, addrAddress = <assumed to be undefined>, addrCanonName = <assumed to be undefined>}, host name: Just \"blitz.cachix.org\", service name: Just \"443\"): does not exist (Name or service not known))"
That looks like your DNS servers are behaving strangely. I suggest you try Google's 8.8.8.8 or Cloudflare's 1.1.1.1 to see if that fixes it.
Changing DNS servers doesn't make a difference. Also, DNS works fine from nslookup, ping, and wget. Super weird.
% wget https://blitz.cachix.org/
--2019-10-20 14:49:47-- https://blitz.cachix.org/
Resolving blitz.cachix.org (blitz.cachix.org)... 34.205.214.246
Connecting to blitz.cachix.org (blitz.cachix.org)|34.205.214.246|:443... connected.
HTTP request sent, awaiting response... 200 OK
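Since wget resolves the name while the agent does not, it may help to reproduce the lookup with the same hints that appear in the error (AI_ADDRCONFIG, Stream) and compare against a lookup without that flag. This is a diagnostic sketch only; resolveWith is a hypothetical helper, and whether AI_ADDRCONFIG is actually the culprit here is an assumption to be tested, not a conclusion:

```haskell
import Control.Exception (IOException, try)
import Network.Socket

-- Hypothetical helper: resolve host/port as a stream socket with the given
-- hint flags and return the resolved addresses, or the IOException that
-- getAddrInfo raised.
resolveWith
  :: [AddrInfoFlag] -> HostName -> ServiceName
  -> IO (Either IOException [SockAddr])
resolveWith flags host port = do
  let hints = defaultHints { addrFlags = flags, addrSocketType = Stream }
  fmap (map addrAddress)
    <$> try (getAddrInfo (Just hints) (Just host) (Just port))
```

Running resolveWith [AI_ADDRCONFIG] "blitz.cachix.org" "443" and resolveWith [] "blitz.cachix.org" "443" as the same user and in the same environment as the agent should show whether the flag changes the outcome.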
My build agent fails to fetch source code tarballs from GitHub:
This is weird, because fetching code via wget works just fine from the machine:
It looks a bit like the agent wants to resolve codeload.github.com as IPv6 and that fails, because that domain doesn't have an AAAA record.