Open Zibri opened 3 years ago
I receive the same error with 2021.7.3 (with both that protocol.argotunnel.com address and a cloudflare-gateway.com teams address). Downgrading to 2021.7.0 resolves the issue.
Could this be related to TUN-4699: Make quick tunnels the default in cloudflared from 2021.7.1?
I'm running proxy-dns on a Raspberry Pi, which has been running without issue for over a year, and then suddenly broke with ~2021.7.1. Happy to help diagnose.
@benbalter can you show the cloudflared command and config that you are running with that broke with 2021.7.1 onwards?
@Zibri and @benbalter can you run the following command in the environment where cloudflared is failing?
dig -t txt protocol.argotunnel.com
This is the same as https://github.com/cloudflare/cloudflared/issues/388
dig -t txt protocol.argotunnel.com
It does not return anything and times out. In Egypt, DNS queries are very restricted; perhaps you should do the query using https dns.
SRV queries are not blocked, and neither are a few other types. So you have two choices: either use an https dns, or try other DNS query types such as SRV, SIG, or CAA as a backup.
Downgrading to 2021.7.0 resolves the issue.
Thanks for pointing this out. Also, to avoid auto-updating, an easy trick is this:
# sed -i "s/2021.7.0/2025.7.0/" $(which cloudflared)
About the TXT lookup problem: we haven't addressed it yet, but will soon.
About the "quick tunnel" (i.e., a no-login tunnel) causing the TXT lookup, which seems to fail in rare situations such as those described here: we have reverted that logic in 2021.7.4, meaning it will no longer cause that lookup.
can you show the cloudflared command and config that you are running with that broke with 2021.7.1 onwards?
I have a service defined to run /usr/local/bin/cloudflared --config /etc/cloudflared/config.yml
with the following config:
proxy-dns: true
proxy-dns-port: 5053
proxy-dns-upstream:
- https://XXX.cloudflare-gateway.com/dns-query
proxy-dns-bootstrap:
- https://1.1.1.2/dns-query
can you run the following command in the environment where cloudflared is failing?
With cloudflared running (2021.7.0), I get the "http2=100" response, presumably as expected.
Before cloudflared bootstraps, the dig query fails, because the system resolver (set to 127.0.0.1#53) uses cloudflared's proxy-dns as its upstream resolver (127.0.0.1#5053).
perhaps you should do the query using https dns
It seems 2021.7.1's quick tunnels default introduced a dependency on being able to query that TXT record during the bootstrap process, but does so in a way that uses the system resolver, rather than the designated bootstrap resolver / DNS over HTTPS.
Similar to the discussion in https://github.com/cloudflare/cloudflared/issues/388 and above, on my network, non-DoH DNS queries are blocked entirely, meaning as before 2021.7.1, in order to maintain backwards compatibility, the bootstrap process should allow use of DoH for its initial resolution, not the system resolver.
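For anyone wanting to check whether the TXT record is reachable over DoH when plain DNS is blocked, here is a minimal Python sketch. It is not how cloudflared performs the lookup. It assumes the JSON DoH interface at https://1.1.1.1/dns-query (requested with an "application/dns-json" Accept header); the function names are made up for illustration.

```python
import json
import urllib.parse
import urllib.request
from typing import List

# Assumption: an IP-addressed DoH endpoint, so no plain DNS lookup is needed first.
DEFAULT_ENDPOINT = "https://1.1.1.1/dns-query"


def doh_txt_request(name: str, endpoint: str = DEFAULT_ENDPOINT) -> urllib.request.Request:
    """Build a JSON-style DoH request asking for the TXT records of `name`."""
    query = urllib.parse.urlencode({"name": name, "type": "TXT"})
    return urllib.request.Request(
        f"{endpoint}?{query}",
        headers={"Accept": "application/dns-json"},
    )


def txt_from_doh_json(body: str) -> List[str]:
    """Extract TXT strings (record type 16) from a DoH JSON response body."""
    answers = json.loads(body).get("Answer", [])
    return [a["data"].strip('"') for a in answers if a.get("type") == 16]


def lookup_txt(name: str, endpoint: str = DEFAULT_ENDPOINT) -> List[str]:
    """Perform the lookup over HTTPS (needs network access to the endpoint)."""
    with urllib.request.urlopen(doh_txt_request(name, endpoint), timeout=5) as resp:
        return txt_from_doh_json(resp.read().decode("utf-8"))
```

Calling lookup_txt("protocol.argotunnel.com") on a network where only HTTPS gets through would show whether the record itself is reachable, independently of the system resolver.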
All that said, thank you for your quick response and for maintaining such a great project! 🎉
With cloudflared running (2021.7.0), I get the "http2=100" response, presumably as expected.
Hi @benbalter! Can you share the stdout logs when you run the same command with v2021.7.0 please?
non-DoH DNS queries are blocked entirely, meaning as before 2021.7.1, in order to maintain backwards compatibility, the bootstrap process should allow use of DoH for its initial resolution, not the system resolver.
This should still happen if you were to use the command cloudflared proxy-dns. Can you try it out with the latest version and let me know if that works for you?
Can you share the stdout logs when you run the same command with v2021.7.0 please?
Of course. Thanks for the quick reply. Here's the output on 2021.7.0:
$ dig -t txt protocol.argotunnel.com
; <<>> DiG 9.11.5-P4-5.1+deb10u2-Raspbian <<>> -t txt protocol.argotunnel.com
;; global options: +cmd
;; Got answer:
;; ->>HEADER<<- opcode: QUERY, status: NOERROR, id: 63169
;; flags: qr rd ra; QUERY: 1, ANSWER: 1, AUTHORITY: 0, ADDITIONAL: 1
;; OPT PSEUDOSECTION:
; EDNS: version: 0, flags:; udp: 4096
;; QUESTION SECTION:
;protocol.argotunnel.com. IN TXT
;; ANSWER SECTION:
protocol.argotunnel.com. 300 IN TXT "http2=100"
;; Query time: 26 msec
;; SERVER: 127.0.0.1#53(127.0.0.1)
;; WHEN: Wed Jul 28 19:24:24 UTC 2021
;; MSG SIZE rcvd: 97
And if I were to query cloudflared directly (bypassing the downstream pi-hole DNS server), here's the result:
$ dig -t txt protocol.argotunnel.com @127.0.0.1 -p 5053
; <<>> DiG 9.11.5-P4-5.1+deb10u2-Raspbian <<>> -t txt protocol.argotunnel.com @127.0.0.1 -p 5053
;; global options: +cmd
;; Got answer:
;; ->>HEADER<<- opcode: QUERY, status: NOERROR, id: 44008
;; flags: qr rd ra; QUERY: 1, ANSWER: 1, AUTHORITY: 0, ADDITIONAL: 1
;; OPT PSEUDOSECTION:
; EDNS: version: 0, flags:; udp: 4096
;; QUESTION SECTION:
;protocol.argotunnel.com. IN TXT
;; ANSWER SECTION:
protocol.argotunnel.com. 277 IN TXT "http2=100"
;; Query time: 30 msec
;; SERVER: 127.0.0.1#5053(127.0.0.1)
;; WHEN: Wed Jul 28 19:26:30 UTC 2021
;; MSG SIZE rcvd: 97
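When comparing runs like the two above across versions or resolvers, it can help to pull just the TXT answers out of the dig output. A small Python sketch follows; the function name is hypothetical and it only handles the default dig output layout shown in this thread.

```python
from typing import List


def txt_answers(dig_output: str) -> List[str]:
    """Return the TXT strings found in the answer lines of dig output."""
    answers = []
    for line in dig_output.splitlines():
        line = line.strip()
        # Lines starting with ';' are comments, headers, or the echoed question.
        if not line or line.startswith(";"):
            continue
        fields = line.split(None, 4)  # name, TTL, class, type, rdata
        if len(fields) == 5 and fields[2] == "IN" and fields[3] == "TXT":
            answers.append(fields[4].strip('"'))
    return answers


sample = """\
;; QUESTION SECTION:
;protocol.argotunnel.com. IN TXT
;; ANSWER SECTION:
protocol.argotunnel.com. 300 IN TXT "http2=100"
;; Query time: 26 msec
"""
print(txt_answers(sample))
```

An empty list means no TXT answer came back, which is what a timed-out query would produce.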
Can you try it out with the latest version and let me know if that works for you?
2021.7.4 bootstraps as expected, both via the cloudflared command + config file and with cloudflared proxy-dns directly.
Oops. I misspoke. Can you also do me the favour of trying cloudflared proxy-dns out with 2021.7.3?
Of course. Thanks for the quick reply. Here's the output on 2021.7.0:
Thanks for this. Can you also share the output of your cloudflared command please?
So I think we've understood this a bit better now.
If you run cloudflared proxy-dns --config ..., it uses this logic https://github.com/cloudflare/cloudflared/blob/master/cmd/cloudflared/proxydns/cmd.go#L78 to create the listener (expecting properties to be port, upstream and bootstrap).
If you run cloudflared --config ..., it assumes the cloudflared tunnel command (as per https://github.com/cloudflare/cloudflared/blob/master/cmd/cloudflared/main.go#L81 , which underneath uses the tunnelCommand). That also starts a dns proxy listener if proxy-dns is true, and then uses the properties https://github.com/cloudflare/cloudflared/blob/master/cmd/cloudflared/tunnel/server.go#L22 (proxy-dns-address, port, proxy-dns-upstream and proxy-dns-bootstrap).
This second case therefore starts a tunnel, besides starting the dns proxy. It's very likely that you are not even using that tunnel at all. So you can just run the first case above and therefore skip the tunnel logic.
The reason why the behaviour changed is because we changed those "account-less tunnels" (where no --hostname is provided, and no tunnel is pre-created with a login) to no longer use our legacy tunnels infrastructure, and use the new one for named tunnels. This new one looks up a TXT record, and that's what you noticed. We will make cloudflared more resilient to the TXT lookup.
We've uncovered that this different behaviour (of running a tunnel next to the proxy-dns) was a regression/accidental recent change due to some bad argument handling. FYI, we will revert that.
So you can just run the first case above and therefore skip the tunnel logic.
Came here to post the stdout requested above, and arrived at a similar conclusion.
That said, I may have found another bug (happy to move this to a new issue, if unrelated), in that either I don't believe cloudflared proxy-dns is using the bootstrap resolver (either specified or default), or I don't understand what the purpose of that setting is (probably more likely).
I get similar output for cloudflared proxy-dns on 2021.7.4. As you can see above, in both versions, cloudflared is attempting to resolve the XXX.cloudflare-gateway.com subdomain via the 127.0.0.1#53 resolver, even though the bootstrap resolver is specified in the config (and the default resolver should be Cloudflare's IP). I can also see the XXX.cloudflare-gateway.com requests in my #53 resolver's logs (which uses cloudflared as upstream, resulting in a timeout). cloudflared proxy-dns with no arguments works, as it uses 1.1.1.1 as its upstream.
Is my understanding incorrect in that cloudflared proxy-dns should use the bootstrap resolver to resolve the upstream resolver's domain at startup?
If instead I use the following config (moving 1.1.1.2 to a second upstream), when the first DNS lookup fails, it falls back to 1.1.1.2 (I believe only for that request, since the first resolver could then be used), and resolves/proxies requests as expected:
proxy-dns: true
proxy-dns-port: 5053
proxy-dns-upstream:
- https://XXX.cloudflare-gateway.com/dns-query
- https://1.1.1.2/dns-query
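The fallback behaviour described for this config can be pictured roughly as below. This is only an illustration of trying upstreams in order, not cloudflared's actual implementation; resolve_with_fallback and fake_lookup are hypothetical names, and the lookup is a stand-in rather than a real DoH request.

```python
from typing import Callable, Sequence, Tuple


def resolve_with_fallback(
    upstreams: Sequence[str],
    name: str,
    lookup: Callable[[str, str], str],
) -> Tuple[str, str]:
    """Try each upstream in order; return (upstream_used, answer) from the first that works."""
    errors = []
    for upstream in upstreams:
        try:
            return upstream, lookup(upstream, name)
        except Exception as exc:  # any failure moves on to the next upstream
            errors.append(f"{upstream}: {exc}")
    raise RuntimeError("all upstreams failed: " + "; ".join(errors))


def fake_lookup(upstream: str, name: str) -> str:
    # Stand-in for a real query: the hostname-based upstream cannot be
    # resolved yet, while the IP-addressed one answers.
    if "cloudflare-gateway.com" in upstream:
        raise OSError("cannot resolve upstream hostname")
    return "192.0.2.1"  # placeholder address from the documentation range


upstreams = [
    "https://XXX.cloudflare-gateway.com/dns-query",
    "https://1.1.1.2/dns-query",
]
used, answer = resolve_with_fallback(upstreams, "example.com", fake_lookup)
print(used, answer)
```

In this sketch the second upstream is only reached for a request where the first one fails, matching the "only for that request" guess above.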
Again, very grateful for your time and thoughtfulness here, and glad to hear that I found at least one bug, and it wasn't entirely my fault. Eager to hear your thoughts on the bootstrap issue, and again, if unrelated, happy to move it to a new issue. Thanks again!
if you run cloudflared proxy-dns --config ...
One minor note, in case it impacts the above: cloudflared takes a config argument, but it does not appear proxy-dns does.
Placing the --config argument after proxy-dns results in Incorrect Usage: flag provided but not defined: -config and placing it before results in the command succeeding, but with the config ignored.
To be clear, I'm not seeking to complain (easy enough to pass as command line vars), but wanted to share in case the change in behavior was helpful.
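For others hitting the same thing, a rough Python sketch of turning the config.yml keys shown earlier into proxy-dns command line arguments. The flag names --port, --upstream and --bootstrap are assumed from the property names mentioned above (port, upstream, bootstrap), and repeating the flag per URL is also an assumption; check cloudflared proxy-dns --help before relying on it. The helper name is made up.

```python
import shlex
from typing import Any, Dict, List


def proxy_dns_argv(config: Dict[str, Any]) -> List[str]:
    """Map proxy-dns-* config keys onto (assumed) cloudflared proxy-dns flags."""
    argv = ["cloudflared", "proxy-dns"]
    if "proxy-dns-port" in config:
        argv += ["--port", str(config["proxy-dns-port"])]
    # Assumption: list-valued settings are passed by repeating the flag.
    for url in config.get("proxy-dns-upstream", []):
        argv += ["--upstream", url]
    for url in config.get("proxy-dns-bootstrap", []):
        argv += ["--bootstrap", url]
    return argv


config = {
    "proxy-dns": True,
    "proxy-dns-port": 5053,
    "proxy-dns-upstream": ["https://XXX.cloudflare-gateway.com/dns-query"],
    "proxy-dns-bootstrap": ["https://1.1.1.2/dns-query"],
}
print(shlex.join(proxy_dns_argv(config)))
```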
Sorry for getting back a bit late here guys. These issues should be fixed in the newest release. Give it a go and let us know what you think.
Yesterday it was working perfectly on Ubuntu 18.04; today it fails with this error:
The same happens if I change DNS. Note: the machine is a VM inside my main PC.
On my Windows host PC I can do: cloudflared tunnel --url http://192.168.1.104:XXXX
Yesterday the same command worked on the guest machine (192.168.1.104); today it gives that error.
Any clue?