cloudflare / cloudflared

Cloudflare Tunnel client (formerly Argo Tunnel)
https://developers.cloudflare.com/cloudflare-one/connections/connect-apps/install-and-setup/tunnel-guide
Apache License 2.0

Failed to create quic connection and cause cloudflared container failed to run with 2022.4.0 on Docker #617

Closed: darth-pika-hu closed this issue 2 years ago

darth-pika-hu commented 2 years ago

Today, after updating the cloudflared Docker image from 2022.3.4 to 2022.4.0, the new QUIC protocol failed to connect to the server, causing the cloudflared Docker container to self-destruct.

The logs can be found here:

2022-04-08T04:36:40Z INF Version 2022.4.0
2022-04-08T04:36:40Z INF GOOS: linux, GOVersion: go1.17.5, GoArch: amd64
2022-04-08T04:36:40Z INF Settings: map[no-autoupdate:true token:*****]
2022-04-08T04:36:40Z INF Generated Connector ID: d4bc3f69-ce1c-451a-af34-b688d50015f2
2022-04-08T04:36:40Z INF Will be fetching remotely managed configuration from Cloudflare API. Defaulting to protocol: quic
2022-04-08T04:36:40Z INF Initial protocol quic
2022-04-08T04:36:40Z INF Starting metrics server on 127.0.0.1:33206/metrics
2022-04-08T04:36:45Z ERR Failed to create new quic connection error="failed to dial to edge: timeout: no recent network activity" connIndex=0
2022-04-08T04:36:45Z ERR Serve tunnel error error="failed to dial to edge: timeout: no recent network activity" connIndex=0
...
2022-04-08T04:37:57Z INF Tunnel server stopped
2022-04-08T04:37:57Z INF Metrics server stopped
2022-04-08T04:37:57Z ERR Initiating shutdown error="failed to dial to edge: timeout: no recent network activity"
failed to dial to edge: timeout: no recent network activity
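
When triaging a log like the one above, it can help to count how many QUIC dial failures were recorded per connection index. The following is a minimal sketch, not part of cloudflared: the function name `count_quic_failures` and the regular expression are illustrative assumptions, and the pattern is based only on the log lines shown in this thread.

```python
import re
from collections import Counter

# Pattern derived from the log lines quoted in this thread (an assumption,
# not an official cloudflared log grammar):
#   <timestamp> ERR Failed to create new quic connection error="..." connIndex=0
QUIC_FAILURE = re.compile(
    r'^(?P<ts>\S+) ERR Failed to create new quic connection '
    r'error="(?P<error>[^"]*)" connIndex=(?P<conn>\d+)'
)


def count_quic_failures(log_text: str) -> Counter:
    """Return the number of QUIC dial failures logged per connIndex."""
    failures = Counter()
    for line in log_text.splitlines():
        match = QUIC_FAILURE.match(line.strip())
        if match:
            failures[int(match.group("conn"))] += 1
    return failures


# Three lines copied from the log above.
SAMPLE_LOG = """\
2022-04-08T04:36:40Z INF Initial protocol quic
2022-04-08T04:36:45Z ERR Failed to create new quic connection error="failed to dial to edge: timeout: no recent network activity" connIndex=0
2022-04-08T04:36:45Z ERR Serve tunnel error error="failed to dial to edge: timeout: no recent network activity" connIndex=0
"""

if __name__ == "__main__":
    print(dict(count_quic_failures(SAMPLE_LOG)))
```

Only the "Failed to create new quic connection" line is counted; the "Serve tunnel error" line repeats the same failure and is deliberately ignored.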

I attempted to create a new container with a 4.0 image, as well as to update from 3.4 to 4.0 within the 3.4 container, but neither worked.

2022.3.4 is perfectly functional, because it just uses the http2 protocol.

If the quic protocol fails, I believe the right connection action is to fall back to http2, NOT to keep trying 3 times and then terminate itself.

nmldiegues commented 2 years ago

Hello @darth-pika-hu

You should be able to make protocol quic work by allowing egress UDP to 7844 on your docker infrastructure: https://developers.cloudflare.com/cloudflare-one/connections/connect-apps/configuration/ports-and-ips/

We are in the process of rolling out quic to everyone. In cases where it cannot connect, it falls back to http2.

However, in your case, you are using a new Tunnel. I can tell that because of the log line "Will be fetching remotely managed configuration from Cloudflare API. Defaulting to protocol: quic". For new Tunnels, we have opted them into quic "forcefully" since the admin is much more likely to be on top of things and be willing to open UDP connectivity.

If for some reason you cannot really allow UDP egress, then you can still make it http2 as per https://developers.cloudflare.com/cloudflare-one/connections/connect-apps/configuration/remote-management/

darth-pika-hu commented 2 years ago

I have read the docs and opened port 7844. I tried on different machines and got the same results. I think it is a bug specific to the Docker version. I also tried updating from 3.4 to 4.0 within an existing container. In that case, the tunnel is not new, but I got the same errors.

nmldiegues commented 2 years ago

I absolutely understand the frustration @darth-pika-hu

I can guarantee this is a problem with your network not allowing egress to 7844 UDP. E.g., the command "docker run --rm -it docker.io/cloudflare/cloudflared:latest tunnel --hello-world" runs just fine from my infrastructure, and we can see thousands of other users doing the same just fine.

As noted above, you can force your Tunnel to run with http2 even though it is managed in the UI (and the UI does not yet allow controlling that). Check out https://developers.cloudflare.com/cloudflare-one/connections/connect-apps/configuration/remote-management/ for the details.

Let me also reiterate on the reasoning behind this: we're "forcing" quic protocol because we (Cloudflare) believe it is a big part of the future of the Internet. But many networks still block UDP. We must force admins behind those networks to feel that "pain" in some way, so that people are aware and begin allowing UDP egress. E.g., our Private DNS resolution, which uses UDP, only works with QUIC protocol. So it is frustrating for users to spin up Tunnels defaulting to http2 (that does not support UDP proxying) and not have Private DNS resolution working (see https://developers.cloudflare.com/cloudflare-one/connections/connect-networks/private-net/private-hostnames-ips/#update-cloudflared)

darth-pika-hu commented 2 years ago

@nmldiegues Thank you for the reply. However, I have checked all the rules, and nothing blocking the 7844 port. I know how to use http2 but just want to give quic a chance. Then I found this article https://blog.cloudflare.com/getting-cloudflare-tunnels-to-connect-to-the-cloudflare-network-with-quic/ I don't know what to say. It's written by one of you guys. A similar situation and he/she found a bug. Why are you so sure it is my network issue and not a new bug?

nmldiegues commented 2 years ago

A similar situation and he/she found a bug. Why are you so sure it is my network issue and not a new bug?

Because that blog post describes a past problem and how we solved it in our edge. That's why we have already so many QUIC tunnels connected to us. Otherwise they wouldn't be able to.

However, I have checked all the rules, and nothing blocking the 7844 port

Maybe the problem is with your ISP. We never know... You can use mtr (https://www.cloudflare.com/en-gb/learning/network-layer/what-is-mtr/) against region1.argotunnel.com port 7844 and compare TCP vs UDP
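
As a rough sketch of the TCP vs UDP comparison suggested above, the snippet below only builds the two mtr command lines so they can be inspected before running. It assumes a Linux build of mtr that supports the `--tcp`, `--udp`, `--port`, `--report`, and `--report-cycles` options (WinMTR does not take these); the helper name `mtr_command` is hypothetical.

```python
import shlex
from typing import List


def mtr_command(host: str, port: int, transport: str, cycles: int = 20) -> List[str]:
    """Build an mtr report command that probes host:port over TCP or UDP."""
    if transport not in ("tcp", "udp"):
        raise ValueError("transport must be 'tcp' or 'udp'")
    return [
        "mtr",
        "--report",                      # print a summary instead of the live UI
        "--report-cycles", str(cycles),  # number of probes per hop
        f"--{transport}",                # --tcp or --udp instead of ICMP
        "--port", str(port),
        host,
    ]


if __name__ == "__main__":
    for transport in ("tcp", "udp"):
        # Print only; run the commands manually (mtr may need elevated privileges).
        print(shlex.join(mtr_command("region1.argotunnel.com", 7844, transport)))
```

If the TCP report reaches the final hop while the UDP report does not, that points at UDP being dropped somewhere on the path.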

darth-pika-hu commented 2 years ago

@nmldiegues I definitely need your help here. I still cannot get it to work. I called AT&T today and made sure port 7844 was enabled for both UDP and TCP egress. I installed WinMTR and disabled the firewall.

Here is the result for region1.argotunnel.com:

region1.argotunnel.com

Host                                   Loss %   Sent   Recv   Best   Avrg   Wrst   Last
10.0.1.1                                    0     39     39      0      0      2      0
xxxx.lightspeed.irvnca.sbcglobal.net        0     39     39      1      1      3      1
No response from host                     100      7      0      0      0      0      0
75.29.48.90                                 0     39     39      2      2      4      2
12.242.115.44                               0     39     39      4      8     13      7
192.205.37.146                              0     39     39      6      8     21      7
No response from host                     100      7      0      0      0      0      0
4.30.195.50                                27     19     14      0     11     28      7
141.101.72.32                               0     39     39      6      7     16      7
198.41.192.167                              0     39     39      6      6      8      6

Here is the result for region2.argotunnel.com:

region2.argotunnel.com

Host                                   Loss %   Sent   Recv   Best   Avrg   Wrst   Last
10.0.1.1                                    0     20     20      0      0      1      0
xxxx.lightspeed.irvnca.sbcglobal.net        0     20     20      0      0      2      1
No response from host                     100      4      0      0      0      0      0
75.29.48.90                                 0     20     20      1      2      3      2
12.242.115.44                               0     20     20      4      7     11      7
192.205.37.146                              0     20     20      5      7     17      6
ae1.13.ear2.SanDiego1.level3.net            0     20     20     11     11     13     12
64.156.196.174                              0     20     20     13     13     15     14
198.41.200.13                               0     20     20     11     11     13     11

Here is a random website result for comparison:

random website

Host                                   Loss %   Sent   Recv   Best   Avrg   Wrst   Last
10.0.1.1                                    0     12     12      0      0      1      0
xxxx.lightspeed.irvnca.sbcglobal.net        0     12     12      0      1      2      1
No response from host                     100      3      0      0      0      0      0
75.29.48.90                                 0     12     12      1      2      3      2
12.242.115.44                               0     12     12      5      8     11     11
192.205.37.146                              0     12     12      5      6      9      6
No response from host                     100      3      0      0      0      0      0
4.30.195.50                                50      4      2      6     23     40      6
172.70.212.2                                0     12     12      5      6     12      6
104.26.5.110                                0     12     12      5      5      6      6

I also used powershell to check the connection:

PS C:\Users\xxx> test-netconnection -computername region1.argotunnel.com -port 7844

ComputerName     : region1.argotunnel.com
RemoteAddress    : 198.41.192.7
RemotePort       : 7844
InterfaceAlias   : Ethernet
SourceAddress    : 10.0.1.213
TcpTestSucceeded : True

I am a little confused. It seems the data has no problem reaching the final destination. But when I try to use the Windows version of cloudflared to establish a QUIC connection, I get the same error message as with Docker.

darth-pika-hu commented 2 years ago

@nmldiegues Okay. I just did something crazy, and I think it's something I should have done at the start. Now I can say with certainty that the issue is most likely either with the Argo Tunnel server UDP network policy or something on Cloudflare's side. I experimented with the cloudflared quic protocol on both my home and company servers. There was no joy.

So, on one of the company's servers in the office, I set up an OpenVPN server (UDP protocol on port 7844). Then I tried to connect to the VPN server in the office from the server at home that is having issues with the cloudflared quic connection. What do you think happened? Success! This means that UDP and port 7844 are working properly on my network. The last thing that comes to mind is that the Argo Tunnel server may have a messed-up UDP network policy that prevents connections from my home and my company's IP addresses from being accepted.

sudarshan-reddy commented 2 years ago

@darth-pika-hu : Can you show us a tcpdump or OpenVPN logs that show traffic flowing as UDP?

Also tcpdumps of what happens when you’re trying to make QUIC connections with cloudflared will be useful.

Also, please give us detailed information about your environment. What’s your own network policy like? Do you have other network based processes running? Specific iptables/nftables rules?

I get that troubleshooting this must be frustrating. But as far as I can see, all our systems are accepting and proxying QUIC connections even as we speak.

I highly recommend you follow the bug template that your issue text was written over.

darth-pika-hu commented 2 years ago

@sudarshan-reddy

Can you show us a tcpdump or OpenVPN logs that show traffic flowing as UDP?

How about the openvpn configuration:

dev tun
tls-client

remote 108.xxx.16.xx 7844 udp

As you can see, I forced 7844 udp. I also configured an ingress firewall rule on the OpenVPN server to allow just 7844 udp.

Also, please give us detailed information about your environment. What’s your own network policy like? Do you have other network based processes running? Specific iptables/nftables rules?

To make the debugging process easier, I connected a non-production server to the internet using a static IP address at home. In the office, I also connected a VM to the internet using a static IP address. Both utilize 1.1.1.1 as their DNS server, with no rules. Simple and insane. Even with this configuration, neither of them can connect to the Argo tunnel server using the quic protocol.

I get that troubleshooting this must be frustrating. But as far as I can see, all our systems are accepting and proxying QUIC connections even as we speak.

I suppose some could. Would you please double-check the rules on your end for the 108.0.0.0 IP range? Both my company's and my home's static IP addresses began with 108.

sudarshan-reddy commented 2 years ago

This does not prove that the packets that were sent and received were actual UDP. It does not prove that your system is not dropping the packets. The best way to verify that is to look at actual packets. You quoted my blog post. If you read it, you will see that's the first step of troubleshooting on my side.

We do not have policy specific rules globally.

We still do not have any details about what your environment is like. There isn’t much to go on here really except empirical description.

darth-pika-hu commented 2 years ago

Here is my offer: What if I set up a virtual machine for you and let you do whatever you need to do? Let me know the best way to privately contact you.

We still do not have any details about what your environment is like. There isn’t much to go on here really except empirical description.

I began with a cloudflared Docker container on a Linux server, followed by the cloudflared installer on a Windows 10 virtual machine and a Windows 11 virtual machine. Docker on the Linux server runs on an AMD CPU, whereas the Windows 10 VM uses an Intel CPU and the Windows 11 VM uses an AMD CPU.

Edited on 04/11/2022: @sudarshan-reddy @nmldiegues Today is Monday, I'm at work, and I just used wireshark's "udp.port==7844" filter to check the openvpn connection between the VM and the server. It is UDP and uses port 7844, as seen below:

Frame 26931: 1347 bytes on wire (10776 bits), 1347 bytes captured (10776 bits) on interface \Device\NPF_{4CA37862-9848-4728-9D2B-98465194344F}, id 0
    Interface id: 0 (\Device\NPF_{4CA37862-9848-4728-9D2B-98465194344F})
    Encapsulation type: Ethernet (1)
    Arrival Time: Apr 11, 2022 11:09:38.246542000 Pacific Daylight Time
    [Time shift for this packet: 0.000000000 seconds]
    Epoch Time: 1649700578.246542000 seconds
    [Time delta from previous captured frame: 0.000127000 seconds]
    [Time delta from previous displayed frame: 0.000127000 seconds]
    [Time since reference or first frame: 410.840630000 seconds]
    Frame Number: 26931
    Frame Length: 1347 bytes (10776 bits)
    Capture Length: 1347 bytes (10776 bits)
    [Frame is marked: False]
    [Frame is ignored: False]
    [Protocols in frame: eth:ethertype:ip:udp:openvpn]
    [Coloring Rule Name: UDP]
    [Coloring Rule String: udp]
Ethernet II, Src: MS-NLB-PhysServer-17_32:22:20:d7 (02:11:32:22:20:d7), Dst: eero_3f:55:12 (c4:f1:74:3f:55:12)
    Destination: eero_3f:55:12 (c4:f1:74:3f:55:12)
    Source: MS-NLB-PhysServer-17_32:22:20:d7 (02:11:32:22:20:d7)
    Type: IPv4 (0x0800)
Internet Protocol Version 4, Src: 10.0.1.213, Dst: xxxxxxxxxxx
    0100 .... = Version: 4
    .... 0101 = Header Length: 20 bytes (5)
    Differentiated Services Field: 0x00 (DSCP: CS0, ECN: Not-ECT)
    Total Length: 1333
    Identification: 0x6a30 (27184)
    Flags: 0x00
    ...0 0000 0000 0000 = Fragment Offset: 0
    Time to Live: 128
    Protocol: UDP (17)
    Header Checksum: 0x0000 [validation disabled]
    [Header checksum status: Unverified]
    Source Address: 10.0.1.213
    Destination Address: xxxxxxxx
User Datagram Protocol, Src Port: 1194, Dst Port: 7844
    Source Port: 1194
    Destination Port: 7844
    Length: 1313
    Checksum: 0x8e2f [unverified]
    [Checksum Status: Unverified]
    [Stream index: 0]
    [Timestamps]
    UDP payload (1305 bytes)
OpenVPN Protocol
    Type: 0x48 [opcode/key_id]
    Peer ID: 0
    Data (1301 bytes)

And here's the log for cloudflared on the 7844 port:

Frame 4170: 1294 bytes on wire (10352 bits), 1294 bytes captured (10352 bits) on interface \Device\NPF_{4CA37862-9848-4728-9D2B-98465194344F}, id 0
    Interface id: 0 (\Device\NPF_{4CA37862-9848-4728-9D2B-98465194344F})
    Encapsulation type: Ethernet (1)
    Arrival Time: Apr 11, 2022 11:21:40.311984000 Pacific Daylight Time
    [Time shift for this packet: 0.000000000 seconds]
    Epoch Time: 1649701300.311984000 seconds
    [Time delta from previous captured frame: 0.000149000 seconds]
    [Time delta from previous displayed frame: 0.000149000 seconds]
    [Time since reference or first frame: 81.330803000 seconds]
    Frame Number: 4170
    Frame Length: 1294 bytes (10352 bits)
    Capture Length: 1294 bytes (10352 bits)
    [Frame is marked: False]
    [Frame is ignored: False]
    [Protocols in frame: eth:ethertype:ip:udp:quic:tls]
    [Coloring Rule Name: UDP]
    [Coloring Rule String: udp]
Ethernet II, Src: MS-NLB-PhysServer-17_32:22:20:d7 (02:11:32:22:20:d7), Dst: eero_3f:55:12 (c4:f1:74:3f:55:12)
    Destination: eero_3f:55:12 (c4:f1:74:3f:55:12)
    Source: MS-NLB-PhysServer-17_32:22:20:d7 (02:11:32:22:20:d7)
    Type: IPv4 (0x0800)
Internet Protocol Version 4, Src: 10.0.1.213, Dst: 198.41.200.193
    0100 .... = Version: 4
    .... 0101 = Header Length: 20 bytes (5)
    Differentiated Services Field: 0x00 (DSCP: CS0, ECN: Not-ECT)
    Total Length: 1280
    Identification: 0x2f95 (12181)
    Flags: 0x40, Don't fragment
    ...0 0000 0000 0000 = Fragment Offset: 0
    Time to Live: 128
    Protocol: UDP (17)
    Header Checksum: 0x0000 [validation disabled]
    [Header checksum status: Unverified]
    Source Address: 10.0.1.213
    Destination Address: 198.41.200.193
User Datagram Protocol, Src Port: 50246, Dst Port: 7844
    Source Port: 50246
    Destination Port: 7844
    Length: 1260
    Checksum: 0x9fbd [unverified]
    [Checksum Status: Unverified]
    [Stream index: 50]
    [Timestamps]
        [Time since first frame: 3.014733000 seconds]
        [Time since previous frame: 0.000149000 seconds]
    UDP payload (1252 bytes)
QUIC IETF
    QUIC Connection information
    [Packet Length: 1252]
    1... .... = Header Form: Long Header (1)
    .1.. .... = Fixed Bit: True
    ..00 .... = Packet Type: Initial (0)
    .... 00.. = Reserved: 0
    .... ..01 = Packet Number Length: 2 bytes (1)
    Version: 1 (0x00000001)
    Destination Connection ID Length: 11
    Destination Connection ID: 2f40a59f8028404f1afd5d
    Source Connection ID Length: 0
    Token Length: 0
    Length: 1231
    Packet Number: 8
    Payload: 01dafabdc667ed6f3c2c4f518da1dc9ae247c4b0ab9899c9e7a693fdeef9bbb40a8ecd2c…
    PADDING Length: 862
        Frame Type: PADDING (0x0000000000000000)
        [Padding Length: 862]
    CRYPTO
        Frame Type: CRYPTO (0x0000000000000006)
        Offset: 0
        Length: 347
        Crypto Data
        TLSv1.3 Record Layer: Handshake Protocol: Client Hello
            Handshake Protocol: Client Hello

Please advise.

darth-pika-hu commented 2 years ago

@sudarshan-reddy Here is the tcpdump log generated while openvpn client on the Windows Virtual Machine connected to the server:

listening on \Device\{4CA37862-9848-4728-9D2B-98465194344F}, link-type EN10MB (Ethernet), capture size 262144 bytes
15:06:05.655120 IP 10.0.1.213.1194 > xxxx.7844: UDP, length 77
15:06:05.998815 IP 10.0.1.213.1194 > xxxx.7844: UDP, length 77
15:06:06.188366 IP xxxx.7844 > 10.0.1.213.1194: UDP, length 77
15:06:06.188366 IP 10.0.1.213.1194 > xxxx.7844: UDP, length 65
15:06:06.201661 IP 10.0.1.213.1194 > xxxx.7844: UDP, length 346
15:06:06.389166 IP xxxx.7844 > 10.0.1.213.1194: UDP, length 65
15:06:06.398070 IP xxxx.7844 > 10.0.1.213.1194: UDP, length 1422
15:06:06.398070 IP xxxx.7844 > 10.0.1.213.1194: UDP, length 983
15:06:06.398070 IP 10.0.1.213.1194 > xxxx.7844: UDP, length 65

The following is the tcpdump log generated while cloudflared attempted to connect through QUIC:

listening on \Device\{4CA37862-9848-4728-9D2B-98465194344F}, link-type EN10MB (Ethernet), capture size 262144 bytes
11:48:25.201412 IP 10.0.1.213.61706 > 198.41.200.113.7844: UDP, length 1252
11:48:25.410162 IP 10.0.1.213.61706 > 198.41.200.113.7844: UDP, length 1252
11:48:25.410162 IP 10.0.1.213.61706 > 198.41.200.113.7844: UDP, length 1252
11:48:25.813411 IP 10.0.1.213.61706 > 198.41.200.113.7844: UDP, length 1252
11:48:25.813411 IP 10.0.1.213.61706 > 198.41.200.113.7844: UDP, length 1252
11:48:26.617403 IP 10.0.1.213.61706 > 198.41.200.113.7844: UDP, length 1252
11:48:26.617403 IP 10.0.1.213.61706 > 198.41.200.113.7844: UDP, length 1252
11:48:28.217673 IP 10.0.1.213.61706 > 198.41.200.113.7844: UDP, length 1252
11:48:28.217673 IP 10.0.1.213.61706 > 198.41.200.113.7844: UDP, length 1252
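
The difference between the two captures above is the direction of traffic: the OpenVPN capture has packets in both directions, while the cloudflared capture contains only outbound packets. A minimal sketch that tallies this from tcpdump text output follows; the function name `direction_counts` and the line pattern are assumptions based solely on the lines shown in this thread.

```python
import re
from collections import Counter

# Matches tcpdump lines of the form seen above (an assumed, simplified pattern):
#   11:48:25.201412 IP 10.0.1.213.61706 > 198.41.200.113.7844: UDP, length 1252
PACKET = re.compile(
    r'^\S+ IP (?P<src>\S+) > (?P<dst>\S+): UDP, length (?P<length>\d+)$'
)


def direction_counts(capture: str, local_ip: str) -> Counter:
    """Count UDP packets leaving (outbound) and reaching (inbound) local_ip."""
    counts = Counter()
    for line in capture.splitlines():
        match = PACKET.match(line.strip())
        if not match:
            continue
        # tcpdump prints endpoints as address.port, so drop the trailing port.
        src_ip = match.group("src").rsplit(".", 1)[0]
        counts["outbound" if src_ip == local_ip else "inbound"] += 1
    return counts


# First three lines of each capture above, copied verbatim.
OPENVPN_CAPTURE = """\
15:06:05.655120 IP 10.0.1.213.1194 > xxxx.7844: UDP, length 77
15:06:05.998815 IP 10.0.1.213.1194 > xxxx.7844: UDP, length 77
15:06:06.188366 IP xxxx.7844 > 10.0.1.213.1194: UDP, length 77
"""

QUIC_CAPTURE = """\
11:48:25.201412 IP 10.0.1.213.61706 > 198.41.200.113.7844: UDP, length 1252
11:48:25.410162 IP 10.0.1.213.61706 > 198.41.200.113.7844: UDP, length 1252
11:48:25.410162 IP 10.0.1.213.61706 > 198.41.200.113.7844: UDP, length 1252
"""

if __name__ == "__main__":
    print("openvpn:", dict(direction_counts(OPENVPN_CAPTURE, "10.0.1.213")))
    print("quic:   ", dict(direction_counts(QUIC_CAPTURE, "10.0.1.213")))
```

A capture with zero inbound packets shows that no replies arrived at the capture point; it does not by itself say where on the path they were lost.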

PS: configuring tcpdump on Windows is a hassle.

Please let me know if you are interested in my proposal:

What if I set up a virtual machine for you and let you do whatever you need to do? Let me know the best way to privately contact you.

nmldiegues commented 2 years ago

We'll likely make a new release of cloudflared that falls back to http2 from quic when this scenario happens. This is already the case normally when the quic protocol is picked automatically (and not configured by the user). We will make it so for Tunnels managed by the UI as well.

In practice we'll want to promote quic usage, but this will likely need some tool to help troubleshoot this sort of scenario, which is time-consuming, and which we do not currently have the bandwidth to tackle.

darth-pika-hu commented 2 years ago

@nmldiegues Thank you for providing an update. Now I finally realized we were just white mice to you guys. All the changes you guys made are just for your goal or “the future” not for current users. The solution to the problem? Sorry, we are too busy and don’t care.

nmldiegues commented 2 years ago

Thanks for all the iterations here. For now we'll go with cloudflared version 2022.4.1, which should fall back to http2 even for UI-managed Tunnels, as a workaround for this.

darth-pika-hu commented 2 years ago

@nmldiegues and @sudarshan-reddy Want to give you guys a heads up. QUIC starts working suddenly. Update: stopped working again.

nmldiegues commented 2 years ago

@nmldiegues and @sudarshan-reddy Want to give you guys a heads up. QUIC starts working suddenly. Update: stopped working again.

Well, we certainly haven't done anything over the weekend.

darth-pika-hu commented 2 years ago

@nmldiegues and @sudarshan-reddy Want to give you guys a heads up. QUIC starts working suddenly. Update: stopped working again.

Well, we certainly haven't done anything over the weekend.

Well, I didn't change anything either. And it was not during the weekend; it was last Friday. Maybe check your server logs?

nmldiegues commented 2 years ago

Can you share your Tunnel ID? It's not a secret, no one can do anything with it on your behalf (but it allows us to look into it from our perspective)

darth-pika-hu commented 2 years ago

Can you share your Tunnel ID? It's not a secret, no one can do anything with it on your behalf (but it allows us to look into it from our perspective)

Understood. Willing to help. Here are the tunnel IDs: eaee69fd-5bd9-4807-9352-a912bf81fd26, a89ac8f5-c23c-417f-b18d-408de86e7a3a, 298c57ed-965d-494b-81ef-eb608c69e254, 3d089c3b-3b4f-401d-8b1d-b8b53699a85c

If you guys are interested in using a VM to do more tests, let me know.

nmldiegues commented 2 years ago

Thanks for those. I may have found something interesting, and we'll pursue it internally.

For now, the gist is that cloudflared connects to 2 data-centers (for reliability, 2 connections in each). It looks like your cloudflared is unable to connect with QUIC to a specific data-center only. Since the 12th of April I see many successful QUIC connections to various data-centers, but I see HTTP2 connections only to that one specific data-center.

So, it seems like you're using QUIC in 2 of the 4 connections. The logs that you see for falling back will generally be for connection index 0 and 2 (which connect to one anycast region), whereas connection index 1 and 3 will successfully connect with QUIC (to another anycast region).

nmldiegues commented 2 years ago

We've uncovered that a small number of data-centers were indeed having this problem where they would not take in QUIC connections. We'll fix that and post back, at which point all your connections (and not just half of them) should work fine with QUIC.

darth-pika-hu commented 2 years ago

Thanks for those. I may have found something interesting, and we'll pursue it internally.

For now, the gist is that cloudflared connects to 2 data-centers (for reliability, 2 connections in each). It looks like your cloudflared is unable to connect with QUIC to a specific data-center only. Since the 12th of April I see many successful QUIC connections to various data-centers, but I see HTTP2 connections only to that one specific data-center.

So, it seems like you're using QUIC in 2 of the 4 connections. The logs that you see for falling back will generally be for connection index 0 and 2 (which connect to one anycast region), whereas connection index 1 and 3 will successfully connect with QUIC (to another anycast region).

Well. I was right.

darth-pika-hu commented 2 years ago

Update: today is 04/25; QUIC is working again for all my tunnels. Let's see how long they will last.

nmldiegues commented 2 years ago

Update: today is 04/25; QUIC is working again for all my tunnels. Let's see how long they will last.

Yes, we actioned the changes in the (very small subset of) data centers that were not accepting QUIC as they should. So you should now have QUIC on all 4 of your connections for good (not ephemerally).

darth-pika-hu commented 2 years ago

@nmldiegues Wonderful. QUIC for Cloudflare tunnel seems one step closer to its success. I wish you good luck with this project.

befantasy commented 1 year ago

@nmldiegues Wonderful. QUIC for Cloudflare tunnel seems one step closer to its success. I wish you good luck with this project.

I had the same problem. Not all of the 4 connections could be established with quic. Is there any way to force a specific data center to connect to?

tleerai commented 1 year ago

This is an epic case of RTFM gone wrong, heh. But hey, I get it.

Kation commented 1 year ago

@nmldiegues There are still some data-centers that fail with quic.

DCCInterstellar commented 1 year ago

As noted above, you can force your Tunnel to run with http2 even though it is managed in the UI (and the UI does not yet allow to control that). Check out https://developers.cloudflare.com/cloudflare-one/connections/connect-apps/configuration/remote-management/ for the details

I'm having this issue as well. My ISP is with T-Mobile Home Internet and sadly they don't offer quic. I went to the website you provided, however, this won't work with my Unraid System. By any chance does Cloudflare have an Unraid command to force the tunneling to use http2 instead of quic?

Thank you!!

Kation commented 1 year ago

@DCCInterstellar Publish Argument tunnel --protocol http2 --no-autoupdate run --token xxx
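
For anyone launching the container from a script rather than a UI template, the arguments above can be assembled as in the sketch below. The helper name `cloudflared_docker_args` is hypothetical; the flags are the ones quoted in this comment, the image name is the one used earlier in this thread, and the allowed protocol values are limited to the two discussed here. The command is only printed, not executed.

```python
import shlex
from typing import List


def cloudflared_docker_args(
    token: str,
    protocol: str = "http2",
    image: str = "docker.io/cloudflare/cloudflared:latest",
) -> List[str]:
    """Build a `docker run` argument list that pins the tunnel protocol."""
    if protocol not in ("http2", "quic"):
        raise ValueError("protocol must be 'http2' or 'quic'")
    return [
        "docker", "run", "--rm", image,
        # Same argument order as the comment above.
        "tunnel", "--protocol", protocol, "--no-autoupdate",
        "run", "--token", token,
    ]


if __name__ == "__main__":
    # "xxx" is a placeholder, as in the thread; substitute a real tunnel token.
    print(shlex.join(cloudflared_docker_args("xxx")))
```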

by-justin commented 1 year ago

@DCCInterstellar Publish Argument tunnel --protocol http2 --no-autoupdate run --token xxx

Thanks, it works!

I live in China and use cloudflare tunnel to hide my traffic among other cloudflare users. QUIC completely fails this purpose as no other users connect to cloudflare over QUIC, so my connection was interrupted multiple times a day and I had to manually restart the container. By changing QUIC to http2, the problem is completely solved.

Edit on 23 Mar:

The same problem arose again recently, and this time both http2 and quic failed. It seems like the traffic to the cloudflare data center in the USA was completely banned by the GFW. For anyone who has the same problem, I have switched to using frp on my own server.

amatteo78 commented 1 year ago

I have the same problem with QUIC. I need it to route an internal private network; the fallback to http2 works, but I can't use it for that reason. This is my log:

2023-02-24T19:21:08Z INF Starting metrics server on 127.0.0.1:49927/metrics
2023-02-24T19:21:13Z ERR Failed to create new quic connection error="failed to dial to edge with quic: timeout: no recent network activity" connIndex=0 ip=198.41.192.37
2023-02-24T19:21:13Z INF Retrying connection in up to 2s connIndex=0 ip=198.41.192.37
2023-02-24T19:21:19Z ERR Failed to create new quic connection error="failed to dial to edge with quic: timeout: no recent network activity" connIndex=0 ip=198.41.192.107
2023-02-24T19:21:19Z INF Retrying connection in up to 4s connIndex=0 ip=198.41.192.107
2023-02-24T19:21:27Z ERR Failed to create new quic connection error="failed to dial to edge with quic: timeout: no recent network activity" connIndex=0 ip=198.41.192.227
2023-02-24T19:21:27Z INF Retrying connection in up to 8s connIndex=0 ip=198.41.192.227
2023-02-24T19:21:32Z ERR Failed to create new quic connection error="failed to dial to edge with quic: timeout: no recent network activity" connIndex=0 ip=198.41.200.73
2023-02-24T19:21:32Z INF Retrying connection in up to 16s connIndex=0 ip=198.41.200.73
2023-02-24T19:21:43Z ERR Failed to create new quic connection error="failed to dial to edge with quic: timeout: no recent network activity" connIndex=0 ip=198.41.192.167
2023-02-24T19:21:43Z INF Retrying connection in up to 32s connIndex=0 ip=198.41.192.167
2023-02-24T19:21:58Z ERR Failed to create new quic connection error="failed to dial to edge with quic: timeout: no recent network activity" connIndex=0 ip=198.41.192.37
2023-02-24T19:21:58Z INF Retrying connection in up to 1m4s connIndex=0 ip=198.41.192.37
2023-02-24T19:22:23Z ERR Failed to create new quic connection error="failed to dial to edge with quic: timeout: no recent network activity" connIndex=0 ip=198.41.200.233
2023-02-24T19:22:23Z INF Retrying connection in up to 1m4s connIndex=0 ip=198.41.200.233
2023-02-24T19:22:41Z ERR Failed to create new quic connection error="failed to dial to edge with quic: timeout: no recent network activity" connIndex=0 ip=198.41.200.53
2023-02-24T19:22:41Z INF Retrying connection in up to 1m4s connIndex=0 ip=198.41.200.53
2023-02-24T19:23:01Z ERR Failed to create new quic connection error="failed to dial to edge with quic: timeout: no recent network activity" connIndex=0 ip=198.41.200.73
2023-02-24T19:23:01Z INF Retrying connection in up to 1m4s connIndex=0 ip=198.41.200.73
2023-02-24T19:23:26Z WRN If this log occurs persistently, and cloudflared is unable to connect to Cloudflare Network with quic protocol, then most likely your machine/network is getting its egress UDP to port 7844 (or others) blocked or dropped. Make sure to allow egress connectivity as per https://developers.cloudflare.com/cloudflare-one/connections/connect-apps/configuration/ports-and-ips/ If you are using private routing to this Tunnel, then UDP (and Private DNS Resolution) will not work unless your cloudflared can connect with Cloudflare Network with quic. connIndex=0 ip=198.41.200.73
2023-02-24T19:23:26Z INF Switching to fallback protocol http2 connIndex=0 ip=198.41.200.73
2023-02-24T19:23:27Z INF Connection 0052f011-91d6-4a8b-a1e2-96b174ae8269 registered with protocol: http2 connIndex=0 ip=198.41.200.13 location=FCO
2023-02-24T19:23:27Z INF Connection 7243036b-689e-4774-9347-df7ccc4ac6db registered with protocol: http2 connIndex=1 ip=198.41.192.7 location=MXP
2023-02-24T19:23:27Z INF Warp-routing is enabled
2023-02-24T19:23:27Z INF Updated to new configuration config="{\"warp-routing\":{\"enabled\":true}}" version=1
2023-02-24T19:23:28Z INF Connection b88a9007-b4f4-4a75-9c07-1fb8a28d679a registered with protocol: http2 connIndex=2 ip=198.41.200.193 location=FCO
2023-02-24T19:23:30Z INF Connection 45bf6fc2-2a7d-4cb6-94bc-10ae8cb3044c registered with protocol: http2 connIndex=3 ip=198.41.192.27 location=MXP
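
The registration entries at the end of this log follow the layout described earlier in the thread: four connections split across two data centers, with connection indexes 0 and 2 on one and 1 and 3 on the other. A small sketch for grouping them is shown below; `connections_by_location` and its regular expression are illustrative assumptions based only on these log lines, and the connection UUIDs are shortened to a placeholder in the sample.

```python
import re
from collections import defaultdict
from typing import Dict, List, Tuple

# Pattern taken from the "registered with protocol" entries in the log above.
REGISTERED = re.compile(
    r'registered with protocol: (?P<protocol>\S+) '
    r'connIndex=(?P<conn>\d+) ip=(?P<ip>\S+) location=(?P<location>\S+)'
)


def connections_by_location(log_text: str) -> Dict[str, List[Tuple[int, str]]]:
    """Group (connIndex, protocol) pairs by the data-center location code."""
    grouped = defaultdict(list)
    # finditer works whether the log is one entry per line or a single long line.
    for match in REGISTERED.finditer(log_text):
        grouped[match.group("location")].append(
            (int(match.group("conn")), match.group("protocol"))
        )
    return dict(grouped)


# The four registration entries from the log above, UUIDs replaced by <id>.
SAMPLE = (
    "INF Connection <id> registered with protocol: http2 connIndex=0 ip=198.41.200.13 location=FCO "
    "INF Connection <id> registered with protocol: http2 connIndex=1 ip=198.41.192.7 location=MXP "
    "INF Connection <id> registered with protocol: http2 connIndex=2 ip=198.41.200.193 location=FCO "
    "INF Connection <id> registered with protocol: http2 connIndex=3 ip=198.41.192.27 location=MXP"
)

if __name__ == "__main__":
    for location, conns in connections_by_location(SAMPLE).items():
        print(location, conns)
```

Here every connection ended up on http2; in the partial-failure case described earlier, one location would show quic and the other http2.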

zzduci commented 1 year ago

@DCCInterstellar Publish Argument tunnel --protocol http2 --no-autoupdate run --token xxx

Thanks, it works!

I live in China and use cloudflare tunnel to hide my traffic amount other cloudflare users. QUIC completely fails this purpose as no other user connect to cloudflare in QUIC, so my connection was interrupted multiple times a day and I had to manually restart the container. By changing QUIC to http2, the problem is completely solved.

The tunnel does become healthy, but after I exit the command cloudflared tunnel --protocol http2 --no-autoupdate run --token <your own token>, it goes back to the degraded state.

brpaz commented 1 year ago

Same problem with 2023.05 version. Here are my debug logs:

2023-05-13T13:53:12Z INF ICMP proxy will use 10.0.0.47 as source for IPv4
2023-05-13T13:53:12Z INF ICMP proxy will use fe80::ec90:adff:feb5:4a94 in zone eth0 as source for IPv6
2023-05-13T13:53:13Z DBG edge discovery: looking up edge SRV record domain=_v2-origintunneld._tcp.argotunnel.com event=0
2023-05-13T13:53:13Z DBG edge discovery: resolved edge addresses addresses=["198.41.192.227","198.41.192.7","198.41.192.27","198.41.192.107","198.41.192.47","198.41.192.57","198.41.192.37","198.41.192.67","198.41.192.167","198.41.192.77","2606:4700:a0::9","2606:4700:a0::1","2606:4700:a0::3","2606:4700:a0::4","2606:4700:a0::6","2606:4700:a0::5","2606:4700:a0::10","2606:4700:a0::7","2606:4700:a0::8","2606:4700:a0::2"] event=0
2023-05-13T13:53:13Z DBG edge discovery: resolved edge addresses addresses=["198.41.200.233","198.41.200.33","198.41.200.113","198.41.200.23","198.41.200.13","198.41.200.43","198.41.200.53","198.41.200.63","198.41.200.193","198.41.200.73","2606:4700:a8::4","2606:4700:a8::6","2606:4700:a8::9","2606:4700:a8::1","2606:4700:a8::5","2606:4700:a8::7","2606:4700:a8::10","2606:4700:a8::8","2606:4700:a8::3","2606:4700:a8::2"] event=0
2023-05-13T13:53:17Z INF Starting metrics server on [::]:2000/metrics
2023-05-13T13:53:17Z DBG edge discovery: looking up edge SRV record domain=_v2-origintunneld._tcp.argotunnel.com event=0
2023-05-13T13:53:18Z DBG edge discovery: resolved edge addresses addresses=["198.41.192.67","198.41.192.227","198.41.192.37","198.41.192.47","198.41.192.107","198.41.192.77","198.41.192.167","198.41.192.27","198.41.192.7","198.41.192.57","2606:4700:a0::5","2606:4700:a0::6","2606:4700:a0::4","2606:4700:a0::2","2606:4700:a0::8","2606:4700:a0::9","2606:4700:a0::7","2606:4700:a0::10","2606:4700:a0::3","2606:4700:a0::1"] event=0
2023-05-13T13:53:18Z DBG edge discovery: resolved edge addresses addresses=["198.41.200.193","198.41.200.73","198.41.200.33","198.41.200.13","198.41.200.63","198.41.200.233","198.41.200.53","198.41.200.23","198.41.200.113","198.41.200.43","2606:4700:a8::4","2606:4700:a8::1","2606:4700:a8::9","2606:4700:a8::2","2606:4700:a8::6","2606:4700:a8::5","2606:4700:a8::8","2606:4700:a8::3","2606:4700:a8::7","2606:4700:a8::10"] event=0
2023-05-13T13:53:18Z DBG edge discovery: giving new address to connection connIndex=0 event=0 ip=198.41.200.113
2023/05/13 13:53:18 failed to sufficiently increase receive buffer size (was: 208 kiB, wanted: 2048 kiB, got: 416 kiB). See https://github.com/lucas-clemente/quic-go/wiki/UDP-Receive-Buffer-Size for details.
2023-05-13T13:53:24Z ERR Failed to create new quic connection error="failed to dial to edge with quic: timeout: no recent network activity" connIndex=0 event=0 ip=198.41.200.113
2023-05-13T13:53:24Z DBG edge discovery: giving new address to connection available=19 connIndex=0 event=0 ip=198.41.200.233
2023-05-13T13:53:24Z INF Retrying connection in up to 2s connIndex=0 event=0 ip=198.41.200.113
2023-05-13T13:53:24Z DBG edge discovery: returning same edge address back to pool connIndex=0 event=0 ip=198.41.200.233
2023-05-13T13:53:29Z ERR Failed to create new quic connection error="failed to dial to edge with quic: timeout: no recent network activity" connIndex=0 event=0 ip=198.41.200.233
2023-05-13T13:53:29Z DBG edge discovery: giving new address to connection available=19 connIndex=0 event=0 ip=198.41.200.53
2023-05-13T13:53:29Z INF Retrying connection in up to 4s connIndex=0 event=0 ip=198.41.200.233
2023-05-13T13:53:31Z DBG edge discovery: returning same edge address back to pool connIndex=0 event=0 ip=198.41.200.5

DCCInterstellar commented 1 year ago

Failed to serve quic connection

What is a QUIC connection? QUIC is a newer encrypted transport-layer network protocol intended to make HTTP traffic more secure, efficient, and faster. However, some ISPs do not support this protocol, which can interfere with your Cloudflare Tunnel.

There is a high possibility that your ISP doesn't support it, in which case your tunnel will continue to show these errors. There are two options to fix this.

OPTION 1: If your Cloudflared template is using the latest repository tag, as below:

cloudflare/cloudflared:latest

Add another Variable to the template. You can give the Variable any name you like.

Key: TUNNEL_TRANSPORT_PROTOCOL

There are four different values you can use: auto, http2, h2mux, and quic (choose one).

In our case, we will use the http2 protocol to fix the QUIC connection error.

Once the Variable has been added, restart the container. The QUIC connection error should be resolved.

OPTION 2: In Cloudflared Template, change the Repository to:

cloudflare/cloudflared:2022.3.4

This downgrades the Cloudflared Docker container to version 2022.3.4. It will use the http2 protocol but won't have the latest security patches for the tunnel, so staying on the latest version (Option 1) is recommended.
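
For anyone not using a container template, Option 1 could be sketched as a plain docker run. This is only a sketch under assumptions: it assumes a token-based tunnel, and the TUNNEL_TOKEN_VALUE variable, the container name "cloudflared", the cloudflared_http2 function, and the DRY_RUN switch are placeholders made up for illustration. By default it only prints the command it would run.

```shell
#!/bin/sh
# Sketch of Option 1 as a plain `docker run` (dry run by default).
# Assumptions: a token-based tunnel; TUNNEL_TOKEN_VALUE, the container name
# "cloudflared", and the DRY_RUN switch are placeholders for illustration.
TUNNEL_TOKEN_VALUE="${TUNNEL_TOKEN_VALUE:-YOUR_TOKEN_HERE}"

cloudflared_http2() {
  # TUNNEL_TRANSPORT_PROTOCOL=http2 pins the transport instead of QUIC.
  set -- docker run -d --name cloudflared \
    -e TUNNEL_TRANSPORT_PROTOCOL=http2 \
    cloudflare/cloudflared:latest \
    tunnel --no-autoupdate run --token "$TUNNEL_TOKEN_VALUE"
  if [ "${DRY_RUN:-1}" = "0" ]; then
    "$@"          # actually start the container
  else
    echo "$*"     # only print the command that would be run
  fi
}

cloudflared_http2
```

Setting DRY_RUN=0 runs the command for real. The --protocol http2 flag used earlier in this thread should have the same effect as the environment variable.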

mailinglists35 commented 5 months ago

Still seeing this with the latest release as of today: arm64 deb on Ubuntu 16.04 LTS (yeah, I know... it's an ODROID-C2 device and I cannot easily upgrade it, but hey, thanks for making the daemon compatible with it!)

Working (connector id: f21ac8bc-e89e-4c94-9390-ac0140b96e65): cloudflared tunnel --protocol http2 --url http://localhost:8000/

Failing (connector id: 2b478358-9ac5-408d-86c4-16d63064cbe9): cloudflared tunnel --url http://localhost:8000/

netcat does connect to any of the target servers, but only after some delay; it does not happen instantly.

maybe the connector gives up prematurely?

here's an example:

2024-03-22T23:55:41Z ERR Failed to create new quic connection error="failed to dial to edge with quic: timeout: handshake did not complete in time" connIndex=0 event=0 ip=198.41.200.193
2024-03-22T23:55:41Z INF Retrying connection in up to 1m4s connIndex=0 event=0 ip=198.41.200.193
$ date; echo quit | nc -v -u -q 1 198.41.200.23 7844; date
Sat Mar 23 02:59:00 +03 2024
Connection to 198.41.200.23 7844 port [udp/*] succeeded!
Sat Mar 23 02:59:04 +03 2024

MCheping8108 commented 4 months ago

How do I configure cloudflared.exe? My Windows host can't run Docker.

CDSFounder commented 1 week ago

Hi all, what is the latest on this? I am trying to get UDP packets to work on Cloudflare WARP (cloudflared), and I understand that it must use the QUIC protocol for this to work.

When I force the QUIC protocol, I get the following errors in the event log:

2024-09-09T02:24:59Z INF Starting tunnel tunnelID=0d23cb9a-b621-4a2e-8ddc-d7fb1b4b0d45
2024-09-09T02:24:59Z INF Version 2024.8.3
2024-09-09T02:24:59Z INF GOOS: windows, GOVersion: go1.22.2-devel-cf, GoArch: amd64
2024-09-09T02:24:59Z INF Settings: map[loglevel:debug p:auto protocol:auto token:*****]
2024-09-09T02:24:59Z INF cloudflared will not automatically update on Windows systems.
2024-09-09T02:24:59Z INF Generated Connector ID: 643f0835-be46-4ef0-be49-34788dfb5a17
2024-09-09T02:24:59Z DBG Fetched protocol: quic
2024-09-09T02:24:59Z INF Initial protocol quic
2024-09-09T02:24:59Z INF ICMP proxy will use 10.10.11.11 as source for IPv4
2024-09-09T02:24:59Z INF ICMP proxy will use fe80::281f:1330:df8f:4ecd in zone Ethernet 3 as source for IPv6
2024-09-09T02:24:59Z DBG edge discovery: looking up edge SRV record domain=_v2-origintunneld._tcp.argotunnel.com event=0
2024-09-09T02:24:59Z DBG edge discovery: resolved edge addresses addresses=["198.41.192.67","198.41.192.7","198.41.192.77","198.41.192.107","198.41.192.227","198.41.192.47","198.41.192.27","198.41.192.57","198.41.192.167","198.41.192.37"] event=0
2024-09-09T02:24:59Z DBG edge discovery: resolved edge addresses addresses=["198.41.200.53","198.41.200.233","198.41.200.73","198.41.200.113","198.41.200.193","198.41.200.13","198.41.200.33","198.41.200.63","198.41.200.23","198.41.200.43"] event=0
2024-09-09T02:24:59Z INF Starting metrics server on 127.0.0.1:59109/metrics
2024-09-09T02:24:59Z DBG edge discovery: looking up edge SRV record domain=_v2-origintunneld._tcp.argotunnel.com event=0
2024-09-09T02:24:59Z DBG edge discovery: resolved edge addresses addresses=["198.41.192.67","198.41.192.7","198.41.192.77","198.41.192.107","198.41.192.227","198.41.192.47","198.41.192.27","198.41.192.57","198.41.192.167","198.41.192.37"] event=0
2024-09-09T02:24:59Z DBG edge discovery: resolved edge addresses addresses=["198.41.200.53","198.41.200.233","198.41.200.73","198.41.200.113","198.41.200.193","198.41.200.13","198.41.200.33","198.41.200.63","198.41.200.23","198.41.200.43"] event=0
2024-09-09T02:24:59Z DBG edge discovery: giving new address to connection connIndex=0 event=0 ip=198.41.200.13
2024-09-09T02:25:04Z ERR Failed to create new quic connection error="failed to dial to edge with quic: timeout: no recent network activity" connIndex=0 event=0 ip=198.41.200.13
2024-09-09T02:25:04Z DBG edge discovery: giving new address to connection available=19 connIndex=0 event=0 ip=198.41.192.37
2024-09-09T02:25:04Z INF Retrying connection in up to 2s connIndex=0 event=0 ip=198.41.200.13
2024-09-09T02:25:06Z DBG edge discovery: returning same edge address back to pool connIndex=0 event=0 ip=198.41.192.37
2024-09-09T02:25:11Z ERR Failed to create new quic connection error="failed to dial to edge with quic: timeout: no recent network activity" connIndex=0 event=0 ip=198.41.192.37
2024-09-09T02:25:11Z DBG edge discovery: giving new address to connection available=19 connIndex=0 event=0 ip=198.41.200.33
2024-09-09T02:25:11Z INF Retrying connection in up to 4s connIndex=0 event=0 ip=198.41.192.37
2024-09-09T02:25:14Z DBG edge discovery: returning same edge address back to pool connIndex=0 event=0 ip=198.41.200.33
2024-09-09T02:25:19Z ERR Failed to create new quic connection error="failed to dial to edge with quic: timeout: no recent network activity" connIndex=0 event=0 ip=198.41.200.33
2024-09-09T02:25:19Z DBG edge discovery: giving new address to connection available=19 connIndex=0 event=0 ip=198.41.200.53
2024-09-09T02:25:19Z INF Retrying connection in up to 8s connIndex=0 event=0 ip=198.41.200.33
2024-09-09T02:25:22Z DBG edge discovery: returning same edge address back to pool connIndex=0 event=0 ip=198.41.200.53
2024-09-09T02:25:27Z ERR Failed to create new quic connection error="failed to dial to edge with quic: timeout: no recent network activity" connIndex=0 event=0 ip=198.41.200.53
2024-09-09T02:25:27Z DBG edge discovery: giving new address to connection available=19 connIndex=0 event=0 ip=198.41.200.73
2024-09-09T02:25:27Z INF Retrying connection in up to 16s connIndex=0 event=0 ip=198.41.200.53
2024-09-09T02:25:32Z DBG edge discovery: returning same edge address back to pool connIndex=0 event=0 ip=198.41.200.73
2024-09-09T02:25:37Z ERR Failed to create new quic connection error="failed to dial to edge with quic: timeout: no recent network activity" connIndex=0 event=0 ip=198.41.200.73
2024-09-09T02:25:37Z DBG edge discovery: giving new address to connection available=19 connIndex=0 event=0 ip=198.41.200.53
2024-09-09T02:25:37Z INF Retrying connection in up to 32s connIndex=0 event=0 ip=198.41.200.73

We have all egress allowed - nothing should be blocking UDP. Any ideas what the real fix is here?
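
When comparing long debug logs like the ones in this thread, a small filter can make the retry pattern easier to see. The following is a minimal sketch, assuming the log has been saved to a file; the helper name summarize_quic_failures and the file name in the usage comment are hypothetical. It prints the protocol value from the Settings entry (protocol:auto in the log above) and counts the failed QUIC dial attempts.

```shell
#!/bin/sh
# Hypothetical helper: summarize QUIC dial failures in a saved cloudflared log.
# grep -o counts matches rather than lines, so it still works when several
# log entries were pasted onto one line.
summarize_quic_failures() {
  log_file="$1"
  # Protocol value from the "Settings: map[...]" entry, if one is present.
  setting=$(grep -oE 'protocol:[a-z0-9]+' "$log_file" | head -n 1)
  # Number of failed QUIC dial attempts.
  failures=$(grep -o 'Failed to create new quic connection' "$log_file" | wc -l | tr -d ' ')
  echo "protocol setting: ${setting:-not found}"
  echo "quic dial failures: $failures"
}

# Usage: sh summarize.sh cloudflared.log
if [ -f "${1:-}" ]; then
  summarize_quic_failures "$1"
fi
```

A count that keeps growing while the same log has no http2 entries would be consistent with the behaviour described in the original report, where the connector keeps retrying QUIC instead of falling back.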