cloudflare / cloudflared

Cloudflare Tunnel client (formerly Argo Tunnel)
https://developers.cloudflare.com/cloudflare-one/connections/connect-apps/install-and-setup/tunnel-guide
Apache License 2.0

🐛 Cloudflare Zero Trust Access ICMP Beta #854

Closed wind0r closed 1 year ago

wind0r commented 1 year ago

Describe the bug We are part of the ICMP beta for Zero Trust Access. ICMP is working, but it looks like packets get modified in transit: the maximum TTL seems to be capped at 3.

To Reproduce Steps to reproduce the behavior:

  1. Configure Cloudflare VPN with ICMP support
  2. Run mtr to some host in your network.
  3. Check the response

Expected behavior mtr shows more than 3 entries.

Environment and versions

Logs and errors MTR Log via Cloudflare VPN:

mtr -p $HOST
1 162.158.85.95 0 1 1 7 7 7
2 10.11.20.187 0 1 1 8 8 8
3 10.40.0.0 0 1 1 37 37 37

Additional context MTR Log when started on the server where Cloudflare is running

mtr -p $HOST
1 10.40.64.2 0 1 1 12 12 12
2 10.40.64.1 0 1 1 13 13 13
3 10.40.18.62 0 1 1 12 12 12
4 10.40.0.0 0 1 1 13 13 13

MTR Log via another VPN (not cloudflared)

mtr -p $HOST
1 172.27.232.1 0 1 1 12 12 12
2 10.40.64.2 0 1 1 17 17 17
3 10.40.64.1 0 1 1 28 28 28
4 10.40.18.62 0 1 1 18 18 18
5 10.40.0.0 0 1 1 40 40 40

nmldiegues commented 1 year ago

Hello @wind0r

The MTR through Cloudflare is showing:

  1. a Cloudflare Network hop (162.158.85.95 is a machine we operate in our anycast network)
  2. the Cloudflare Tunnel hop (10.11.20.187 is the machine you operate running cloudflared daemon)
  3. the origin, as reachable from the Tunnel machine

We do not limit TTL to 3. What's happening is that your machine (operating Tunnel) is able to reach your origin within 1 hop.

The "MTR Log when started on the server where Cloudflare is running" is very odd because it does not show the IP 10.11.20.187 that we see in hop #2 of the MTR through Cloudflare. You are probably in a better position to clarify this, since the discrepancy makes it hard to compare the 1st MTR with the 2nd MTR.

wind0r commented 1 year ago

Hello @nmldiegues

Thanks for the fast response, and I am sorry for the confusing post. We currently use 2 hosts with cloudflared, and I used the wrong host for the mtr example.

Technically, neither host can reach the target directly. The VPN hosts are EC2 nodes; behind them comes our AWS/backbone peering, and then several nodes/jumps within our backbone. So it shouldn't be possible to reach the origin within 3 hops via the VPN.


nmldiegues commented 1 year ago

Gotcha, that makes more sense, thanks for clarifying it to me.

In that case, I now better understand how this relates to what we're providing here. The cloudflared tunnel daemon is a Layer 4 proxy that runs purely in userland, without root/admin privileges. So when adding ICMP, we faced a challenge (actually, different ones, depending on the operating system) as to how we could do ICMP from userland and without root/admin.
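To illustrate the userland constraint, here is a minimal sketch of one way to send ICMP echo requests without root on Linux: an unprivileged ICMP datagram socket (`SOCK_DGRAM` with `IPPROTO_ICMP`), which the kernel only allows for groups listed in the `net.ipv4.ping_group_range` sysctl. Whether cloudflared uses exactly this mechanism is an assumption; the thread does not say, and the approach differs per operating system. The function names below are hypothetical.

```python
import socket
import struct


def icmp_checksum(data: bytes) -> int:
    """Internet checksum (RFC 1071): one's-complement sum of 16-bit words."""
    if len(data) % 2:
        data += b"\x00"  # pad to a whole number of 16-bit words
    total = 0
    for (word,) in struct.iter_unpack("!H", data):
        total += word
    while total >> 16:  # fold carries back into the low 16 bits
        total = (total & 0xFFFF) + (total >> 16)
    return ~total & 0xFFFF


def build_echo_request(ident: int, seq: int, payload: bytes = b"") -> bytes:
    """Build an ICMP echo request (type 8, code 0) with a valid checksum."""
    header = struct.pack("!BBHHH", 8, 0, 0, ident, seq)
    csum = icmp_checksum(header + payload)
    return struct.pack("!BBHHH", 8, 0, csum, ident, seq) + payload


def open_unprivileged_icmp_socket(ttl: int = 64) -> socket.socket:
    """Open a non-root ICMP socket (Linux; needs net.ipv4.ping_group_range).

    Assumption: this is only one possible mechanism, not necessarily
    what cloudflared does. Raises OSError if the kernel refuses.
    """
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM, socket.IPPROTO_ICMP)
    sock.setsockopt(socket.IPPROTO_IP, socket.IP_TTL, ttl)
    return sock
```

A socket like this can only originate its own echo requests; it cannot forward someone else's packet with its TTL intact, which is consistent with the proxy creating a new ICMP flow rather than routing the original one.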

The result is that cloudflared is not truly routing the original ICMP flow (neither from downstream nor from upstream). It acts as a middle-man for 2 ICMP flows:

  1. from your WARP device -> Cloudflare Network -> cloudflared tunnel machine
  2. cloudflared tunnel machine -> path to origin -> private origin

But what you see is just flow 1 stitched together with a "fake" representation of flow 2. Hence, given our implementation, it is expected that on every OS you'll always see only 1 hop after the cloudflared tunnel machine: the one to the origin. From our perspective, we've got observability on what matters: the path from the client device to the Tunnel in the private network. If you ever find yourself wondering what's happening in the last hop (which we abstract as a single hop given the limitation above), then you can always do what you did: hop on the machine in the private network and run the traceroute to the private origin.
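As a toy model of the stitching described above (not cloudflared's actual code), the hop list a client sees can be thought of as every hop of flow 1 followed by a single entry for the origin, however many hops flow 2 really takes. The function name `visible_hops` is hypothetical. The flow 2 addresses are taken from the reporter's second MTR log, which came from a different cloudflared host, so the combination is illustrative only.

```python
from typing import List


def visible_hops(client_to_tunnel: List[str], tunnel_to_origin: List[str]) -> List[str]:
    """Model the client-side view of a trace through the tunnel.

    Flow 1 (client -> Cloudflare network -> tunnel machine) is shown hop by
    hop. Flow 2 (tunnel machine -> origin) is collapsed into one entry: the
    origin itself, which is the last element of tunnel_to_origin.
    """
    if not tunnel_to_origin:
        raise ValueError("tunnel_to_origin must end with the origin address")
    origin = tunnel_to_origin[-1]
    return list(client_to_tunnel) + [origin]


# Flow 1 as reported in the MTR through Cloudflare.
flow_1 = ["162.158.85.95", "10.11.20.187"]
# Flow 2: illustrative path inside the private network (see note above).
flow_2 = ["10.40.64.2", "10.40.64.1", "10.40.18.62", "10.40.0.0"]

print(visible_hops(flow_1, flow_2))
# ['162.158.85.95', '10.11.20.187', '10.40.0.0']
```

The three entries match the shape of the MTR through Cloudflare in the original report: the intermediate hops of flow 2 never appear, regardless of how long that path is.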

We have a blog post coming out soon that will explain the implementation behind this. Keep an eye out for https://blog.cloudflare.com/

As for this bug report: the behavior is expected, by design, because cloudflared does not route the original ICMP flow. Instead it creates a new one, which we then "fake/stitch" into the original one, all so that we can avoid requiring root/admin.

Hence, I'll be marking this as not a bug, but feel free to keep discussing!