slackhq / nebula

A scalable overlay networking tool with a focus on performance, simplicity and security
MIT License

Feature request: Support UDP/TCP port forwarding to a host without setting up a tun #1014

Open johnmaguire opened 10 months ago

johnmaguire commented 10 months ago

Allow binding another Nebula node's port as a local port on the machine. This would allow tun-less access to that host/port combo.

Discussed in https://github.com/slackhq/nebula/discussions/1011

Originally posted by **nick008a** November 12, 2023:
Feature request: support UDP/TCP port forwarding without setting up a tun. This will allow nebula to work without root, and one could further set up a SOCKS proxy with this to enable #915.

Originally posted by **nick008a** November 12, 2023:
I was looking for something like ngrok, which allows you to tunnel a port to a remote machine (a local machine can access 127.0.0.1:8080 to talk to, say, 127.0.0.1:8081 on a remote machine without setting up a tun).

aa51513 commented 10 months ago

I'm also looking forward to this feature.

sybrensa commented 9 months ago

Yes, something like this would be great! I'm currently using SSH for this, but still need a working tun adapter or direct connection.

cre4ture commented 2 months ago

I'm also interested in this. I would have some time to implement it if it's not yet in progress by someone else. Even though I'm new to golang, I guess I understand the basic principles and should be able to find a way.

@johnmaguire do you know if someone is currently working on the topic?

sybrensa commented 2 months ago

> I'm also interested in this. I would have some time to implement it if it's not yet in progress by someone else. Even though I'm new to golang, I guess I understand the basic principles and should be able to find a way.
>
> @johnmaguire do you know if someone is currently working on the topic?

I don't think I can contribute anything to the programming, but if you need somebody to test or write docs/readme or something I'd be happy to help!

cre4ture commented 2 months ago

@sybrensa Thanks for offering support. I'll let you know as soon as there is something halfway stable. ;-)

I started already with doing some first tries:

  1. UDP forwarding with manually creating and parsing the UDP headers and checksums. This basically works already, but I think with limited functionality: there may only be one client to the UDP server port.
  2. I started with TCP the same way. But I quickly figured out that the TCP stack is much more complex than the UDP one, such that a manual implementation of it just doesn't make sense.
     2.1. Custom TCP stack: Too complex.
     2.2. Use of the Wireguard stack implementation: Documentation totally missing. No way.
     2.3. Use gvisor (google/netstack): Seems feasible. I started, but then by chance figured out that this stack is already used in service.go. According to PR #965, this was added for the case where the nebula "endpoint" shall be used directly from within a separate application, so that there is no need for a TUN device and thus no need for root rights.

This PR #965 seems to be the perfect preparation for this feature. My next try will therefore be based on this.

By the way, I hope nobody else is already working on the topic. @nbrownus do you know who might know this?
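
The manual UDP header and checksum handling mentioned in step 1 could look roughly like the sketch below. It is not the code from the experiment; it assumes IPv4 only, and the names `internetChecksum`, `pseudoHeader`, `buildUDP` and `parseUDP` are hypothetical helpers chosen for illustration. The checksum follows the one's-complement scheme described in RFC 768 and RFC 1071.

```go
package main

import (
	"encoding/binary"
	"errors"
	"fmt"
)

// internetChecksum returns the 16-bit one's-complement checksum (RFC 1071).
func internetChecksum(data []byte) uint16 {
	var sum uint32
	for i := 0; i+1 < len(data); i += 2 {
		sum += uint32(binary.BigEndian.Uint16(data[i:]))
	}
	if len(data)%2 == 1 {
		sum += uint32(data[len(data)-1]) << 8 // pad the odd byte with zero
	}
	for sum>>16 != 0 {
		sum = (sum & 0xffff) + (sum >> 16) // fold carries back in
	}
	return ^uint16(sum)
}

// pseudoHeader builds the IPv4 pseudo-header that the UDP checksum covers.
func pseudoHeader(src, dst [4]byte, udpLen int) []byte {
	p := make([]byte, 12)
	copy(p[0:4], src[:])
	copy(p[4:8], dst[:])
	p[9] = 17 // IP protocol number for UDP
	binary.BigEndian.PutUint16(p[10:], uint16(udpLen))
	return p
}

// buildUDP returns header plus payload with the checksum field filled in.
func buildUDP(src, dst [4]byte, srcPort, dstPort uint16, payload []byte) []byte {
	seg := make([]byte, 8+len(payload))
	binary.BigEndian.PutUint16(seg[0:], srcPort)
	binary.BigEndian.PutUint16(seg[2:], dstPort)
	binary.BigEndian.PutUint16(seg[4:], uint16(len(seg)))
	copy(seg[8:], payload)
	sum := internetChecksum(append(pseudoHeader(src, dst, len(seg)), seg...))
	if sum == 0 {
		sum = 0xffff // a zero field would mean "no checksum" over IPv4
	}
	binary.BigEndian.PutUint16(seg[6:], sum)
	return seg
}

// parseUDP checks length and checksum, then returns the ports and payload.
func parseUDP(src, dst [4]byte, seg []byte) (srcPort, dstPort uint16, payload []byte, err error) {
	if len(seg) < 8 || int(binary.BigEndian.Uint16(seg[4:])) != len(seg) {
		return 0, 0, nil, errors.New("bad UDP length")
	}
	if binary.BigEndian.Uint16(seg[6:]) != 0 { // zero: sender omitted the checksum
		if internetChecksum(append(pseudoHeader(src, dst, len(seg)), seg...)) != 0 {
			return 0, 0, nil, errors.New("bad UDP checksum")
		}
	}
	return binary.BigEndian.Uint16(seg[0:]), binary.BigEndian.Uint16(seg[2:]), seg[8:], nil
}

func main() {
	src, dst := [4]byte{192, 168, 100, 1}, [4]byte{192, 168, 100, 92}
	seg := buildUDP(src, dst, 3399, 4499, []byte("hello"))
	sp, dp, payload, err := parseUDP(src, dst, seg)
	fmt.Println(sp, dp, string(payload), err)
}
```

A datagram built this way carries no per-client state, which fits the limitation noted above: supporting several clients on one UDP server port would need extra bookkeeping to map replies back to the right sender.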

cre4ture commented 2 months ago

@sybrensa please have a look at my PR #1179. It's in a shape such that it can be tested, I think. I personally have only done some netcat-based manual tests so far, but I will also extend this to some real applications running on it.

sybrensa commented 2 months ago

> @sybrensa please have a look at my PR #1179. It's in a shape such that it can be tested, I think. I personally have only done some netcat-based manual tests so far, but I will also extend this to some real applications running on it.

Hey, I'm looking at it right now. From what I can tell from the provided config example, do I need to configure nebula clients on both ends to make this work, or can it be used similar to the "unsafe_routes" or an SSH tunnel too? (Basically only specifying on one client to expose localhost:1234 to 192.168.100.60:80, or even expose localhost:1234 to :80 VIA 192.168.100.60?)

I'll try at least with some real scenarios where live video can be streamed over the tunnel using TCP.

cre4ture commented 2 months ago

> ... or can it be used similar to the "unsafe_routes" or an SSH tunnel too? (Basically only specifying on one client to expose localhost:1234 to 192.168.100.60:80, or even expose localhost:1234 to :80 VIA 192.168.100.60?)

I designed it with SSH tunnels in mind. So you should be able to use "unsafe_routes" when the config file allows it for the normal use case. But I haven't tested this so far.

sybrensa commented 2 months ago

> > ... or can it be used similar to the "unsafe_routes" or an SSH tunnel too? (Basically only specifying on one client to expose localhost:1234 to 192.168.100.60:80, or even expose localhost:1234 to :80 VIA 192.168.100.60?)
>
> I designed it with SSH tunnels in mind. So you should be able to use "unsafe_routes" when the config file allows it for the normal use case. But I haven't tested this so far.

Alright I'll give it a try. I can already confirm that an RTSP video stream seems to work well so far (10 minutes and counting). So there's the first real application for you! :-)

I did notice that when forwarding the webpage from the device nebula is running on, the logs get flooded with these messages: (image)

It doesn't seem to impact the performance at all, just curious why this happens.

cre4ture commented 2 months ago

It's nice to hear that it already performs well for some real use cases. Thanks for the feedback :-) I think I need to tune the log output more, because the "EOF"s are normal for a terminating TCP connection. That's for sure not an issue. The other logs ("use of closed ...") could be related to this, as there are always two goroutines per connection (forwarding outgoing and incoming data). But I will try to reproduce this and double-check it.
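
To illustrate why two goroutines per connection can produce both kinds of log message, here is a minimal sketch of a plain TCP forwarder. It is not the code from PR #1179: it assumes ordinary `net.Dial` on both sides instead of the gvisor stack, and `benign`, `pipe`, `forward` and `demoRoundTrip` are hypothetical names. When one direction finishes and closes the sockets, the goroutine for the other direction typically fails with "use of closed network connection" (`net.ErrClosed`), which is harmless.

```go
package main

import (
	"errors"
	"fmt"
	"io"
	"log"
	"net"
)

// benign reports whether err is expected when a connection shuts down:
// a plain EOF, or I/O on a socket the other goroutine already closed.
func benign(err error) bool {
	return err == nil || errors.Is(err, io.EOF) || errors.Is(err, net.ErrClosed)
}

// pipe copies src to dst in one direction. When the copy ends it closes both
// sockets, which unblocks the goroutine copying in the opposite direction.
func pipe(dst, src net.Conn, direction string) {
	_, err := io.Copy(dst, src)
	dst.Close()
	src.Close()
	if !benign(err) {
		log.Printf("%s copy failed: %v", direction, err)
	}
}

// forward relays every connection accepted on ln to target, using one
// goroutine per direction. It returns once ln is closed.
func forward(ln net.Listener, target string) error {
	for {
		client, err := ln.Accept()
		if err != nil {
			return err
		}
		go func() {
			remote, err := net.Dial("tcp", target)
			if err != nil {
				log.Printf("dial %s: %v", target, err)
				client.Close()
				return
			}
			go pipe(remote, client, "outgoing")
			pipe(client, remote, "incoming")
		}()
	}
}

// demoRoundTrip starts a local echo server, forwards a second local port to
// it, and returns what comes back when msg is sent through the forwarder.
func demoRoundTrip(msg string) (string, error) {
	echo, err := net.Listen("tcp", "127.0.0.1:0")
	if err != nil {
		return "", err
	}
	defer echo.Close()
	go func() {
		for {
			c, err := echo.Accept()
			if err != nil {
				return
			}
			go func() { defer c.Close(); io.Copy(c, c) }()
		}
	}()
	ln, err := net.Listen("tcp", "127.0.0.1:0")
	if err != nil {
		return "", err
	}
	defer ln.Close()
	go forward(ln, echo.Addr().String())

	c, err := net.Dial("tcp", ln.Addr().String())
	if err != nil {
		return "", err
	}
	defer c.Close()
	if _, err := c.Write([]byte(msg)); err != nil {
		return "", err
	}
	buf := make([]byte, len(msg))
	if _, err := io.ReadFull(c, buf); err != nil {
		return "", err
	}
	return string(buf), nil
}

func main() {
	got, err := demoRoundTrip("hello over the forwarded port")
	fmt.Println(got, err)
}
```

Filtering with `benign` before logging is one way to keep normal connection teardown out of the logs while still reporting genuine failures.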

sybrensa commented 2 months ago

I tried to get it working with an unsafe_route, but haven't succeeded yet. Perhaps I'm doing it wrong. Note that I've never used unsafe_routes before, so the configuration might be wrong elsewhere.

I've created a cert with the subnet 10.115.2.0/23 and ran it on client A.

On client B, I've set up the following config:

tun:
  user: true
  disabled: false
  dev: nebula1
  drop_local_broadcast: false
  drop_multicast: false
  tx_queue: 500
  mtu: 1300
  unsafe_routes:
    - route: 10.115.2.0/23
      via: 100.104.2.2

port_tunnel:
  outgoing:
    udp:
    # format of local and remote address: <host/ip>:<port>
    #- local_address: 127.0.0.1:3399
    #  remote_address: 192.168.100.92:4499
    tcp:
    # format of local and remote address: <host/ip>:<port>
    - local_address: 127.0.0.1:3399
      remote_address: 100.104.8.2:1554
    - local_address: 127.0.0.1:4499
      remote_address: 10.115.3.55:80

When trying to access localhost:4499, nothing happens, but the logs on client B say the following:

INFO[0042] Handshake timed out durationNs=6774611742 handshake="map[stage:1 style:ix_psk0]" initiatorIndex=4255603315 localIndex=4255603315 remoteIndex=0 udpAddrs="[]" vpnIp=10.115.3.55
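
The `<host/ip>:<port>` format noted in the config comments can be checked up front with the standard library. The sketch below is only an illustration under assumptions: `splitAddress` is a hypothetical helper, not part of nebula or the PR, and it rejects an empty host and port 0.

```go
package main

import (
	"fmt"
	"net"
	"strconv"
)

// splitAddress validates an address of the form "<host/ip>:<port>" and
// returns its parts. An empty host or a port outside 1-65535 is rejected.
func splitAddress(addr string) (host string, port uint16, err error) {
	h, p, err := net.SplitHostPort(addr)
	if err != nil {
		return "", 0, fmt.Errorf("address %q: %w", addr, err)
	}
	if h == "" {
		return "", 0, fmt.Errorf("address %q: missing host", addr)
	}
	n, err := strconv.ParseUint(p, 10, 16)
	if err != nil || n == 0 {
		return "", 0, fmt.Errorf("address %q: invalid port %q", addr, p)
	}
	return h, uint16(n), nil
}

func main() {
	for _, a := range []string{"127.0.0.1:4499", "10.115.3.55:80", ":80", "10.115.3.55"} {
		host, port, err := splitAddress(a)
		fmt.Println(host, port, err)
	}
}
```

A check like this only catches malformed addresses. It would not catch the situation in the log above, where the address is well-formed but the destination cannot be reached.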

cre4ture commented 2 months ago

It's sad that the unsafe_routes didn't work. I have also never used this feature so far. I can't detect anything wrong in what you did.

Meanwhile I reduced the log level for the logs that report the connection termination. It seems that some tools do not properly close the TCP/IP connection, e.g. they close the receiver side while the sender is still sending data. That's a bit weird. But as it apparently doesn't lead to any practical issues, I decided to ignore it for now.

cre4ture commented 2 months ago

Update regarding the "unsafe_routes": I tested it myself and it also didn't work for me. I then checked the source code in more detail. I came to the conclusion that the mechanism that installs the foreign routes is not active for the "tun.user=true" case. Only the "real" tun, and only on Linux, seems to support this. I will spend time to figure out if it would be possible to make the unsafe_routes also available for the user tun. But this should be handled in a separate PR, I guess.
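
The route lookup that appears to be missing for the user tun can be pictured as a longest-prefix match from destination address to the nebula IP of the routing host. The sketch below is not nebula's implementation; `unsafeRoute`, `mustRoute` and `nextHop` are hypothetical names, and the addresses are taken from the config posted earlier in this thread. Without such a table, a destination like 10.115.3.55 is treated as a nebula IP itself, which seems consistent with the `vpnIp=10.115.3.55` handshake timeout in the log.

```go
package main

import (
	"fmt"
	"net/netip"
)

// unsafeRoute mirrors one entry of tun.unsafe_routes.
type unsafeRoute struct {
	Route netip.Prefix // network reachable behind a nebula host
	Via   netip.Addr   // nebula IP of the host that routes into it
}

// mustRoute builds a route from strings and panics on malformed input.
func mustRoute(route, via string) unsafeRoute {
	return unsafeRoute{Route: netip.MustParsePrefix(route), Via: netip.MustParseAddr(via)}
}

// nextHop returns the nebula IP that traffic for dst should be sent to:
// the Via of the most specific matching route, or dst itself if none match.
func nextHop(routes []unsafeRoute, dst string) (string, error) {
	addr, err := netip.ParseAddr(dst)
	if err != nil {
		return "", err
	}
	hop, best := addr, -1
	for _, r := range routes {
		if r.Route.Contains(addr) && r.Route.Bits() > best {
			hop, best = r.Via, r.Route.Bits()
		}
	}
	return hop.String(), nil
}

func main() {
	routes := []unsafeRoute{mustRoute("10.115.2.0/23", "100.104.2.2")}
	for _, dst := range []string{"10.115.3.55", "100.104.8.2"} {
		hop, err := nextHop(routes, dst)
		fmt.Println(dst, "->", hop, err)
	}
}
```

With the route in place, 10.115.3.55 resolves to the routing host 100.104.2.2, while 100.104.8.2 is outside the routed subnet and is contacted directly.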

sybrensa commented 2 months ago

> Update regarding the "unsafe_routes": I tested it myself and it also didn't work for me. I then checked the source code in more detail. I came to the conclusion that the mechanism that installs the foreign routes is not active for the "tun.user=true" case. Only the "real" tun, and only on Linux, seems to support this. I will spend time to figure out if it would be possible to make the unsafe_routes also available for the user tun. But this should be handled in a separate PR, I guess.

Ah, that explains it. In theory you'd say that if Nebula allows you to route packets over the network, it should be able to do the same with the user tun, except using the local port binding instead of the route. The principle doesn't really change. But perhaps you're right about putting it in a separate PR.

cre4ture commented 2 months ago

@sybrensa I have good news. I got the unsafe_routes running on the user tun. For this, I took the code that I found for the real tun and added it to the user tun as well. I haven't pushed this to any PR yet. There is only this feature branch: https://github.com/cre4ture/nebula/tree/feature/enable_unsafe_routes_for_user_tun

But be warned: Routing is a topic that I never had to deal with directly before. I'm not fully sure if there might be side-effects.

Let me know if this also works for you.

sybrensa commented 2 months ago

@cre4ture I've compiled the branch you mentioned, but haven't been able to get it to work. To be sure it wasn't something I missed, I made sure the regular unsafe_routes are working like they should. Now when I add the config as I mentioned 3 days ago, I don't get a response from local port 4499.

In short: it works when I re-enable the TUN device and use the route as specified in unsafe_routes, it doesn't work when I use tun.user=true and set the outgoing port_forwarding remote address to a device in the unsafe route range.

Can you maybe share the config you used?

cre4ture commented 2 months ago

> Can you maybe share the config you used?

I followed the instructions from this page: https://nebula.defined.net/docs/guides/unsafe_routes/ I assume that you did the same.

I use my Raspberry Pi (pinas), which is connected to my home network (192.168.178.41/24), as the "Linux host on LAN that will handle routing":

nebula-cert sign -name 'pinas' -ip '192.168.100.41/24' -subnets '192.168.178.0/24'

I followed steps 3, 4 and 5 to prepare my "pinas" Linux host.

On my laptop, I added this to the "tun" section:

tun:
  [...]
  unsafe_routes:
    - route: 192.168.178.0/24
      via: 192.168.100.41

And I put this into the port_forwarding section (note the changed terminology in the new version!), which forwards a local port to my QNAP NAS web interface (192.168.178.23:4443):

port_forwarding:
  [...]
  outgoing:
    tcp:
    # format of local and remote address: <host/ip>:<port>
    - local_address: 127.0.0.1:4443
      remote_address: 192.168.178.23:4443

Then I disable the home WLAN on my laptop and my mobile phone, and connect my laptop to the mobile phone using the personal hotspot feature. This way I force my laptop into a foreign network (mobile internet connection).

I do a cross-check ensuring that I can't ping my QNAP-NAS:

uli@hp13-ulix:~$ ping 192.168.178.23
PING 192.168.178.23 (192.168.178.23) 56(84) bytes of data.
^C
--- 192.168.178.23 ping statistics ---
7 packets transmitted, 0 received, 100% packet loss, time 6137ms

Then I open the browser and enter this: https://localhost:4443 and after approval of the self-signed certificate it shows me the web-interface of my QNAP NAS.

I hope this helps. If I need to add something, please tell me.

cre4ture commented 2 months ago

By the way, I think there are more ways to achieve behaviour similar to "unsafe_routes" by doing port forwarding with other tools. Here is a list: https://serverfault.com/questions/252150/port-forwarding-on-linux-without-iptables If one uses one of these on the routing host instead of the nebula foreign routes, one can also access ports outside of the nebula network.

cre4ture commented 1 month ago

@sybrensa I made some changes to the PR (#1179): performance improvements and fixes for review findings from another non-maintainer tester like you. If you have time, you could do another testing round. I would be glad. :-)

sybrensa commented 1 month ago

@cre4ture sorry to keep you waiting, I was away this weekend. I've recompiled the binaries from the feature/try_with_gvisor_stack branch and I'm still running into the same problem as last time. It's probably something I'm doing wrong, as I'm trying to do the exact same thing you did with your pinas example.

My config looks like this:

tun:
  disabled: true
  dev: nebula1
  drop_local_broadcast: false
  drop_multicast: false
  tx_queue: 500
  mtu: 1300
  unsafe_routes:
    - route: 10.115.2.0/23
      via: 100.104.0.20

port_forwarding:
  outbound:
    - listen_address: 127.0.0.1:3399
      dial_address: 10.115.3.170:80
      protocols: [tcp]

The local port 3399 is bound, but accessing it doesn't work and eventually times out. The logs show the following:

DEBU[0129] Closed TCP client connection 127.0.0.1:3399. Err: connect tcp 10.115.3.170:80: operation timed out

When I enable the tun by setting tun.disabled: false, the unsafe route is added as expected and I can access 10.115.3.170 locally without a problem.

Not sure what else I can do to solve this, any ideas?

cre4ture commented 1 month ago

@sybrensa oh.. sorry. There was a misunderstanding on my side. I forgot about this important unsafe_routes use case on your side. I assumed that you wanted to do some regular testing of the "normal" port forwarding features. So one of the reasons why the unsafe_routes didn't work for you with the latest version of the PR is for sure that I didn't include the changes needed for unsafe_routes. BUT, I changed this just now: I added the unsafe_routes fix for the user tun to the PR. I tested again on my side and it works with this extension. Please re-test with the latest changes. Your configuration looks fine to me.

sybrensa commented 1 month ago

@cre4ture thanks, that seems to have done the trick indeed! I'll also test the regular port forwarding, but needed to start somewhere yesterday ;-)

I'll let you know my findings soon!

johnmaguire commented 1 month ago

@aa51513 Please avoid commenting on issues with questions like "any updates?" or "when will this be available?" When you comment on a GitHub issue it pings everyone subscribed to it. Please show your support by adding an emoji reaction (e.g. ❤️ or 👍) on the initial post. We can use emoji reactions to sort issues by popularity.

Any updates will appear in this thread (such as the linked PR #1179 by @cre4ture) which you are now subscribed to as well. In the future you can subscribe on the right-hand side of the page w/o leaving a comment. Thank you!

saket424 commented 8 hours ago

@cre4ture Is it possible to specify a port range for the forwarding instead of individual UDP/TCP ports?

cre4ture commented 3 hours ago

> @cre4ture Is it possible to specify a port range for the forwarding instead of individual UDP/TCP ports?

The implementation prepared in my PR #1179 does not allow specification of a port range for forwarding.

Depending on how this is to be implemented, it might take more or less additional effort. E.g. currently we have a pair of goroutines for each port forwarding. This might not be very efficient if, in an extreme case, one forwards 1000 or even more ports at once.

But if we talk about a range of ~50 or so, this might be OK performance-wise. Then it could be done rather quickly. Disclaimer: I do not yet have much experience with Go. Maybe I'm wrong here.
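
One simple way to picture range support is to expand a range into individual forwarding entries before the existing per-port logic runs. The sketch below is purely hypothetical: the PR has no such option, and the `"8000-8003"` syntax, the `forwarding` type and the functions `expandPortRange` and `expandForwardings` are assumptions made for illustration.

```go
package main

import (
	"fmt"
	"net"
	"strconv"
	"strings"
)

// forwarding is one listen/dial pair, as in a single port forwarding entry.
type forwarding struct {
	Listen string
	Dial   string
}

// expandPortRange parses "8080" or "8000-8003" into the list of ports.
func expandPortRange(spec string) ([]uint16, error) {
	first, last, isRange := strings.Cut(spec, "-")
	if !isRange {
		last = first
	}
	lo, err := strconv.ParseUint(strings.TrimSpace(first), 10, 16)
	if err != nil {
		return nil, fmt.Errorf("invalid port %q: %w", first, err)
	}
	hi, err := strconv.ParseUint(strings.TrimSpace(last), 10, 16)
	if err != nil {
		return nil, fmt.Errorf("invalid port %q: %w", last, err)
	}
	if lo == 0 || lo > hi {
		return nil, fmt.Errorf("invalid port range %q", spec)
	}
	ports := make([]uint16, 0, hi-lo+1)
	for p := lo; p <= hi; p++ { // uint64 counter, so 65535 cannot overflow
		ports = append(ports, uint16(p))
	}
	return ports, nil
}

// expandForwardings maps every port in the range 1:1 from listenHost to dialHost.
func expandForwardings(listenHost, dialHost, ports string) ([]forwarding, error) {
	list, err := expandPortRange(ports)
	if err != nil {
		return nil, err
	}
	out := make([]forwarding, 0, len(list))
	for _, p := range list {
		port := strconv.Itoa(int(p))
		out = append(out, forwarding{
			Listen: net.JoinHostPort(listenHost, port),
			Dial:   net.JoinHostPort(dialHost, port),
		})
	}
	return out, nil
}

func main() {
	fwds, err := expandForwardings("127.0.0.1", "192.168.100.92", "8000-8003")
	fmt.Println(len(fwds), err)
	for _, f := range fwds {
		fmt.Println(f.Listen, "->", f.Dial)
	}
}
```

Expanding up front keeps the forwarding code unchanged, but each resulting entry still gets its own listener and goroutines, so the efficiency concern for very large ranges remains.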