slackhq / nebula

A scalable overlay networking tool with a focus on performance, simplicity and security
MIT License

Synology: runs, but no incoming or outgoing connections (route does not exist) #256

Open jdk opened 4 years ago

jdk commented 4 years ago

I can successfully run the app on synology hardware without any issue. I get some log output, but no errors, even when debug level logging is enabled.

I can see the nebula interface is up in ifconfig. I usually see 500 Bytes of RX packets, but nothing on the TX side.

nebula1   Link encap:UNSPEC  HWaddr 00-00-00-00-00-00-00-00-00-00-00-00-00-00-00-00  
          inet addr:10.1.18.6  P-t-P:10.1.18.6  Mask:255.255.255.0
          UP POINTOPOINT RUNNING NOARP MULTICAST  MTU:1300  Metric:1
          RX packets:11 errors:0 dropped:0 overruns:0 frame:0
          TX packets:0 errors:0 dropped:0 overruns:0 carrier:0
          collisions:0 txqueuelen:500 
          RX bytes:692 (692.0 B)  TX bytes:0 (0.0 B)

There are no firewall rules. Here is that portion of the configuration:

# since everything is default deny, all rules you
# actually SPECIFY here are allow rules.
#
  outbound:
    - port: any
      proto: any
      host: any

  inbound:
    - port: any
      proto: any
      host: any

I cannot ssh into the Synology using the 10.1.18.6 address, nor can I reach a 10.1.18.x address from the Synology. I can, however, do both using the non-nebula IP addresses.

Let me know what else I can provide that would be helpful.

rawdigits commented 4 years ago

I'm running nebula on a synology, but don't recall any specific issues. Can you share the log output you're seeing?

rawdigits commented 4 years ago

ahhh i found something i'm doing in my startup script that you should try, which is adding the route myself. give this a shot:

route add -net 10.1.18.0/24 dev nebula1
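
Before adding the route by hand, it can help to confirm it is really missing. The following is a minimal sketch, not part of nebula: `has_route` is a hypothetical helper that reads routing-table text on stdin, and the usage line assumes iproute2's `ip` is available on the device.

```shell
#!/bin/bash
# has_route SUBNET DEV
# Succeeds if the routing-table text on stdin contains a line whose
# destination is SUBNET and which names DEV after a "dev" keyword.
has_route() {
  local subnet="$1" dev="$2"
  awk -v s="$subnet" -v d="$dev" '
    $1 == s {
      for (i = 2; i < NF; i++)
        if ($i == "dev" && $(i + 1) == d) found = 1
    }
    END { exit (found ? 0 : 1) }'
}

# Usage on a live system (assumes `ip` exists; run as root):
#   ip route show | has_route 10.1.18.0/24 nebula1 \
#     || route add -net 10.1.18.0/24 dev nebula1
```

Reading from stdin keeps the check independent of which tool prints the table, so it can be tried against saved output first.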

jdk commented 4 years ago

That did the trick. I had a hell of a time trying to get this to start on boot, though. I tried to do it cleanly and it wouldn't budge: nebula would report a start without creating the interface device, the route would be added before the interface was up, or the route would be created and then lost a second later.

So I brute force it. I start nebula, wait five seconds, see if the interface is there, repeat. Then I try to create the route, wait five seconds, see if it's still there, repeat. I have timeouts on each. I'll post this hoping it's useful.
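
The "act, wait, check, repeat, with a timeout" pattern described above can be sketched as one small helper. This is an illustration under assumptions, not the posted script: `retry_until` and the function names in the usage comment are hypothetical.

```shell
#!/bin/bash
# retry_until MAX_TRIES DELAY_SECONDS CHECK ACTION
# Runs ACTION, waits DELAY_SECONDS, then runs CHECK. Repeats until CHECK
# succeeds (returns 0) or MAX_TRIES attempts have been used (returns 1).
retry_until() {
  local max_tries="$1" delay="$2" check="$3" action="$4"
  local attempt
  for ((attempt = 1; attempt <= max_tries; attempt++)); do
    "$action"
    sleep "$delay"
    if "$check"; then
      return 0
    fi
  done
  return 1
}

# Usage (add_route and route_present are hypothetical functions you define):
#   add_route()     { route add -net 10.1.18.0/24 dev nebula1; }
#   route_present() { route | grep -q nebula1; }
#   retry_until 15 5 route_present add_route || echo 'route never stuck' >&2
```

Checking only after the delay matters here, because the route can appear and then vanish a second later.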

jdk commented 4 years ago

Again, I don't like to brute force anything, but for some reason, every time I tried to do a proper conf file or init script, as per Synology's documentation, one part of the script would fail erroneously, miss a dependency, give a false start, etc. I got a lot more failures when it ran at boot than when I ran the same script after boot. I tried a number of different ways, and gave up. This works consistently, so I hope it'll save someone else from losing time until there is an official package.

Go to Synology's web portal, Control Panel, Task Scheduler.

Create a new Triggered Task, User-Defined Script.

Set the user to root. Set the Event to "Boot Up". The task should run the following command:

/usr/local/etc/rc.d/nebula.sh start

Create another Triggered Task, User-Defined Script. Set the user to root. Set the Event to "Shutdown". The task should run the following command:

/usr/local/etc/rc.d/nebula.sh stop

Now create the script and put it in /usr/local/etc/rc.d/nebula.sh

#!/bin/bash

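# Command that launches nebula: the binary plus its -config flag.
# It is expanded unquoted below, so the paths must not contain spaces.
SCRIPT="PATH_TO_NEBULA -config PATH_TO_CONFIG" # example - /volume1/nebula/nebula -config /volume1/nebula/config.yml
SUBNET="10.1.0.0/24" # set this to your nebula network

PIDFILE="PATH_TO_PID" # example - /volume1/nebula/nebula.pid
LOGFILE="PATH_TO_LOG" # example - /volume1/nebula/nebula.log

status() {
  # Running means the PID file exists and its process is still alive.
  if [ -f "$PIDFILE" ] && kill -0 "$(cat "$PIDFILE")" 2>/dev/null; then
    echo 'Service running' >&2
    return 0
  fi
  echo 'Service not running' >&2
  return 1
}

start() {
  if [ -f "$PIDFILE" ] && kill -0 "$(cat "$PIDFILE")" 2>/dev/null; then
    echo 'Service already running' >&2
    return 1
  fi
  printf 'Starting nebula service...' >&2
  $SCRIPT &> "$LOGFILE" & echo $! > "$PIDFILE"
  sleep 5

  # Relaunch every 5 seconds until the nebula1 interface shows up (120 tries max).
  NEXT_WAIT_TIME=0
  until [ "$NEXT_WAIT_TIME" -eq 120 ] || ifconfig | grep -q nebula1; do
    printf 'Failed\n'
    printf 'Starting nebula service...' >&2
    $SCRIPT &> "$LOGFILE" & echo $! > "$PIDFILE"
    sleep 5
    ((NEXT_WAIT_TIME++))
  done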

  if [ "$NEXT_WAIT_TIME" -ge "120" ]; then
    printf "Failed\n"
    exit 1
  else
    printf "Success!\n"
  fi

  printf "Adding route..."
  route add -net "$SUBNET" dev nebula1
  sleep 5

  NEXT_WAIT_TIME=0
  until [ $NEXT_WAIT_TIME -eq 15 ] || [ ! -z "$(route | grep nebula1)" ]; do
    printf "Failed\n"
    printf "Adding route..."
    route add -net "$SUBNET" dev nebula1
    sleep 5
    ((NEXT_WAIT_TIME++))
  done

  if [ "$NEXT_WAIT_TIME" -ge "15" ]; then
    printf "Failed\n"
    exit 1
  else
    printf "Success!\n"
  fi
}

stop() {
  route del -net "$SUBNET" dev nebula1
  if [ ! -f "$PIDFILE" ] || ! kill -0 $(cat "$PIDFILE"); then
    echo 'Service not running' >&2
    return 1
  fi
  echo 'Stopping nebula service' >&2
  kill -15 $(cat "$PIDFILE") && rm -f "$PIDFILE"
  echo 'Service stopped' >&2
}

case "$1" in
  start)
    start
    ;;
  stop)
    stop
    ;;
  status)
    status
    ;;
  restart)
    stop
    start
    ;;
  *)
    echo "Usage: $0 {start|stop|status|restart}"
    ;;
esac

Make sure to set the script as executable:

chmod +x /usr/local/etc/rc.d/nebula.sh

rawdigits commented 3 years ago

Any chance you would want to add this snippet of code into a PR we can put into the 'examples' section of the repo? I just got a new synology and plan to use your script going forward.

jdk commented 3 years ago

Yeah, definitely. I’ll do that today.

rawdigits commented 3 years ago

My suggestion would be to make an examples/synology directory, and add the script there. Readme.md optional, but may be useful to note setting executable bit and notes on why we have to do the weird route creation brute force thing.

jdk commented 3 years ago

welp, 25 days passed like it was an hour. I'm so sorry about that. #357 was created to add an example and instructions.

adamphetamine commented 3 years ago

Hi @jdk, This script requires setting a path for a log file, but when I set it to my /nebula folder I get 'no such path or directory'. Am I setting where I want the log file to be (in which case it's a permissions issue), or do I need to specify where the log exists? (In which case I have no idea and need help)

jdk commented 3 years ago

Can you post that section of your config? My guess is you have a space, so put it all in quotes.

adamphetamine commented 3 years ago

Hi @jdk, thanks for the quick response. I found it. I needed to make the lines read:

PIDFILE=/var/run/nebula.pid # example - /volume1/nebula/nebula.pid
LOGFILE=/var/log/nebula.log # example - /volume1/nebula/nebula.log

Nebula is still failing to start but that's a new issue for me to figure out, thanks! Need to look at the logs!
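
For anyone hitting the same "no such file or directory" message, a pre-flight check on the PID and log paths can narrow it down. This is a rough sketch under assumptions: `check_parent_dir` is a hypothetical helper, not something the posted script contains.

```shell
#!/bin/bash
# check_parent_dir FILE
# Succeeds if the directory that would hold FILE exists and is writable.
# Prints the reason to stderr and returns 1 otherwise.
check_parent_dir() {
  local dir
  dir="$(dirname -- "$1")"
  if [ ! -d "$dir" ]; then
    echo "missing directory: $dir" >&2
    return 1
  fi
  if [ ! -w "$dir" ]; then
    echo "directory not writable: $dir" >&2
    return 1
  fi
}

# Usage, with the paths from the comment above:
#   check_parent_dir /var/run/nebula.pid && check_parent_dir /var/log/nebula.log
```

The shell creates the log file itself on redirect, so only the parent directory has to exist beforehand.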

johnmaguire commented 1 year ago

Copying from dupe #484:

I'm running nebula in a Docker container on a Synology NAS, and there's an issue where the nebula route shows up in ip route as expected, but it disappears permanently about 1 second after the container is started, though nebula continues running. Because the route is gone, I can't connect to other nebula IPs unless I manually – or programmatically – add the route back.

I used Delve to step through nebula.(*Interface).activate(), and the issue still happens. However, strangely enough, if I step slowly through nebula.Tun.Activate() in tun_linux.go, the route goes up at line 253, the MTU is set at line 276, and then the route never disappears. More specifically, the route disappears if I set a breakpoint at /build/tun_linux.go:254 and continue, but not if I set it at /build/tun_linux.go:253.

Adding time.Sleep(2000*time.Millisecond) right before line 253 completely fixes the problem, and the route stays up consistently (but not if I use 1500 ms).

I'm curious if anyone else has encountered this or knows why/how it could happen.

I'm running:

Synology DSM 6.2.4-25556
Synology's Docker package, version 18.09.0-0519
This Dockerfile
This docker-compose.yaml

Partial config.yaml:

listen:
  host: "[::]"
  port: 0
punchy: true
punch_back: true
tun:
  dev: nebula1
  drop_local_broadcast: false
  drop_multicast: false
  tx_queue: 500
  mtu: 1300
  routes:

Edit: I see this is related to #256 - however, I wanted to run nebula as a static binary from Docker's scratch image, so I needed a binary fix rather than a script fix. Perhaps others may find this useful as well.
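
Where a script fix is acceptable, adding the route back programmatically could look like the sketch below. It is an illustration under assumptions, not a tested fix for the container case: the function names are hypothetical, `SUBNET` is taken from the original report, and it assumes iproute2's `ip` and root privileges.

```shell
#!/bin/bash
SUBNET="10.1.18.0/24"  # assumed: the nebula network from the original report
DEV="nebula1"

# Kept as separate functions so the logic can be exercised with stand-ins
# instead of the real routing table.
list_routes() { ip route show dev "$DEV"; }
add_route()   { ip route add "$SUBNET" dev "$DEV"; }

# ensure_route: does nothing if SUBNET is already routed via DEV;
# otherwise adds the route and prints "added".
ensure_route() {
  if list_routes 2>/dev/null |
     awk -v s="$SUBNET" '$1 == s { f = 1 } END { exit (f ? 0 : 1) }'; then
    return 0
  fi
  add_route && echo "added"
}

# Watchdog loop (left commented out; run as root next to nebula):
#   while sleep 5; do ensure_route; done
```

Polling every few seconds is crude, but it covers the case where the route vanishes about a second after start.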

alkahan commented 1 year ago

I have partially fixed the problem by excluding the eth0 interface:

  local_allow_list:
    # Example to block tun0 and all docker interfaces.
    interfaces:
      'br-.*': false
      'docker.*': false
      'eth0': false

It does not work all the time: if I stop and start again quickly, the route does not come up.