bigbluebutton / bbb-install

BASH script to install BigBlueButton in 30 minutes.
GNU Lesser General Public License v3.0

Hairpin NAT not working from other hosts behind same nat - cause: public ip on lo interface #133

Open pnks opened 4 years ago

pnks commented 4 years ago

I just spent quite a few days debugging a setup where the bbb-server I set up with the install script sits behind a NAT that has the correct ports forwarded (plus ssh). Before installing BBB I could ssh to the host using the public IP address from within my NAT network (hairpin NAT). After installing BBB I could no longer reach the server from within my network, even though hairpin NAT had always worked fine before.

After a re-install I finally discovered the cause: by default, the install script sets up a dummy interface for the public IP when the server is set up behind a NAT, as described in https://docs.bigbluebutton.org/2.2/configure-firewall.html#configure-a-dummy-nic-if-required

Unfortunately, this prevents the server from responding to requests from other hosts using hairpin NAT.
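A quick way to confirm this is the cause is to check whether the public IP is bound to lo. The following is only a sketch: `lists_addr` is a hypothetical helper name, and 203.0.113.10 is a placeholder address from the documentation range that you would replace with your own public IP.

```shell
#!/usr/bin/env bash
# PUBLIC_IP is a placeholder (assumption); substitute your server's public address.
PUBLIC_IP="${PUBLIC_IP:-203.0.113.10}"

# Hypothetical helper: reads `ip -o -4 addr show` output on stdin and
# succeeds if the given address appears in the address/prefix column.
lists_addr() {
  local addr="$1"
  awk -v a="$addr" '{ split($4, p, "/"); if (p[1] == a) found = 1 } END { exit !found }'
}

if command -v ip >/dev/null 2>&1 && ip -o -4 addr show dev lo | lists_addr "$PUBLIC_IP"; then
  echo "$PUBLIC_IP is bound to lo (dummy NIC present)"
else
  echo "$PUBLIC_IP is not bound to lo"
fi
```

If the address shows up on lo, the host answers for the public IP locally, which matches the behaviour described above.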

In my opinion this should be optional, and not forced for all NAT'ed servers, since hairpin NAT is supported by many routers.

Small gotcha: the freeswitch server is set up to listen for requests on this IP. I had to change the interface it listens on.

rottaran commented 4 years ago

According to https://docs.bigbluebutton.org/2.2/configure-firewall.html, the dummy NIC seems to be a workaround for NATs that don't provide hairpinning. I have this problem, for example, with lxc/libvirt based containers on CentOS 8.

Unfortunately, just changing the binding to the internal IP in freeswitch was not sufficient in my case. I did the following in order to get rid of the dummy NIC on loopback. Discussion of the reasons below.

ip addr del EXTERNALIP/32 dev lo

sed -i 's*https://EXTERNALIP:7443*https://INTERNALIP:7443*g' /etc/bigbluebutton/nginx/sip.nginx

sed -i '/<\/network-lists>/i \
    <list name="localnet.auto" default="allow"> \
      <node type="deny" cidr="INTERNALIP/32"/> \
    </list>' /opt/freeswitch/conf/autoload_configs/acl.conf.xml

sed -i '/param name="ws-binding"/ s/":5066"/"127.0.0.1:5066"/' /opt/freeswitch/conf/sip_profiles/external.xml 

sed -i '/param name="wss-binding"/ s/EXTERNALIP/INTERNALIP/' /opt/freeswitch/conf/sip_profiles/external.xml 

sed -i '/freeswitch:/!b;n;s/EXTERNALIP/127.0.0.1/' /usr/local/bigbluebutton/bbb-webrtc-sfu/config/default.yml 

sed -i 's/server_name EXTERNALNAME;/server_name INTERNALIP EXTERNALNAME;/g' /etc/nginx/sites-available/bigbluebutton

sed -i 's/;externalAddress=10.20.30.40/externalAddress=EXTERNALIP/g' /etc/kurento/modules/kurento/WebRtcEndpoint.conf.ini
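Since these commands edit several config files in place, it may be worth copying them first. This is a minimal sketch, not part of the install script: `backup_files` is a hypothetical helper, and the `.bak-<timestamp>` suffix is an arbitrary choice. The paths are the ones touched by the commands above.

```shell
#!/usr/bin/env bash
# Hypothetical helper: copies each existing file to <file>.bak-<suffix>
# and reports files that are missing instead of failing.
backup_files() {
  local suffix="$1"; shift
  local f
  for f in "$@"; do
    if [ -f "$f" ]; then
      cp -p -- "$f" "$f.bak-$suffix"
      echo "backed up $f"
    else
      echo "skipped $f (not found)"
    fi
  done
}

backup_files "$(date +%Y%m%d%H%M%S)" \
  /etc/bigbluebutton/nginx/sip.nginx \
  /opt/freeswitch/conf/autoload_configs/acl.conf.xml \
  /opt/freeswitch/conf/sip_profiles/external.xml \
  /usr/local/bigbluebutton/bbb-webrtc-sfu/config/default.yml \
  /etc/nginx/sites-available/bigbluebutton \
  /etc/kurento/modules/kurento/WebRtcEndpoint.conf.ini
```

Run it as a user that can write to those directories (typically root).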

The connection from sip.nginx to the freeswitch websocket has to go through encryption, that is wss on port 7443. An encrypted websocket connection that is terminated at nginx cannot be forwarded to the unencrypted websocket port 5066 of freeswitch because freeswitch will notice that the encryption was present at client side but is missing at server side and reject the connection.

freeswitch tells the client which IP address to use for the connection from client to server. This is the external_rtp_ip in vars.xml. However, for connections that are set up via the websocket port, the IP address of the port is taken into account to differentiate between internal and external connections. Only the external connections get external_rtp_ip. The connection from nginx to the local IP looks internal and, hence, freeswitch tells the client the wrong server IP. The solution is to exclude the server's internal IP address from the local-net ACL.
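To see what the ACL edit produces without touching the real acl.conf.xml, the same sed insertion can be tried on a small stand-in document. This is only a sketch: the sample XML is a minimal assumption (the real file contains more), 10.0.0.5 is a placeholder internal address, and GNU sed is assumed.

```shell
#!/usr/bin/env bash
INTERNALIP="10.0.0.5"   # placeholder internal address (assumption)

# Minimal stand-in for acl.conf.xml; the real file has more content.
sample='<configuration name="acl.conf" description="Network Lists">
  <network-lists>
  </network-lists>
</configuration>'

# Same insertion as in the commands above, but reading from stdin
# instead of editing the file in place.
result="$(printf '%s\n' "$sample" | sed "/<\/network-lists>/i \\
    <list name=\"localnet.auto\" default=\"allow\"> \\
      <node type=\"deny\" cidr=\"${INTERNALIP}/32\"/> \\
    </list>")"

printf '%s\n' "$result"
```

The output should contain a localnet.auto list with a deny node for the internal IP, placed just before the closing network-lists tag.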

Now the connection from the webrtc-sfu is broken, because it uses the websocket at INTERNALIP:5066. Hence, I changed the sfu config to connect via localhost and told freeswitch to bind ws to localhost:5066 instead of just ":5066".
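After restarting the services, the resulting bindings can be checked from the listening sockets. A rough sketch, assuming `ss` from iproute2 is available: `listens_on` is a hypothetical helper, and 10.0.0.5 stands in for INTERNALIP.

```shell
#!/usr/bin/env bash
INTERNALIP="${INTERNALIP:-10.0.0.5}"   # placeholder internal address (assumption)

# Hypothetical helper: reads `ss -ltn` output on stdin and succeeds if
# some LISTEN socket has exactly the given local address:port.
listens_on() {
  local want="$1"
  awk -v w="$want" '$1 == "LISTEN" && $4 == w { found = 1 } END { exit !found }'
}

if command -v ss >/dev/null 2>&1; then
  sockets="$(ss -ltn)"
  # ws should now be on localhost only, wss on the internal IP.
  for addr in "127.0.0.1:5066" "${INTERNALIP}:7443"; do
    if printf '%s\n' "$sockets" | listens_on "$addr"; then
      echo "ok: listening on $addr"
    else
      echo "missing: no listener on $addr"
    fi
  done
fi
```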