meetecho / janus-gateway

Janus WebRTC Server
https://janus.conf.meetecho.com
GNU General Public License v3.0

Janus exits if STUN server can't be reached #987

Closed mzanetti closed 7 years ago

mzanetti commented 7 years ago

If a STUN server is configured, Janus tries to connect to it on startup. If that fails, for instance because the device isn't online just yet, Janus exits with a FATAL error.

This is problematic because the service manager (in our case systemd) keeps respawning Janus while Janus keeps exiting. The result is a start/stop loop that drives up CPU and resource usage and slows down other services.

If the network connection can't be established, Janus should instead wait for a bit and just try to reconnect instead of exiting.

Here's an excerpt of the log when this happens:

Aug 29 10:46:42 loop-box snap[3904]: STUN server to use: numb.viagenie.ca:3478
Aug 29 10:46:42 loop-box snap[3904]: [ERR] [ice.c:janus_ice_set_stun_server:759] Could not resolve numb.viagenie.ca...
Aug 29 10:46:42 loop-box snap[3904]: [FATAL] [janus.c:main:3725] Invalid STUN address numb.viagenie.ca:0

lminiero commented 7 years ago

I'm personally not a big fan of waiting until things exist (which might never happen), this should be up to the service setup and can be done externally. It's a fatal error and not a warning because warnings may not be seen, and then things don't work and you don't know why: a fatal error allows you to check the configuration for mistakes and fix that.
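As an illustration of doing the waiting externally, a pre-start check could look roughly like the sketch below. It is not part of Janus; the function name `wait_for_host` and its parameters are hypothetical, and it only checks that the STUN hostname resolves (the step that fails in the log above), not that the STUN server actually answers.

```python
import socket
import time


def wait_for_host(host, port, attempts=10, delay=3.0,
                  resolve=socket.getaddrinfo, sleep=time.sleep):
    """Return True once `host` resolves, or False after `attempts` failures.

    `resolve` and `sleep` are injectable so the loop can be exercised
    without real DNS lookups or real waiting.
    """
    for attempt in range(1, attempts + 1):
        try:
            resolve(host, port)
            return True
        except OSError as exc:  # socket.gaierror is a subclass of OSError
            print(f"attempt {attempt}/{attempts}: cannot resolve {host}: {exc}")
            # Only wait if another attempt is still to come.
            if attempt < attempts:
                sleep(delay)
    return False
```

A wrapper script could call `wait_for_host("numb.viagenie.ca", 3478)` and start Janus only when it returns True, for example from an `ExecStartPre=` line in the systemd unit. Whether to give up after a fixed number of attempts or to keep retrying indefinitely is a policy choice left to the service setup.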

mzanetti commented 7 years ago

I see your point. But this isn't a configuration error. The configuration is valid in this case, but for some reason internet connectivity is down at the moment. It doesn't seem sensible to me to just crash whenever the internet connection goes down for a moment. Obviously, printing a warning to the log that the STUN connection failed makes sense, so that a configuration error can still be found in the logs. Those have to be read anyway, even when Janus exits with a fatal error.

uasan commented 7 years ago

You just need to set the After= directive correctly in the systemd unit file. https://www.freedesktop.org/software/systemd/man/systemd.unit.html#Before=

mzanetti commented 7 years ago

yeah, I had that thought too... "After=network-online.target" would probably work around this for this particular case. Need to figure out how to do that as we're shipping janus in an Ubuntu Snap package which autogenerates the systemd service, but I'm sure there is a way somehow.

Still just exiting on temporary failures seems wrong to me...

lminiero commented 7 years ago

Closing as there's a way to get around that via Systemd, and we have no immediate plan to change the current behaviour.

jeremyarr commented 6 years ago

@mzanetti did you end up finding a way to specify the systemd "after" parameter in your snap file? We have the exact same use case as you.

sjkummer commented 5 years ago

In times of IoT, we have started using Janus on edge devices. There are use cases where no WAN connectivity is available, but streaming inside the LAN is still desirable. I'd propose a configuration flag to ignore unreachable STUN/TURN servers and allow Janus to run even if a STUN/TURN server is (temporarily) unreachable. @lminiero Would such a PR be welcome?

jotaen4tinypilot commented 1 year ago

Just for the record, since it might be easy to miss: this issue was resolved by https://github.com/meetecho/janus-gateway/pull/1854. There is now an ignore_unreachable_ice_server setting available.