toverainc / willow-inference-server

Open source, local, and self-hosted highly optimized language inference server supporting ASR/STT, TTS, and LLM across WebRTC, REST, and WS
Apache License 2.0
387 stars · 35 forks

Use traefik #4

Closed by kristiankielhofner 1 year ago

kristiankielhofner commented 1 year ago

We need a docker-compose configuration for this project that fronts it with Traefik so we can get a legitimate certificate from Let's Encrypt (LE).
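A rough sketch of what that compose file could look like (everything here is a placeholder: the domain, the email, the WIS image name, and the backend port would all need to be adjusted to the real deployment; the Traefik labels follow the standard v2 Docker-provider + ACME pattern):

```yaml
# Hypothetical sketch only -- not the project's actual compose file.
version: "3.8"
services:
  traefik:
    image: traefik:v2.10
    command:
      - --providers.docker=true
      - --entrypoints.web.address=:80
      - --entrypoints.websecure.address=:443
      - --certificatesresolvers.le.acme.httpchallenge=true
      - --certificatesresolvers.le.acme.httpchallenge.entrypoint=web
      - --certificatesresolvers.le.acme.email=admin@example.com
      - --certificatesresolvers.le.acme.storage=/letsencrypt/acme.json
    ports: ["80:80", "443:443"]
    volumes:
      - /var/run/docker.sock:/var/run/docker.sock:ro
      - ./letsencrypt:/letsencrypt

  wis:
    image: willow-inference-server   # placeholder image name
    labels:
      - traefik.enable=true
      - traefik.http.routers.wis.rule=Host(`wis.example.com`)
      - traefik.http.routers.wis.entrypoints=websecure
      - traefik.http.routers.wis.tls.certresolver=le
      - traefik.http.services.wis.loadbalancer.server.port=19000  # placeholder port
```

This only covers the HTTP/WS side; WebRTC media traffic bypasses Traefik entirely, which is the problem discussed below.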

kristiankielhofner commented 1 year ago

The biggest issue with implementing Traefik is controlling the ephemeral port ranges aiortc uses for media, ICE, etc.:

https://github.com/aiortc/aioice/pull/63

Until that is implemented, we will have to use `--net host` with Docker :(
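As a stopgap, that deployment would look something like this (image name is a placeholder):

```shell
# Host networking: RTP/ICE ephemeral ports are opened directly on the
# host instead of behind Docker's NAT, so media flows without port mapping.
docker run --net host willow-inference-server
```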

Getting a cert from LE with the current approach is virtually impossible, but if we could front with Traefik today we could get a valid cert (absolutely required for SpeechMike support, because HID access needs a secure context). To make matters worse, we will need to pass the real host IP to aiortc so it can generate coherent ICE candidates with the real public IP instead of the Docker-internal IP it sees.
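One hedged workaround for the ICE-candidate problem (this is not an aiortc API, just SDP munging we could do ourselves): rewrite the `a=candidate:` lines in the SDP answer before returning it, swapping the Docker bridge address for the host's public IP. The IPs below are example values.

```python
# Hypothetical sketch: rewrite ICE candidate addresses in an SDP blob.
DOCKER_IP = "172.17.0.2"    # what aiortc sees inside the container (example)
PUBLIC_IP = "203.0.113.10"  # the host's real public IP (example)

def rewrite_candidates(sdp: str, old_ip: str, new_ip: str) -> str:
    """Replace old_ip with new_ip, but only on a=candidate lines."""
    out = []
    for line in sdp.splitlines():
        if line.startswith("a=candidate:"):
            line = line.replace(old_ip, new_ip)
        out.append(line)
    return "\r\n".join(out)

sdp = (
    "v=0\r\n"
    "a=candidate:1 1 udp 2130706431 172.17.0.2 43210 typ host\r\n"
)
print(rewrite_candidates(sdp, DOCKER_IP, PUBLIC_IP))
```

This is fragile (it ignores mDNS candidates, srflx entries, and the `c=` line), but it illustrates the shape of the problem.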

We will need to do this eventually for compatibility with a firewall on our side.

What a mess - stuff like this is why I tried to leave VoIP.

kristiankielhofner commented 1 year ago

I have a good chunk of a solution for this - from an HTTP standpoint I'm going to trial the use of Cloudflare Tunnels. With an eye toward HIPAA, SOC 2, etc., they'll enable all kinds of interesting approaches in that area:

https://infer.tovera.io/rtc

That takes care of SSL (and we may still use Traefik between CF and the API endpoint), but we still have the ephemeral port range issue for media. We can probably limit the range within aiortc, but I'm also curious about solving this with an LD_PRELOAD shim that fakes out the ephemeral port range to a range we specify.
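A sketch of what such a shim could look like (untested against aiortc; the 10000-10099 range and the IPv4-only handling are arbitrary choices for illustration). It intercepts `bind()` via `dlsym(RTLD_NEXT, ...)` and forces any "pick me an ephemeral port" request (port 0) into a fixed range a firewall can allow:

```c
#define _GNU_SOURCE
#include <arpa/inet.h>
#include <dlfcn.h>
#include <netinet/in.h>
#include <stdlib.h>
#include <sys/socket.h>

/* Hypothetical LD_PRELOAD shim: when an application binds with port 0
 * (kernel-chosen ephemeral port), substitute a port from our own range
 * so media traffic stays inside a firewall-friendly window. */

static int remap_port(int requested, int lo, int hi)
{
    if (requested != 0)
        return requested;              /* explicit port: leave it alone */
    return lo + (rand() % (hi - lo + 1));
}

int bind(int fd, const struct sockaddr *addr, socklen_t len)
{
    static int (*real_bind)(int, const struct sockaddr *, socklen_t);
    if (!real_bind)
        real_bind = dlsym(RTLD_NEXT, "bind");

    if (addr->sa_family == AF_INET) {
        struct sockaddr_in a = *(const struct sockaddr_in *)addr;
        a.sin_port = htons(remap_port(ntohs(a.sin_port), 10000, 10099));
        return real_bind(fd, (struct sockaddr *)&a, sizeof(a));
    }
    return real_bind(fd, addr, len);
}
```

A real version would need retry-on-EADDRINUSE and IPv6 handling, and would be loaded with something like `LD_PRELOAD=./portshim.so python server.py`.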

We'll likely need aggressive firewalls all over the place, and I get the feeling such a shim could be useful outside of aiortc. Or maybe not - just an idea.