toverainc / willow-inference-server

Open source, local, and self-hosted highly optimized language inference server supporting ASR/STT, TTS, and LLM across WebRTC, REST, and WS
Apache License 2.0

Unable to connect to WIS RTC demo Locally #61

Closed nikito closed 1 year ago

nikito commented 1 year ago

Hello, I managed to spin up an instance of WIS, and everything appears to be starting up correctly. However, when I try to access the RTC page, it seems to try to make a connection but does not get any further. Here is the log data I can see:

iceConnectionLog disconnected
signalingLog complete
iceConnectionLog checking
localDescription offer
{ "type": "offer", "sdp": "v=0\r\no=- 2102021303406714890 2 IN IP4 127.0.0.1\r\ns=-\r\nt=0 0\r\na=group:BUNDLE 0 1\r\na=extmap-allow-mixed\r\na=msid-semantic: WMS\r\nm=audio 64444 UDP/TLS/RTP/SAVPF 111 63 9 0 8 13 110 126\r\nc=IN IP4 redacted\r\na=rtcp:9 IN IP4 0.0.0.0\r\na=candidate:2850786284 1 udp 2122260223 192.168.17.1 64442 typ host generation 0 network-id 1\r\na=candidate:775075476 1 udp 2122194687 192.168.84.1 64443 typ host generation 0 network-id 3\r\na=candidate:3114883702 1 udp 2122129151 192.168.1.73 64444 typ host generation 0 network-id 2 network-cost 10\r\na=candidate:1655339575 1 udp 1685921535 redacted 64444 typ srflx raddr 192.168.1.73 rport 64444 generation 0 network-id 2 network-cost 10\r\na=candidate:3609487732 1 tcp 1518280447 192.168.17.1 9 typ host tcptype active generation 0 network-id 1\r\na=candidate:1358779404 1 tcp 1518214911 192.168.84.1 9 typ host tcptype active generation 0 network-id 3\r\na=candidate:3345397998 1 tcp 1518149375 192.168.1.73 9 typ host tcptype active generation 0 network-id 2 network-cost 10\r\na=ice-ufrag:3Tc7\r\na=ice-pwd:wcTlKOSBA2XDTkqkNxd/7hJv\r\na=ice-options:trickle\r\na=fingerprint:sha-256 45:DB:A6:1A:5D:72:A9:76:1A:F4:66:84:67:F2:39:A8:F4:D8:0A:C9:98:E1:A2:6F:C5:6B:FC:D2:B2:F8:16:1D\r\na=setup:actpass\r\na=mid:0\r\na=extmap:1 urn:ietf:params:rtp-hdrext:ssrc-audio-level\r\na=extmap:2 http://www.webrtc.org/experiments/rtp-hdrext/abs-send-time\r\na=extmap:3 http://www.ietf.org/id/draft-holmer-rmcat-transport-wide-cc-extensions-01\r\na=extmap:4 urn:ietf:params:rtp-hdrext:sdes:mid\r\na=sendrecv\r\na=msid:- 568268e5-89b2-4660-a7b5-2b7de2060848\r\na=rtcp-mux\r\na=rtpmap:111 opus/48000/2\r\na=rtcp-fb:111 transport-cc\r\na=fmtp:111 minptime=10;useinbandfec=1\r\na=rtpmap:63 red/48000/2\r\na=fmtp:63 111/111\r\na=rtpmap:9 G722/8000\r\na=rtpmap:0 PCMU/8000\r\na=rtpmap:8 PCMA/8000\r\na=rtpmap:13 CN/8000\r\na=rtpmap:110 telephone-event/48000\r\na=rtpmap:126 telephone-event/8000\r\na=ssrc:3526239653 cname:Kwlyo7DLJDeRJm5R\r\na=ssrc:3526239653 msid:- 568268e5-89b2-4660-a7b5-2b7de2060848\r\nm=application 64447 UDP/DTLS/SCTP webrtc-datachannel\r\nc=IN IP4 redacted\r\na=candidate:2850786284 1 udp 2122260223 192.168.17.1 64445 typ host generation 0 network-id 1\r\na=candidate:775075476 1 udp 2122194687 192.168.84.1 64446 typ host generation 0 network-id 3\r\na=candidate:3114883702 1 udp 2122129151 192.168.1.73 64447 typ host generation 0 network-id 2 network-cost 10\r\na=candidate:1655339575 1 udp 1685921535 redacted 64447 typ srflx raddr 192.168.1.73 rport 64447 generation 0 network-id 2 network-cost 10\r\na=candidate:3609487732 1 tcp 1518280447 192.168.17.1 9 typ host tcptype active generation 0 network-id 1\r\na=candidate:1358779404 1 tcp 1518214911 192.168.84.1 9 typ host tcptype active generation 0 network-id 3\r\na=candidate:3345397998 1 tcp 1518149375 192.168.1.73 9 typ host tcptype active generation 0 network-id 2 network-cost 10\r\na=ice-ufrag:3Tc7\r\na=ice-pwd:wcTlKOSBA2XDTkqkNxd/7hJv\r\na=ice-options:trickle\r\na=fingerprint:sha-256 45:DB:A6:1A:5D:72:A9:76:1A:F4:66:84:67:F2:39:A8:F4:D8:0A:C9:98:E1:A2:6F:C5:6B:FC:D2:B2:F8:16:1D\r\na=setup:actpass\r\na=mid:1\r\na=sctp-port:5000\r\na=max-message-size:262144\r\n" }
iceGatheringLog complete
iceGatheringLog gathering
signalingLog new
added track to peer connection

When I open the debug console, I see this response from asr: { "sdp": "v=0\r\no=- 3893853050 3893853050 IN IP4 0.0.0.0\r\ns=-\r\nt=0 0\r\na=group:BUNDLE 0 1\r\na=msid-semantic:WMS *\r\nm=audio 10035 UDP/TLS/RTP/SAVPF 111\r\nc=IN IP4 172.17.0.2\r\na=recvonly\r\na=extmap:1 urn:ietf:params:rtp-hdrext:ssrc-audio-level\r\na=extmap:4 urn:ietf:params:rtp-hdrext:sdes:mid\r\na=mid:0\r\na=msid:36c53f4f-8f1f-458c-a189-9c1747570917 33ffaf28-0aeb-4bec-9886-fdb0d7e452c3\r\na=rtcp:9 IN IP4 0.0.0.0\r\na=rtcp-mux\r\na=ssrc:2402540680 cname:5246c902-d951-4bbd-b3ab-2f4ade706e73\r\na=rtpmap:111 opus/48000/2\r\na=candidate:9333c84bcc1b0bf56713df9036e6b4d9 1 udp 2130706431 172.17.0.2 10035 typ host\r\na=candidate:c58f5770074e5a6227e87732712d9300 1 udp 1694498815 redacted 10035 typ srflx raddr 172.17.0.2 rport 10035\r\na=end-of-candidates\r\na=ice-ufrag:qcEp\r\na=ice-pwd:IQ2ymsMj77ZqsnuWS6As6h\r\na=fingerprint:sha-256 6F:B3:6F:97:62:8A:7B:01:8D:A9:4E:0D:D5:B4:D9:E0:B4:99:97:DF:85:53:BA:B5:AE:07:54:5C:1F:BB:4C:32\r\na=setup:active\r\nm=application 10035 UDP/DTLS/SCTP webrtc-datachannel\r\nc=IN IP4 172.17.0.2\r\na=mid:1\r\na=sctp-port:5000\r\na=max-message-size:65536\r\na=candidate:9333c84bcc1b0bf56713df9036e6b4d9 1 udp 2130706431 172.17.0.2 10035 typ host\r\na=candidate:c58f5770074e5a6227e87732712d9300 1 udp 1694498815 redacted 10035 typ srflx raddr 172.17.0.2 rport 10035\r\na=end-of-candidates\r\na=ice-ufrag:qcEp\r\na=ice-pwd:IQ2ymsMj77ZqsnuWS6As6h\r\na=fingerprint:sha-256 6F:B3:6F:97:62:8A:7B:01:8D:A9:4E:0D:D5:B4:D9:E0:B4:99:97:DF:85:53:BA:B5:AE:07:54:5C:1F:BB:4C:32\r\na=setup:active\r\n", "type": "answer" }

Let me know if any other info would be helpful here, thanks! :)

kristiankielhofner commented 1 year ago

WebRTC can be challenging (I've been dealing with it since pre-standard)...

Are you using Chrome? We have a known negotiation issue with Safari in IPv4+IPv6 scenarios, and Firefox is somewhat known (unfortunately) to be all over the place with WebRTC.

To exclude any fundamental network connectivity issues I'd try Chrome. If it doesn't work there you likely have some network stuff going on we can explore.

nikito commented 1 year ago

I am indeed using Chrome. Also tried incognito mode just in case that'd make a difference, but no good unfortunately. If it helps/matters, I am running this in a Proxmox VM (Ubuntu OS), with the GPU shared directly with the VM. I confirmed via nvidia-smi that the GPU is present and models are loaded, and am not getting any errors on that front. I was thinking maybe it has something to do with the virtual network docker makes, in conjunction with the virtual network adapters of the VM?

I also noticed that when I run the app, it creates an additional virtual network adapter, but it appears to be IPv6-only: [image]

Not sure if that could be related?

kristiankielhofner commented 1 year ago

Not having an IPv4 address looks pretty suspect and is likely to confuse our client, the browser, or WIS as we haven't tested with that specific configuration. I have no idea what it would do.

Also, based on the SDP from WIS it seems WIS is offering UDP 10035. Double and triple check that this port is actually being passed to the docker run command as a forwarded/exposed port. WIS should maintain consistent port management, but clearly something is up here, so that's another thing to check. You need to make sure you have clean IPv4 (optionally IPv6+IPv4) UDP 10000-10050 open to the WIS instance:

https://github.com/toverainc/willow-inference-server/blob/98b31b62420bfa24a60d752e76de6dcec57aa0d3/main.py#L119
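One way to do the port check described above is to pull the candidate ports out of the answer SDP and compare them with the forwarded range. The following is a minimal sketch, not part of WIS: the function names are hypothetical, the 10000-10050 range is taken from the comment above, and the regular expression is a simplified reading of the `a=candidate` line format. It expects the SDP after JSON decoding, so that `\r\n` are real line breaks.

```python
import re

# Assumed forwarded UDP range, taken from the "UDP 10000-10050" mentioned above.
PORT_RANGE = range(10000, 10051)

# Simplified pattern for an SDP "a=candidate" line:
# foundation, component, transport, priority, address, port, "typ", type.
CANDIDATE_RE = re.compile(
    r"a=candidate:\S+ \d+ (?P<transport>\S+) \d+ (?P<addr>\S+) (?P<port>\d+) typ (?P<typ>\S+)"
)


def candidate_ports(sdp: str) -> list[tuple[str, str, int, str]]:
    """Return (transport, address, port, type) for every candidate in the SDP."""
    return [
        (m["transport"].lower(), m["addr"], int(m["port"]), m["typ"])
        for m in CANDIDATE_RE.finditer(sdp)
    ]


def ports_outside_range(sdp: str, allowed: range = PORT_RANGE) -> set[int]:
    """Return the UDP candidate ports that fall outside the forwarded range."""
    return {
        port
        for transport, _addr, port, _typ in candidate_ports(sdp)
        if transport == "udp" and port not in allowed
    }


if __name__ == "__main__":
    # Trimmed-down version of the answer SDP quoted earlier in this thread.
    answer_sdp = (
        "m=audio 10035 UDP/TLS/RTP/SAVPF 111\r\n"
        "a=candidate:9333c84bcc1b0bf56713df9036e6b4d9 1 udp 2130706431 "
        "172.17.0.2 10035 typ host\r\n"
        "a=end-of-candidates\r\n"
    )
    print(candidate_ports(answer_sdp))
    print(ports_outside_range(answer_sdp))
```

An empty set from `ports_outside_range` only says the advertised ports are inside the range; it does not prove that Docker or a firewall actually forwards them.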

Does our hosted example work for you?

nikito commented 1 year ago

I haven't modified any of the run files so I imagine that should be what docker is using. I'll look into the ipv6 piece and see if I can find any leads there. I also have no firewall on this device so shouldn't be blocked in that regard.

And yes, hosted example works fine. :)

nikito commented 1 year ago

So I can't really find anything on the IPv6 address side, and I'm a bit stuck there. Is there a way to turn on more verbose logging perhaps?

kristiankielhofner commented 1 year ago

You can uncomment this line:

https://github.com/toverainc/willow-inference-server/blob/98b31b62420bfa24a60d752e76de6dcec57aa0d3/main.py#L117

That should display much more WebRTC specific debugging information.

nikito commented 1 year ago

Gave that a shot but now get this on bootup: image

nikito commented 1 year ago

Figured that out, was a typo in the line. Needs to be level=logging.DEBUG :) image
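For anyone hitting the same thing: `logging.DEBUG` is the level constant, while `logging.debug` is a function, so the capitalisation matters. Below is a minimal sketch of enabling verbose WebRTC logging from a boolean. It is not the actual WIS code; the function name is hypothetical, and the logger names `aiortc` and `aioice` are assumptions based on those libraries using module-named loggers.

```python
import logging


def enable_webrtc_debug(enabled: bool) -> None:
    """Turn verbose WebRTC logging on or off (hypothetical helper)."""
    # Keep the root logger at INFO so unrelated libraries stay quiet.
    logging.basicConfig(level=logging.INFO)
    level = logging.DEBUG if enabled else logging.INFO
    # Assumed logger names; child loggers such as "aioice.ice" inherit
    # the level set on their parent.
    for name in ("aiortc", "aioice"):
        logging.getLogger(name).setLevel(level)


if __name__ == "__main__":
    enable_webrtc_debug(True)
    logging.getLogger("aiortc").debug("WebRTC debug logging is on")
```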

kristiankielhofner commented 1 year ago

Thanks for catching that, we updated our logging config a while back and haven't had to use aiortc debugging in a long time. I just pushed a fix and made it a bool from settings.py.

Using WebRTC over completely local networks/IPs is possible but is kind of a strange use case (even though it makes sense given WIS). What's likely happening here is our TypeScript client is configured to use STUN by default to discover your public IP address and add it as an ICE candidate, as that's been the typical usage scenario:

https://github.com/toverainc/willow-ts-client/blob/78edd1651809a522f4217a8b4606cf5da0ef5c63/src/index.ts#LL50C9-L50C9

I didn't write our TS client so I'd have to get up to speed on it to properly toggle/configure STUN support. You can clone, hack it up, and build, but other than that I'll have to check with @richardklafter to see how we should go about handling this.
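To see what STUN contributes, it can help to list each ICE candidate with its type and whether its address is private. The sketch below is an illustration only, with hypothetical names; `203.0.113.10` is a documentation address standing in for the redacted public IP. In the answer SDP quoted earlier, the only host candidate is a Docker bridge address (172.17.0.2), which other machines on the LAN usually cannot route to, so the STUN-derived `srflx` candidate is probably the only one a LAN browser could try.

```python
import ipaddress
import re

# RFC 1918 private IPv4 ranges, checked explicitly because
# ipaddress.is_private also covers documentation and other reserved ranges.
RFC1918 = [
    ipaddress.ip_network(net)
    for net in ("10.0.0.0/8", "172.16.0.0/12", "192.168.0.0/16")
]

# Simplified pattern capturing the address and type of an "a=candidate" line.
CANDIDATE_RE = re.compile(
    r"a=candidate:\S+ \d+ \S+ \d+ (?P<addr>\S+) \d+ typ (?P<typ>\S+)"
)


def describe_candidates(sdp: str) -> list[tuple[str, str, str]]:
    """Return (address, candidate type, scope) for each IPv4 candidate."""
    described = []
    for m in CANDIDATE_RE.finditer(sdp):
        try:
            ip = ipaddress.ip_address(m["addr"])
        except ValueError:
            continue  # skips mDNS names and redacted addresses
        if ip.version != 4:
            continue  # IPv6 candidates are out of scope for this sketch
        scope = "private" if any(ip in net for net in RFC1918) else "public"
        described.append((str(ip), m["typ"], scope))
    return described


if __name__ == "__main__":
    # 203.0.113.10 is a placeholder for the redacted public address.
    sdp = (
        "a=candidate:9333c84bcc1b0bf56713df9036e6b4d9 1 udp 2130706431 "
        "172.17.0.2 10035 typ host\r\n"
        "a=candidate:c58f5770074e5a6227e87732712d9300 1 udp 1694498815 "
        "203.0.113.10 10035 typ srflx raddr 172.17.0.2 rport 10035\r\n"
    )
    for address, typ, scope in describe_candidates(sdp):
        print(f"{typ:6} {scope:8} {address}")
```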

nikito commented 1 year ago

So I dug into this and managed to get it working! After doing some reading up on WebRTC, ICE, and how STUN comes into play, I went into my router and created a NAT hairpin rule to remap any requests to the UDP ports at the WAN interface back to the local IP of the Willow Inference Server instance. Now it works perfectly. :)

kristiankielhofner commented 1 year ago

Excellent, great workaround! I'm going to leave this issue open until I have time to document this.

nikito commented 1 year ago

Just an update, one caveat I missed when I did the above is that I actually exposed my instance to the internet (NAT hairpin is just noticing that I am making a request to my public IP, and does the port forwarding locally instead of trying to resolve the actual external IP and route that way). As such I have since removed the rule as I was starting to see some port scan activity against that server 🤣

kristiankielhofner commented 1 year ago

You can certainly configure your firewall however you like but a couple FYIs:

Dynamically negotiated media ports have been around for ages in the VoIP, WebRTC, etc. world, and standard practice is to open the entire range of the configured ports for the negotiated media stream. We also randomize the port/socket assignment so it's not predictable (or at least less so if your range is small). For production applications we recommend using iptables port forwarding instead of the default userspace Docker proxy:

cat /etc/docker/daemon.json

{
    "userland-proxy": false,
    "iptables": true,
    "runtimes": {
        "nvidia": {
            "path": "nvidia-container-runtime",
            "runtimeArgs": []
        }
    },
    "default-runtime": "nvidia"
}
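A quick way to confirm a host matches the two proxy-related settings above is to load the file and compare them. This is a minimal sketch with hypothetical names; it assumes the file lives at `/etc/docker/daemon.json` as in the `cat` command above, and it only checks `userland-proxy` and `iptables`, not the NVIDIA runtime entries.

```python
import json
from pathlib import Path

# The two proxy-related settings from the daemon.json shown above.
EXPECTED = {"userland-proxy": False, "iptables": True}


def check_daemon_config(text: str) -> list[str]:
    """Return human-readable differences; an empty list means the settings match."""
    config = json.loads(text)
    problems = []
    for key, want in EXPECTED.items():
        if key not in config:
            problems.append(f"{key} is not set (expected {json.dumps(want)})")
        elif config[key] is not want:
            problems.append(
                f"{key} is {json.dumps(config[key])} (expected {json.dumps(want)})"
            )
    return problems


if __name__ == "__main__":
    path = Path("/etc/docker/daemon.json")  # assumed location
    if path.exists():
        for problem in check_daemon_config(path.read_text()) or ["settings match"]:
            print(problem)
    else:
        print(f"{path} not found")
```

Docker only reads this file at startup, so the daemon has to be restarted after editing it.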