Closed TravisBowers closed 2 years ago
Hmm, this issue is really confusing.
Are the web browsers you're using to connect to the demo running on the same machine as the instance?
If not, what's the network layout like? I may need to add a CoTurn container to the demo to allow for it to work in more complex network topologies.
Also, what command are you using to launch the demo? I've tested both with docker-compose up as well as docker compose up. Both worked fine, but I did notice that they return different version numbers.
The versions of the dependencies are below:
Docker version: 20.10.14, build a224086
docker-compose version: 1.27.4
docker compose version: 2.3.3
Nvidia Driver Version: 510.60.02
CUDA Version: 11.6
OS: Ubuntu 21.10
As another data point, I can report that clients running Google Chrome 91.0.4472 (Fedora) and 101.0.4951.54 (Windows) are also getting stuck on "Starting connection to server, please wait".
I've had identical results with docker-compose up and docker compose up.
The demo is running on a headless system on the same subnet and VLAN as the web browser clients that I have tested with. Among the client systems that I've tested with, the Android browser has been the only one to successfully connect to the pixelstreaming application.
buccaneer_demo_chrome_console.log
This is the Chrome console log from a failed connection attempt. These lines seem strange to me. 172.19.0.5 is not a local IP that exists anywhere on my test network.
webRtcPlayer.js:459 ICE candidate: {sdpMid: '0', sdpMLineIndex: 0, candidate: 'candidate:2105326672 1 udp 2122260223 172.19.0.5 39834 typ host generation 0 ufrag 23vP network-id 1'}
webRtcPlayer.js:462 ICE candidate successfully added
webRtcPlayer.js:172 ice connection state change: Event {isTrusted: true, type: 'iceconnectionstatechange', target: RTCPeerConnection, currentTarget: RTCPeerConnection, eventPhase: 2, …}
app.js:1856 <- SS: {"type":"iceCandidate","candidate":{"sdpMid":"0","sdpMLineIndex":0,"candidate":"candidate:872366240 1 tcp 1518280447 172.19.0.5 50605 typ host tcptype passive generation 0 ufrag 23vP network-id 1"}}
webRtcPlayer.js:459 ICE candidate: {sdpMid: '0', sdpMLineIndex: 0, candidate: 'candidate:872366240 1 tcp 1518280447 172.19.0.5 50…type passive generation 0 ufrag 23vP network-id 1'}
webRtcPlayer.js:462 ICE candidate successfully added
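To see at a glance which addresses are being offered, the candidate lines in a saved console log can be reduced to their transport, address, port and type. The following is a minimal shell sketch, not part of the demo: the function name ice_candidates is hypothetical, and it assumes the log was saved as plain text in the format quoted above. Entries that the console truncated with "…" are skipped because their port cannot be recovered.

```shell
# Print "transport address port type" for every complete ICE candidate on stdin.
# Candidate layout:
#   candidate:<foundation> <component> <transport> <priority> <address> <port> typ <type> ...
ice_candidates() {
  grep -oE 'candidate:[0-9]+ [0-9]+ [a-z]+ [0-9]+ [^ ]+ [0-9]+ typ [a-z]+' |
    awk '{ print $3, $5, $6, $8 }' |
    sort -u
}

# Example (file name taken from the attachment mentioned above):
#   ice_candidates < buccaneer_demo_chrome_console.log
```

For the lines quoted above, both complete candidates (one udp, one tcp) are of type host and point at 172.19.0.5.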
I think I can confirm that there isn't anything wrong with the containers themselves. I'm able to start them manually with host networking and connect to the pixelstream with:
docker run --rm --name cirrus -d --network=host tensorworks/buccaneerdemo-cirrus
docker run --rm --name unreal -d --gpus all --network=host --entrypoint "sleep" tensorworks/buccaneerdemo-application infinity
docker exec -it unreal /home/ue4/project/TP4.sh -StatsEmitterURL=http://stats:8000 -EventEmitterURL=http://events:8080 -RenderOffScreen -Unattended -PixelStreamingURL=ws://127.0.0.1:8888
The problem most likely has something to do with the way the Compose file sets up networking. With the Compose deployment running, this is the network configuration on my container host.
docker network ls
NETWORK ID     NAME                DRIVER    SCOPE
328d0567595c   bridge              bridge    local
20f9efb166fc   compose_buccaneer   bridge    local
c6ada529bd84   host                host      local
e690d0c4cd4f   none                null      local
ip -br a
lo                 UNKNOWN  127.0.0.1/8 ::1/128
enp1s0             UP       172.16.0.114/24 fe80::5054:ff:fe5d:b471/64
docker0            DOWN     172.17.0.1/16
br-20f9efb166fc    UP       172.18.0.1/16 fe80::42:fdff:fef8:1c23/64
vethe45cdfb@if86   UP       fe80::a85b:12ff:fe69:f7be/64
veth85c4f05@if88   UP       fe80::7cd0:1aff:fe55:a8ce/64
vethd6dafac@if90   UP       fe80::483f:feff:febf:eb7a/64
vethcea33a6@if92   UP       fe80::888b:b0ff:fee4:decf/64
veth3c9ce83@if94   UP       fe80::682e:d4ff:fe36:e075/64
veth3fa47f0@if96   UP       fe80::c8d2:5bff:feb6:7ecf/64
veth7bbc544@if98   UP       fe80::6c71:ecff:fe1a:d44e/64
vethf3d6cdb@if100  UP       fe80::50e1:dcff:fe23:3ee6/64
ip route
default via 172.16.0.1 dev enp1s0 proto dhcp src 172.16.0.114 metric 100
172.16.0.0/24 dev enp1s0 proto kernel scope link src 172.16.0.114
172.16.0.1 dev enp1s0 proto dhcp scope link src 172.16.0.114 metric 100
172.17.0.0/16 dev docker0 proto kernel scope link src 172.17.0.1 linkdown
172.18.0.0/16 dev br-20f9efb166fc proto kernel scope link src 172.18.0.1
My pixelstream desktop client's IP is 172.16.0.162/24.
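A quick way to check whether the address seen in the ICE candidates sits on any of the subnets listed above is to test it against each CIDR. This is a small POSIX shell sketch under stated assumptions: the helper names ip_to_int and in_cidr are hypothetical, and only dotted-quad IPv4 addresses without leading zeros are handled.

```shell
# Convert a dotted-quad IPv4 address to a single integer.
# The body runs in a subshell so the IFS change does not leak out.
ip_to_int() (
  IFS=.
  set -- $1
  echo $(( ($1 << 24) | ($2 << 16) | ($3 << 8) | $4 ))
)

# in_cidr ADDRESS NETWORK/PREFIX: succeed if ADDRESS lies inside the network.
in_cidr() (
  addr=$(ip_to_int "$1")
  net=$(ip_to_int "${2%/*}")
  bits=${2#*/}
  mask=$(( (4294967295 << (32 - bits)) & 4294967295 ))
  [ $(( addr & mask )) -eq $(( net & mask )) ]
)

# Example:
#   in_cidr 172.19.0.5 172.16.0.0/24 && echo "on the client subnet" || echo "not on the client subnet"
```

With the values from this thread, 172.19.0.5 falls outside both 172.16.0.0/24 (the client's subnet) and 172.18.0.0/16 (the compose_buccaneer bridge as listed), although the console log and the network listing may come from different runs.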
@TravisBowers I've updated the Docker Compose demo to use host networking mode in commit a8df164, since it sounds like the issues you're encountering may be related to the networking setup. Could you please give the new demo a try and see if that resolves the problems?
Thanks @adamrehn, the change to host networking seems to have broken the connection between the unreal container and the cirrus server.
unreal | [2022.06.02-14.05.41:443][580]LogPixelStreamingSS: Error: Failed to connect to SS: Could not initialize connection
unreal | [2022.06.02-14.05.41:444][580]LogPixelStreamingSS: Connecting to SS ws://cirrus:8888
Perhaps the Docker bridge network was providing DNS resolution for container services?
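One way to test that hypothesis is to take the host name from the signalling URL in the log and ask the resolver inside the container about it. Docker's embedded DNS resolves service names on user-defined networks such as the bridge that Compose creates; with host networking, a name like cirrus resolves only if the host itself knows it. The sketch below is shell, with hypothetical helper names url_host and resolves, and it assumes getent is available in the image.

```shell
# Extract the host part of a URL such as ws://cirrus:8888.
# Userinfo and bracketed IPv6 literals are not handled.
url_host() {
  rest=${1#*://}               # drop the scheme
  rest=${rest%%/*}             # drop any path
  printf '%s\n' "${rest%%:*}"  # drop the port
}

# Succeed if the system resolver (hosts file or DNS) knows the name.
resolves() {
  getent hosts "$1" >/dev/null 2>&1
}

# Example, run inside the application container:
#   host=$(url_host ws://cirrus:8888)
#   resolves "$host" && echo "$host resolves" || echo "$host does not resolve"
```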
@TravisBowers ah okay, that looks like Docker Compose is still using the old version of the container image for the demo application. Could you please pull the updated version of the image and try it again:
docker pull tensorworks/buccaneerdemo-application:latest
Thank you, I am able to connect to the demo application now. I think we can consider this issue resolved; however, I am still curious about why a mobile browser was able to connect to the bridge network configuration, but a desktop browser wasn't.
After applying the suggested solution in #1, the example deployment outlined in the docker-compose example appears to run on a GPU container host as expected. However, several Linux desktop browsers are unable to connect successfully to the compose's demo application.
Firefox 100.0 and Chromium 100.0.4896.127
Fails to connect with: Disconnected: "Failed to set remote offer sdp: Failed to set remote video description send parameters." This is somewhat unsurprising. These browsers typically exhibit this behavior on Linux.
Google Chrome (Desktop) 101.0.4951.54
The browser hangs at "Starting connection to server, please wait" indefinitely. The triangular "play" button never appears.
Typically, this browser works well as a pixelstream client. In fact, pixelstreaming works as expected after modifying the docker compose to deploy a different pixelstreaming application. Of course, this breaks the Buccaneer integration because the alternate app does not support Buccaneer yet.
Compose logs when the Chrome browser attempts to connect:
Google Chrome (Android) 101.0.4951.61
Surprisingly, Google Chrome on Android is able to connect to the demo application without any issues. This may rule out some network and server-side issues with Docker, Nvidia drivers, etc.
Compose log when Chrome (Android) connects to the pixelstream:
Other details
Docker version: 20.10.14, build a224086
Docker Compose version: 1.25.0
Nvidia Driver version: 510.60.02
CUDA version: 11.6
OS version: Ubuntu 20.04.4 LTS
It's entirely possible that this issue isn't a Buccaneer issue at all. After all, the Unreal Engine Metrics dashboard provided by the Grafana container is plotting realtime data. The pixelstreaming session for the demo app could be breaking down at a different layer, but it's unclear why this is happening.