geckosio / geckos.io

🦎 Real-time client/server communication over UDP using WebRTC and Node.js http://geckos.io
BSD 3-Clause "New" or "Revised" License

Datachannel connection failing when server running in docker #245

Open arthuro555 opened 1 year ago

arthuro555 commented 1 year ago

I tried out running my app in docker, and while the signaling server works fine (a log from the authorization method is printed), the data channel seems to never open (log in onConnection never printed).

I've been careful to only use one port and publish it in docker

Port configuration

My Dockerfile:

```dockerfile
FROM node:16-slim AS build
WORKDIR /usr/src/app
COPY . /usr/src/app
RUN yarn
RUN yarn build

FROM gcr.io/distroless/nodejs:16
COPY --from=build /usr/src/app/dist/index.js /usr/src/app/index.mjs
COPY --from=build /usr/src/app/node_modules /usr/src/app/node_modules
WORKDIR /usr/src/app
CMD ["index.mjs"]
EXPOSE 6969
EXPOSE 9696/udp
```

My geckos port range:

```js
portRange: { min: 9696, max: 9696 },
```

My command to start the container:

`docker run --rm -p 6969:6969 -p 9696:9696/udp arthuro5555/thnk-relay`

Running it with node directly works. I am guessing this must be a misunderstanding/misconfiguration on my side, but I report this as a bug just in case.

I am getting this issue both on my Windows machine with Docker Desktop and on my Linux server (running through k3s).

yandeu commented 1 year ago

I guess you forgot to expose the port of the Node.js server?

arthuro555 commented 1 year ago

Is there something obviously wrong with the configuration I posted (in the spoiler)?

yandeu commented 1 year ago

Yes, you have to expose the TCP port of the signaling server. Default is 9208.

arthuro555 commented 1 year ago

I used server.listen(6969), sorry for forgetting to mention that. Signaling works without issues, it's when connecting the data channel that the connection seems to be blocked

yandeu commented 1 year ago

Maybe you can't just copy the files from node:16-slim to gcr.io/distroless/nodejs, as the binaries are incompatible?

Try the same image for both steps, maybe 16-bullseye?

arthuro555 commented 1 year ago

Will try 👍

arthuro555 commented 1 year ago

I tried, it didn't change anything :/ I double-checked, and Docker correctly binds to the UDP port too.

arthuro555 commented 1 year ago

I tried disabling multiplexing and adding STUN/TURN servers, to no avail.

yandeu commented 1 year ago

I have just dockerized the simple-chat-app. Maybe it helps: https://github.com/geckosio/simple-chat-app-example/blob/master/Dockerfile

yandeu commented 1 year ago

Actually, it is getting the same error 😡

yandeu commented 1 year ago

I'm using Windows and WSL, maybe if we deploy it, it will work?

yandeu commented 1 year ago

Yes, I made the simple-chat-app-example work with these steps:

server:

local:

arthuro555 commented 1 year ago

Hmm, I had tried deploying it via Kubernetes on my Linux VPS and it had the same problem.

yandeu commented 1 year ago

Do you have the entire 1025-65535/udp range open?

arthuro555 commented 1 year ago

No, just like when running with Docker locally, I specified a single port in the server port range and only opened that one. AFAIK you cannot open a range in Kubernetes 🤔 although I am a novice in Kubernetes and might be wrong.

yandeu commented 1 year ago

You don't have to open all these ports in k8s, but on the server k8s is running on, I believe.

My docker example exposes only 3000/tcp and 10000-10007/udp. But to successfully make a connection (eg with the iceServers) I had to open all udp ports from 1025-65535.
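The port mapping described above can be derived from a single config object, so that the published ports and the geckos `portRange` cannot drift apart. This is a minimal sketch: `dockerPublishFlags` is a hypothetical helper (it is not part of geckos.io or Docker), and it only relies on Docker's `-p` syntax for single ports and port ranges. Judging by this thread, publishing these ports may be necessary but is not always sufficient for the data channel to open.

```javascript
// Hypothetical helper, not part of geckos.io or Docker.
// Builds the `docker run -p ...` arguments for a signaling port (TCP)
// and a geckos-style portRange (UDP).
function dockerPublishFlags({ signalingPort, portRange }) {
  const { min, max } = portRange;
  const ports = [signalingPort, min, max];
  if (!ports.every((p) => Number.isInteger(p) && p > 0 && p <= 65535)) {
    throw new RangeError('ports must be integers between 1 and 65535');
  }
  if (min > max) {
    throw new RangeError('portRange.min must not exceed portRange.max');
  }

  // A single port is published as "9696:9696/udp", a range as
  // "10000-10007:10000-10007/udp".
  const udp = min === max
    ? `${min}:${min}/udp`
    : `${min}-${max}:${min}-${max}/udp`;

  return ['-p', `${signalingPort}:${signalingPort}/tcp`, '-p', udp];
}

// The ports mentioned in the comment above: 3000/tcp and 10000-10007/udp.
const flags = dockerPublishFlags({
  signalingPort: 3000,
  portRange: { min: 10000, max: 10007 },
});
console.log(['docker', 'run', '--rm', ...flags].join(' '));
```

The same `portRange` object can then be passed to the geckos server options, so both sides read from one source.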

arthuro555 commented 1 year ago

I don't have any firewall installed on the server (it's not a production server), and it's a managed VPS, so I'd assume all ports are correctly exposed by default

yandeu commented 1 year ago

Unfortunately, I'm out of ideas :/

reececomo commented 1 year ago

You might want to check out some of the comments about Docker and node-datachannel here → https://github.com/murat-dogan/node-datachannel/issues/53

After some digging, the author responded:

> The problem is about the docker. I guess using host network mode could solve it. Could you please try it? https://docs.docker.com/network/host/

and the commenter closed the issue with:

> I think the IP replacement or the docker parameter to use host network can fix that for now. But it looks like this problem occurs with node-webrtc too. And there is no problem reaching the stun server at all, the ports used to negotiate are in the interval mapped... I believe this is not something specific of this project. Thanks for the help!

I suspect this is not at all an issue with geckos.io/node-datachannel, but if anyone does figure out the best solution it might be good to add it to the docs for the next person 🙌
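The "IP replacement" workaround quoted above is not spelled out in the thread. One plausible reading, assuming the failure comes from host candidates advertising the container's internal bridge address, is to rewrite that address to one the client can reach before the candidate is sent. The sketch below only shows the string rewrite; `rewriteCandidateAddress` and `rewriteSdp` are hypothetical helpers, the IP addresses are placeholders, and where such a rewrite would be hooked into geckos.io or node-datachannel is left open.

```javascript
// Hypothetical helpers, not part of geckos.io or node-datachannel.

// Rewrites the connection address of one ICE candidate line.
// Candidate layout (space separated):
//   candidate:<foundation> <component> <transport> <priority> <address> <port> typ <type> ...
function rewriteCandidateAddress(candidate, internalIp, publicIp) {
  const fields = candidate.split(' ');
  // The address is the fifth field; leave anything else untouched.
  if (fields.length < 8 || fields[4] !== internalIp) return candidate;
  fields[4] = publicIp;
  return fields.join(' ');
}

// Applies the rewrite to every "a=candidate:" line of an SDP blob.
function rewriteSdp(sdp, internalIp, publicIp) {
  return sdp
    .split('\r\n')
    .map((line) =>
      line.startsWith('a=candidate:')
        ? 'a=' + rewriteCandidateAddress(line.slice(2), internalIp, publicIp)
        : line
    )
    .join('\r\n');
}

// Placeholder addresses: 172.17.0.2 stands in for a container address,
// 203.0.113.10 for the host's reachable address.
const hostCandidate = 'candidate:1 1 UDP 2122252543 172.17.0.2 9696 typ host';
console.log(rewriteCandidateAddress(hostCandidate, '172.17.0.2', '203.0.113.10'));
```

Only the connection address is replaced, so a server-reflexive candidate that merely mentions the internal IP in its `raddr` field is left unchanged.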

arthuro555 commented 1 year ago

At this point I must admit I underestimated how hard it'd be to make geckos work in Docker 😅 The fact that it doesn't Just Work™ out of the box is sadly a deal breaker for me, as I want users of my game to be able to just launch the server's Docker container and be done with it... There may be a way to make it work, but I am not competent enough in Docker and WebRTC to reach that point, so I'll just downgrade to using WebSockets 😓

reececomo commented 1 year ago

Have you checked out the STUNner ecosystem for deploying WebRTC apps with k8s?

Edit: This is not 100% relevant to the above issue, just thought it was worth flagging.

reececomo commented 1 year ago

Update: This is a bizarre issue, I am now able to reproduce the issue locally, but only on some platforms:

reececomo commented 1 year ago

Update, after updating Docker for Mac on my M1 from 4.1 to 4.16 it is now not working on any platform. Points for consistency, I guess?

Before (working)

[Screenshot, 2023-01-28 at 1:29:23 pm]

After (not working anymore)

[Screenshot, 2023-01-28 at 1:43:56 pm]

FostUK commented 1 year ago

Note re "using network host mode" to resolve this (i.e. running with `docker run -d --net=host`): this mode only works on Linux, not on Mac/Windows.

Kubernetes host network mode might be worth trying - I think that works on all platforms, unlike the Docker one.

reececomo commented 1 year ago

Just in case anyone missed it, reverting to Docker Desktop 4.1.1 (macOS) allowed us to get local dev working again. Tested with Chrome, macOS Big Sur, Docker Desktop 4.1.1, Intel-based MacBook Pro.

docker run -d -it \
  -p 9208:9208/tcp -p 9209:9209/udp \
  --rm monohide nc -l -u -p 9209

(Multiplexing on 9209)

dpcartwright commented 1 year ago

I'm just wondering if there is any update on this.

Oddly, I had geckos.io working fine with a limited port range around July last year, but when revisiting the project I'm running into the same issues as above when deploying as a Docker container.

reececomo commented 1 year ago

@dpcartwright just checking:

dpcartwright commented 1 year ago

@reececomo

OS: Ubuntu 22.04 LTS / Server

Docker Desktop: No (container is built via GitLab runner and pushed to server)

Docker Version: This has almost certainly changed. It's a new server setup with a different arch, OS and the latest versions for pretty much everything (I realise this introduces a ridiculous amount of variables)

At a guess I've gone from Docker version 20.x > 23.x

So far I've tried:

  1. Reverting the various security updates for node modules to see if there was a breaking change.
  2. Switched container to host network
  3. Opened UDP range 1025-65535 on host

None have made any difference. I suspect you might be on to something regarding the Docker version but obviously I'm not keen on permanently reverting to an old version.

I think for now I've spent far too much time on what was meant to be a quick resurrection of an old project where I learned how to use phaser and multiplayer :)

I really appreciate all the work Yandeu has put into these projects though, so I hope a fix does appear at some point.

reececomo commented 1 year ago

If you're on host networking mode already, I'd suggest it's probably something at a higher layer → are other UDP applications working? Do you have ICE/STUN set up? Is the remote client you're connecting from behind a Strict NAT network? Is the host? etc.
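The first question above ("are other UDP applications working?") can be checked independently of WebRTC with a plain UDP echo probe. This is a minimal sketch assuming Node.js and its built-in `dgram` module; `startUdpEcho` and `probeUdp` are hypothetical names, and the string `ping` is an arbitrary payload. The idea is to run the echo half inside the container on the published UDP port and the probe half from the client machine.

```javascript
// Hypothetical helpers built on Node's built-in dgram module.
// dgram is loaded with a dynamic import so the snippet works in both
// CommonJS and ES module files.

// Starts a UDP server that echoes every datagram back to its sender.
// Port 0 lets the OS pick a free port.
async function startUdpEcho(port = 0, host = '0.0.0.0') {
  const { createSocket } = await import('node:dgram');
  const server = createSocket('udp4');
  server.on('message', (msg, rinfo) => {
    server.send(msg, rinfo.port, rinfo.address);
  });
  await new Promise((resolve, reject) => {
    server.once('error', reject);
    server.bind(port, host, resolve);
  });
  return server;
}

// Sends one datagram and resolves to true if the same payload comes
// back before the timeout, false otherwise.
async function probeUdp(port, host = '127.0.0.1', timeoutMs = 1000) {
  const { createSocket } = await import('node:dgram');
  const client = createSocket('udp4');
  try {
    return await new Promise((resolve) => {
      const timer = setTimeout(() => resolve(false), timeoutMs);
      const finish = (ok) => {
        clearTimeout(timer);
        resolve(ok);
      };
      client.once('error', () => finish(false));
      client.once('message', (msg) => finish(msg.toString() === 'ping'));
      client.send('ping', port, host);
    });
  } finally {
    client.close();
  }
}
```

If the probe times out against the container while the same pair works on loopback, the datagrams are probably being dropped somewhere between the client and the container, before ICE is involved at all.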

jamesward1 commented 1 year ago

Just to add to this, I don't believe the "docker-example" repo that's available works either, failing for the same exact reason, as far as I can tell. Fresh install of it, docker build, run, and check url: doesn't establish connection.

yandeu commented 1 year ago

The dockerized simple-chat-app-example works, but not on Windows, neither with WSL nor without.

yandeu commented 11 months ago

Just saw this video, but haven't tried it yet. https://youtu.be/OB2Rfzy5V0Y?t=368&si=X3rft3R8Vptbq8qR

arthuro555 commented 11 months ago

Oh, this would make a lot of sense

yandeu commented 11 months ago

I tried it but have the same issue as here https://github.com/microsoft/WSL/issues/10495.

wsl: Hyper-V firewall is not supported
wsl: Mirrored networking mode is not supported, falling back to NAT networking

arthuro555 commented 11 months ago

I think you need the Windows 11 Insider build for it to work - personally, I'm still on Windows 10, haven't had the time to upgrade the system and my workflows.

bananu7 commented 2 months ago

I'd just like to add to the discussion - I'm running my geckos game in k8s using https://github.com/l7mp/stunner. Works very well.

reececomo commented 2 months ago

Nice one - thanks @bananu7 will check it out.

re: Kubernetes -> agones.dev was very easy to set up with geckos. Decided on ECS/Fargate in the end for cost, but for a larger project/team I would recommend looking into that.

Chroma72 commented 2 weeks ago

Ok, I thought I was going crazy. It's still not working in August 2024. I'm running a stand-alone geckos server in a Docker container and have all the required TCP/UDP ports opened on the docker run line. Any attempt to connect gets a "connection refused" error. Any progress on this?

reececomo commented 2 weeks ago

AFAIK it's not so much a geckos.io issue as a WebRTC ICE resolution issue 💁‍♀️