geckosio / geckos.io

🦎 Real-time client/server communication over UDP using WebRTC and Node.js http://geckos.io
BSD 3-Clause "New" or "Revised" License
1.34k stars 84 forks source link

Expose multiplexing option from node-datachannel #243

Closed arthuro555 closed 1 year ago

arthuro555 commented 1 year ago

Closes #228

My understanding is that, with this option enabled, a single UDP port can be used by multiple connections at once, whereas without it, there is a 1 connection <-> 1 UDP port relationship. Although, while creating an e2e test (6e3eaebf4c5f57b237201ce14d646496c416e043), it seems that even without multiplexing and a port range of two ports, more than two connections can be created successfully, so my understanding seems to be incorrect. I failed to find more documentation on the subject, my only guess with my incomplete knowledge is that the browser behind puppeteer negotiates through ICE the reuse of the same connection?

I chose to set multiplex to true by default, since my understanding is that this should be a non-breaking change that most people will want - the port range is still respected, it's just that some of those ports may be used for multiple connections at once, eliminating faults due to running out of ports. The only case I can think of where this might break something is if someone built an application with the assumption that there never will be more connections than UDP ports, but as I established just before, that already seems not to be the case and such an application would therefore already be broken...

yandeu commented 1 year ago

I'm also not very sure how it works. I don't know if it is the best to use one single port.

Maybe a combination of multiplex and port range is the best? For example split all (maybe 100+ connections) over 10 ports and use each port for 10+ connections?

I have no idea. Maybe @paullouisageneau knows more?

paullouisageneau commented 1 year ago

My understanding is that, with this option enabled, a single UDP port can be used by multiple connections at once, whereas without it, there is a 1 connection <-> 1 UDP port relationship.

Your understanding is correct.

Although, while creating an e2e test (6e3eaeb), it seems that even without multiplexing and a port range of two ports, more than two connections can be created successfully, so my understanding seems to be incorrect. I failed to find more documentation on the subject, my only guess with my incomplete knowledge is that the browser behind puppeteer negotiates through ICE the reuse of the same connection?

Without ICE multiplexing each peer connection effectively binds a UDP socket to a different port, therefore it should fail when the range is full.

I tried running the tests on my machine and the failure seems actually caused by the server of the "should work with multiplex" test still running on port 6969 when the "should NOT work without multiplex" test attempts to run a server on the same port, so the server does not run and no connection succeeds:

● Multiplexing (using a single UDP port) › should NOT work without multiplex

    listen EADDRINUSE: address already in use :::6969

      55 |
      56 |     io.addServer(server)
    > 57 |     server.listen(6969)
         |            ^
      58 |
      59 |     const firstPage = await browser.newPage()
      60 |     const secondPage = await browser.newPage()

      at Object.listen (test/e2e/multiplexing.test.js:57:12)

  ● Multiplexing (using a single UDP port) › should NOT work without multiplex

    expect(received).toBe(expected) // Object.is equality

    Expected: 2
    Received: 0

      71 |     thirdPage.goto('http://localhost:6969/e2e/multiplexing.html')
      72 |     await pause(2000)
    > 73 |     expect(connections.size).toBe(2)
         |                              ^
      74 |     server.close()
      75 |   })
      76 | })

      at Object.toBe (test/e2e/multiplexing.test.js:73:30)

If I remove the first test, the second seems to pass successfully:

 PASS  test/e2e/multiplexing.test.js
  Multiplexing (using a single UDP port)
    ✓ should NOT work without multiplex (2067 ms)

Maybe a combination of multiplex and port range is the best? For example split all (maybe 100+ connections) over 10 ports and use each port for 10+ connections?

This is currently not possible as the implementation in libjuice supports only a single multiplex of connections: if there are already multiplexed connections open when creating a multiplexed connection, the new connection is attached to the same port as the existing connections and its port range is ignored. Port ranges for non-multiplexed connections created in parallel are applied normally though.

paullouisageneau commented 1 year ago

I chose to set multiplex to true by default, since my understanding is that this should be a non-breaking change that most people will want - the port range is still respected, it's just that some of those ports may be used for multiple connections at once, eliminating faults due to running out of ports.

There is actually a drawback with the multiplexed mode: two peers using the multiplexed mode can't open more than one connection between them. Here, this is not an issue, as the other peers are always browsers, so I think enabling multiplexing by default is fine.

arthuro555 commented 1 year ago

I tried running the tests on my machine and the failure seems actually caused by the server of the "should work with multiplex" test still running on port 6969 when the "should NOT work without multiplex" test attempts to run a server on the same port, so the server does not run and no connection succeeds:

My bad :p I reworked the test to avoid conflicts and teardown everything properly. Although, even after that, and even if I remove the first test entirely, the second one fails for me :(

There is actually a drawback with the multiplexed mode: two peers using the multiplexed mode can't open more than one connection between them. Here, this is not an issue, as the other peers are always browsers, so I think enabling multiplexing by default is fine.

Thank you for this information! :D I added to the JSDoc comment Enabling this option has a drawback: a peer will not be able to establish more than one multiplexed connection to the server at once. - Just to be on the safe side, is that accurate or have I misunderstood your statement?

paullouisageneau commented 1 year ago

My bad :p I reworked the test to avoid conflicts and teardown everything properly. Although, even after that, and even if I remove the first test entirely, the second one fails for me :(

It looks like there are a couple unrelated mistakes causing the test to fail, please see my comments.

Thank you for this information! :D I added to the JSDoc comment Enabling this option has a drawback: a peer will not be able to establish more than one multiplexed connection to the server at once. - Just to be on the safe side, is that accurate or have I misunderstood your statement?

Sorry, I wasn't clear: the limitation to one connection appears only when both sides of the connection use node-datachannel with multiplexing enabled. If I'm not mistaken, geckos.io opens connection only between the server and browsers, so you are good, such limitation does not even exist.

yandeu commented 1 year ago

I guess I can now safely release release v2.3.0?

paullouisageneau commented 1 year ago

I guess I can now safely release release v2.3.0?

Looks good :+1:

You might want to remove Enabling this option has a drawback: a peer will not be able to establish more than one multiplexed connection to the server at once. from the comment as it's not the case here.

Quitalizner commented 1 year ago

Maybe a combination of multiplex and port range is the best? For example split all (maybe 100+ connections) over 10 ports and use each port for 10+ connections?

How many connections can a port handled when multiplexing is enabled? How do you go about splitting the connections over ports? Is it handled automatically by node-datachannel?

arthuro555 commented 1 year ago

How do you go about splitting the connections over ports? Is it handled automatically by node-datachannel?

Later messages revbealed that this was a wrongful assumption - after finding the first free port in the port range, node-datachannel will reuse this port for all connections until the end of the process. If you want to be safe, you could maybe run the same amount of node processes as of ports in the range, and load balance across the ports of the port range using a reverse proxy, e.g. NGINX, although that would obviously add a lot of complexity to your project, especially if the server is stateful.

Quitalizner commented 1 year ago

Yeah, if the state has to be shared then it would require adding an in-memory store to share across all instances. I'm currently running the server with all the ports exposed without the multiplexing involved, works well so far. Though haven't tested thoroughly

paullouisageneau commented 1 year ago

How many connections can a port handled when multiplexing is enabled? How do you go about splitting the connections over ports? Is it handled automatically by node-datachannel?

There is virtually no limitation on how many connections can be handled on a single port when multiplexing is enabled, so load-balancing on multiple ports does not really make sense. You can reduce the range to a single port without any drawback with multiplexing, provided the port is guaranteed to be available.