geckosio / geckos.io

🦎 Real-time client/server communication over UDP using WebRTC and Node.js http://geckos.io
BSD 3-Clause "New" or "Revised" License

Configurable transport layer SACKs (for node-datachannel) #247

reececomo closed this issue 9 months ago

reececomo commented 1 year ago

Summary

When sending packets at 60Hz from Chrome, our Geckos.io server responds at the transport layer at ~40Hz, with packets of about 107 bytes each.

Configuring delayedSackTime?

There's a comment in node-datachannel with settings to increase the SACK delay window via delayedSackTime, see https://github.com/murat-dogan/node-datachannel/issues/34#issuecomment-819276519 (although I'm a little out of my depth here). Based on the comments in libdatachannel, it looks like they've reduced the default delay from 200ms to 20ms.

Example chrome://webrtc-internals/ output:

[Image: transport-snippet]

reececomo commented 9 months ago

Edit: See https://github.com/geckosio/geckos.io/issues/247#issuecomment-1916540179, probably don't do this.

Turns out it's as easy as setting:

import * as NodeDataChannel from "node-datachannel";

// Force SCTP settings.
NodeDataChannel.setSctpSettings({
  delayedSackTime: 1_000,
});
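
If the delay needs to differ between environments, one option is to read it from configuration rather than hard-coding it. The sketch below is only an illustration, not part of geckos.io or node-datachannel: `SCTP_DELAYED_SACK_MS` and `sctpSettingsFromEnv` are hypothetical names, and the value is assumed to be in milliseconds. The result is meant to be passed to `NodeDataChannel.setSctpSettings` as in the snippet above; an unset or invalid value yields an empty object, so the call can be skipped and the library default left alone.

```typescript
// Hypothetical helper (not a geckos.io or node-datachannel API).
// Reads an optional SACK delay, assumed to be milliseconds, from an
// environment-like map and returns a settings object.
interface SctpSackSettings {
  delayedSackTime?: number;
}

function sctpSettingsFromEnv(
  env: Record<string, string | undefined>,
): SctpSackSettings {
  const raw = env["SCTP_DELAYED_SACK_MS"]; // hypothetical variable name
  if (raw === undefined || raw.trim() === "") {
    return {}; // unset: leave the library default untouched
  }
  const ms = Number(raw);
  if (!Number.isInteger(ms) || ms < 0) {
    return {}; // not a non-negative integer: ignore it
  }
  return { delayedSackTime: ms };
}

// Usage (assumes node-datachannel is imported as NodeDataChannel):
// const settings = sctpSettingsFromEnv(process.env);
// if (settings.delayedSackTime !== undefined) {
//   NodeDataChannel.setSctpSettings(settings);
// }
```

Returning an empty object for bad input keeps a typo in the environment from silently changing SCTP behaviour for every channel on the connection.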

In this example the server is sending at ~29Hz:

Before:

The default SACK setting from libdatachannel adds a slight transport layer overhead of ~10 additional packets per second per client (39-43 recvs per second):

Screenshot 2024-01-29 at 11 45 52 pm

After:

Setting delayedSackTime arbitrarily high appears to drop that to a negligible amount. Here is the same connection running (28-31 recvs per second):

Screenshot 2024-01-29 at 11 29 48 pm
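
To put the before/after numbers in perspective, the extra transport-layer traffic can be estimated with simple arithmetic. This is a rough sketch only: it assumes the ~107-byte packet size from the first measurement also applies to this connection, it uses the midpoints of the observed ranges, and `estimateOverhead` is a hypothetical helper rather than anything from a library.

```typescript
// Hypothetical helper: estimates the per-client cost of the extra
// transport-layer packets from before/after receive rates.
interface OverheadEstimate {
  extraPacketsPerSecond: number;
  extraBytesPerSecond: number;
  extraKilobitsPerSecond: number;
}

function estimateOverhead(
  recvsBefore: number,
  recvsAfter: number,
  bytesPerPacket: number,
): OverheadEstimate {
  // Never report a negative overhead if the "after" rate happens to be higher.
  const extraPacketsPerSecond = Math.max(0, recvsBefore - recvsAfter);
  const extraBytesPerSecond = extraPacketsPerSecond * bytesPerPacket;
  return {
    extraPacketsPerSecond,
    extraBytesPerSecond,
    extraKilobitsPerSecond: (extraBytesPerSecond * 8) / 1000,
  };
}

// Midpoints of the observed ranges: 39-43 -> 41, 28-31 -> 29.5.
// 107 bytes is an assumption carried over from the first measurement.
const estimate = estimateOverhead(41, 29.5, 107);
console.log(estimate);
```

Under those assumptions the difference works out to roughly 1.2 kB/s (about 10 kbit/s) per client, which is small per connection but scales linearly with the number of connected clients.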

reececomo commented 9 months ago

Edit: After hastily rolling this out to a staging server, I can now see that under real-world conditions this absolutely wrecks net performance.

See the netgraphs below: latency variance / jitter goes through the roof.

This could be because we're running a separate reliable datachannel on the same connection (SCTP settings apply to all channels), and that channel's buffered frames block the main thread, etc.

Not going to spend too much time on this, just going to revert for now and assume libdatachannel knew what they were doing when they set their defaults 😆

jebarpg commented 9 months ago

What tool were you using for the second network stream image? @reececomo

reececomo commented 9 months ago

No tool unfortunately - it's just a lil debugging overlay we added 🙈

Stats are logged from the netcode wrapper into gameserverConnection.stats, and then just drawn into an overlay with PIXI.Graphics and PIXI.BitmapText.
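
For anyone wanting something similar, the data side of such an overlay can be quite small. The sketch below is not the code from this project: `RollingStats` and its methods are hypothetical names, and it only covers collecting samples (receive rate over a sliding window, and jitter as the standard deviation of recent latencies), not the PIXI rendering.

```typescript
// Hypothetical stats collector for a debug overlay (illustrative only).
class RollingStats {
  private readonly windowMs: number;
  private readonly maxLatencySamples: number;
  private recvTimestamps: number[] = [];
  private latencies: number[] = [];

  constructor(windowMs: number = 1000, maxLatencySamples: number = 120) {
    this.windowMs = windowMs;
    this.maxLatencySamples = maxLatencySamples;
  }

  // Call once per received packet with a monotonic timestamp in ms.
  recordRecv(nowMs: number): void {
    this.recvTimestamps.push(nowMs);
    const cutoff = nowMs - this.windowMs;
    // Drop timestamps that have fallen out of the sliding window.
    while (true) {
      const oldest = this.recvTimestamps[0];
      if (oldest === undefined || oldest > cutoff) break;
      this.recvTimestamps.shift();
    }
  }

  // Call with each measured round-trip (or one-way) latency in ms.
  recordLatency(ms: number): void {
    this.latencies.push(ms);
    if (this.latencies.length > this.maxLatencySamples) {
      this.latencies.shift();
    }
  }

  // Receive rate over the window, as of the most recent recordRecv call.
  recvsPerSecond(): number {
    return (this.recvTimestamps.length * 1000) / this.windowMs;
  }

  // Jitter as the population standard deviation of recent latencies.
  jitterMs(): number {
    const n = this.latencies.length;
    if (n === 0) return 0;
    const mean = this.latencies.reduce((sum, v) => sum + v, 0) / n;
    const variance =
      this.latencies.reduce((sum, v) => sum + (v - mean) ** 2, 0) / n;
    return Math.sqrt(variance);
  }
}
```

An overlay would then read `recvsPerSecond()` and `jitterMs()` once per frame and draw them however it likes.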