Xpra-org / xpra

Persistent remote applications for X11; screen sharing for X11, MacOS and MSWindows.
https://xpra.org/
GNU General Public License v2.0

re-implement bandwidth constraint option #417

Closed by totaam 6 years ago

totaam commented 11 years ago

So we can limit ourselves to N Mbps if desired.

This may be implemented in two ways:

Or a combination of the two.

totaam commented 10 years ago

It would be nice to make this generic enough that we can pass the information down to each encoder, but taking into account that we may have many windows, each consuming a variable amount of bandwidth, is not going to be easy!

totaam commented 7 years ago

See also #540, #401, #619 and #999

totaam commented 6 years ago

Support added in r17232.

You can see the current settings and the bandwidth budget distribution between multiple windows using:

xpra info | egrep -i "bandwidth-limit"
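For context, the limit itself is set with the bandwidth-limit option; a minimal sketch, assuming the old-style ssh: connection string and that the option accepts a value with units:

# cap the session at 1Mbps from the client side
xpra attach ssh:user@host:13 --bandwidth-limit=1Mbps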

This is what iftop shows for a 1Mbps target and glxgears using up all the available bandwidth:

localhost.localdomain => localhost.localdomain    941Kb   826Kb   984Kb

Caveats:

TODO:

totaam commented 6 years ago

Hooked up detection of the network interface speed (when available) and disabled mmap; see #540 comment 16.

totaam commented 6 years ago

r17255 adds the UI option to the HTML5 client's connect dialog, defaulting to the value we get from the browser's network information API (as per #1581#comment:3). We don't do this when bypassing the connect dialog, at least for now.

totaam commented 6 years ago

@maxmylyn: ready for a first round of testing. So far, I have used glxgears to generate a high framerate, iftop to watch the bandwidth usage in real time, and the system tray to change the limit. I've also resized the glxgears window to generate more pixel updates: a larger window should give us a lower framerate (higher batch delay) and higher compression (lower speed and quality). To verify that we stick to our budget correctly, we should test using a strict bandwidth shaper (ie: tc) to replicate real-life network conditions. As long as the bandwidth-limit is slightly below the limit set by the shaper, the results should be identical. When capturing problematic conditions, make sure to get the full network characteristics (latency, bandwidth, etc) and the xpra info output.
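A sketch of that test setup, assuming a local server; the display number :13 and the glxgears child are arbitrary choices:

# start a test server running glxgears (display :13 is arbitrary)
xpra start :13 --start=glxgears
# attach with the bandwidth cap under test
xpra attach :13 --bandwidth-limit=1Mbps
# watch the resulting traffic in real time (loopback for a local server)
iftop -i lo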



totaam commented 6 years ago

2017-10-27 21:19:56: maxmylyn commented


Okay, initial testing is complete (trunk 2.x r17263, Fedora 25 server/client). It seems to work fine, at least at the lower limits. I'm not sure my machine is capable of pushing huge amounts of data, so the 1/2/5 Mbps limits were all I could test.

One request, to facilitate testing: can we have a control channel command, or a client/server-side CLI flag or config option, so that I don't have to use the system tray (since GNOME has decided we don't need that)? If we get a switch, I can add a quick test run or two to the automated test box.

totaam commented 6 years ago

One request, to facilitate testing: can we have a control channel command, or a client/server-side CLI flag or config option, so that I don't have to use the system tray (since GNOME has decided we don't need that)? If we get a switch, I can add a quick test run or two to the automated test box.
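A sketch of what such a switch could look like via the generic control channel; the bandwidth-limit subcommand and the bits-per-second value here are assumptions, not confirmed syntax:

# hypothetical: adjust the server-side limit at runtime on display :13
xpra control :13 bandwidth-limit 1000000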

totaam commented 6 years ago

2017-10-31 02:25:11: maxmylyn commented


Note to self:

  • Check with PNG/L
  • Double and triple check with TC bandwidth constraints

totaam commented 6 years ago

2017-11-01 22:09:20: maxmylyn commented


Alright, this was a fun one to test. For reference, my server and client are both Fedora 25 running trunk r17281.

So I had to spend about half an hour sifting through random forum posts on how to do this, and they all suggested some sort of weird multi-line tc command magic... then I remembered we had some documentation on how to simulate delay and loss in #999. After perusing that, I settled on a command:

tc qdisc add dev ens33 root netem rate 1mbit

Adapted from https://serverfault.com/questions/787006/how-to-add-latency-and-bandwidth-limit-interface-using-tc - close, but not quite, and a bit complicated for our simple use-case. Anyway, I'm leaving this here for when I eventually need to come back to this ticket.

NOTE: Be careful with that command: you can easily lose your SSH session.
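The usual qdisc housekeeping makes it safer to experiment (ens33 is just the interface from the command above):

# inspect the active qdisc
tc qdisc show dev ens33
# adjust the rate in place instead of re-adding
tc qdisc change dev ens33 root netem rate 2mbit
# remove the shaping entirely when done
tc qdisc del dev ens33 root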

I played around with 1Mbps and 2Mbps limits. I set the server to rate-limit at 1Mbps, and enabled and disabled tc at both 1Mbps and 2Mbps. In both cases, the bandwidth dropped for a second or so right after enabling/disabling tc (which makes sense, as tc probably interrupts connections), but afterwards it settled around 1Mbit, give or take. The highest I saw was 1.2Mbps, with tc set to 2Mbps and the limit set to 1Mbps, but it settled down to 1Mbps pretty quickly. So I can definitively say the rate limiting is working as expected, even with network limits applied.

As for the png/L encoder: I'm not sure how to force that encoding. I tried --encodings=png/L, which should force it to use that encoding, but when I do, it fails to connect with:

2017-11-01 15:07:30,448 server failure: disconnected before the session could be established
2017-11-01 15:07:30,448 server requested disconnect: server error (error accepting new connection)
2017-11-01 15:07:30,468 Connection lost

I'm not entirely sure how to force the PNG/L encoding like we talked about, so I'm going to pass this to you to ask how.

totaam commented 6 years ago

... settles around 1mbit +- a bit ...

Does bandwidth-limit=1mbps work better than not setting any value when running on a 1mbps constrained connection? (In particular for the perceived screen update latency, which should correlate with batch.delay + damage.out_latency.) Did you test with tc latency and jitter? Did you notice any repetitive screen update stuttering?
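Both of those values show up in xpra info, e.g.:

xpra info | egrep "batch.delay|damage.out_latency"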

it fails to connect with: server failure...

The error shown in the server log was: "Exception: client failed to specify any supported encodings"; r17282 fixes that.

totaam commented 6 years ago

Minor cosmetic improvements in r17296 + r17297 + r17298.

totaam commented 6 years ago

r17452 adds bandwidth availability detection (server side), see #999#comment:18 for details.

totaam commented 6 years ago

2017-12-12 23:11:05: maxmylyn commented


Finally catching up to this one:

Does the bandwidth-limit=1mbps work better than not setting any value when running on a 1mbps constrained connection? (in particular the perceived screen update latency, which should correlate with the batch.delay + damage.out_latency)

Definitely. Just running glxgears with and without the bandwidth limit makes it apparent that the limit helps immensely. Without a bandwidth limit set, the framerate is all over the place, with lots of stutters and catching up. With the limit set, the framerate is much smoother and notably more consistent, with only a little initial stuttering.

Did you notice any screen update repetitive stuttering?

I already mentioned this above: yes, but only on a severely constrained connection without the limit set (--bandwidth-limit=).

Did you test with tc latency and jitter?

I'll do this shortly....right after my ~3pmish espresso.

totaam commented 6 years ago

2017-12-12 23:41:59: maxmylyn commented


Alright, I ran a few levels of tc (the matching netem commands are sketched below):

"Light TC", aka 50ms +-10ms with 25% correlation (delay 50ms 10ms 25%):

  • Some stuttering - the framerate is not quite as high as with only the bandwidth limit, but still half decent

"Light TC, loss only", aka 2% loss, no correlation (loss 2%):

  • Lots of stuttering - but a higher framerate when not stuttering. Unfortunately, it stutters a lot more than it holds a steady framerate.

"Medium TC, loss only", aka 2% loss with 25% correlation (loss 2% 25%):

  • Not much worse than the light TC with only loss - but the framerate was notably lower even when it wasn't stuttering

Just to be thorough, I threw in a combination of loss and delay (loss 2% delay 50ms 10ms), but it wasn't pretty: a very low framerate, with the occasional burst of a bit more.
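For reproducibility, the netem invocations for the scenarios above would look like this, assuming the same ens33 interface and an existing root qdisc (use add instead of change if none is present):

tc qdisc change dev ens33 root netem delay 50ms 10ms 25%      # light TC
tc qdisc change dev ens33 root netem loss 2%                  # light TC, loss only
tc qdisc change dev ens33 root netem loss 2% 25%              # medium TC, loss only
tc qdisc change dev ens33 root netem loss 2% delay 50ms 10ms  # loss and delay combined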


As a total aside, I wonder if there's some utility that gives some kind of aggregate packet-type accounting, to see how much of an impact TCP retransmissions have. Mostly out of curiosity.
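Two standard tools cover at least the retransmission side of that accounting (suggestions, not something used in this ticket):

# cumulative TCP retransmission counters for the whole host
netstat -s | grep -i retrans
# per-connection retransmit counts
ss -ti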

totaam commented 6 years ago

The stuttering with packet loss is caused by the packets backing up whilst waiting for the server's TCP resend, then all flowing again at the same time. UDP (#639) could deal with this better by skipping those frames (though I'm not sure the current heuristics skip frames as often as they should). We could also use some sort of throttled vsync to keep the framerate more constant when recovering, but that would be hard work, and nothing is going to allow us to avoid the pause with TCP, as this is happening in the host network stack. I think this works well enough to close this ticket; we can open new ones for refinements / new requirements.

totaam commented 6 years ago

Not sure how well this got tested: although the original changeset (r17259) was fine, r17296 introduced a regression which caused the connection to abort when the system tray bandwidth limit was changed... Fixed in r18141.