mozilla-services / autopush

Python Web Push Server used by Mozilla
https://autopush.readthedocs.io/
Mozilla Public License 2.0

Send a special WebSocket close code for clients that ping too frequently #103

Closed ghost closed 9 years ago

ghost commented 9 years ago

For context: https://bugzilla.mozilla.org/show_bug.cgi?id=1152264

In #78, we introduced an adaptive response delay for clients that ping too frequently. Unfortunately, folks are still reporting high battery and data usage, even with the fix in place. This is caused by a bug in the client's adaptive ping logic, and affects all FxOS 2.x releases (1.x is unaffected because it didn't ship with adaptive pings).

Any client can potentially enter this state (especially those on unreliable networks), and there's no recovery apart from manually resetting the prefs. The client patch is in place, but has not yet been uplifted.

On our end, we can detect when clients enter a ping loop, and send a special WebSocket close code (4774). This is normally used for UDP wake-up: if the client detects this code, it won't reconnect, as it expects the server to wake it up for incoming notifications.
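To make the client behaviour described above concrete, here is a minimal sketch of the reconnect decision. It is not the actual FxOS client code: the class and method names are hypothetical, and only the close code 4774 and the "don't reconnect until the network changes" behaviour come from this issue.

```python
# Close code named in this issue; normally used for UDP wake-up.
UDP_WAKEUP_CLOSE_CODE = 4774


class ClientConnectionState:
    """Hypothetical model of how a client reacts to WebSocket close codes."""

    def __init__(self):
        self.waiting_for_wakeup = False

    def on_close(self, code):
        """Return True if the client should reconnect right away."""
        # On 4774 the client expects the server to wake it up,
        # so it stays disconnected instead of reconnecting.
        self.waiting_for_wakeup = (code == UDP_WAKEUP_CLOSE_CODE)
        return not self.waiting_for_wakeup

    def on_network_change(self):
        """A network status change clears the waiting state and reconnects."""
        self.waiting_for_wakeup = False
        return True
```

In this model, a client that receives 4774 without a wake-up platform behind it stays offline until `on_network_change` fires.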

The trade-off is that phones on non-TEF networks won't receive any push notifications until their network status changes: either they lose reception and reconnect, or their phone switches between cellular and Wi-Fi. (TEF has their own UDP wake-up platform, so we can actually make this work for them.) But it's a small price to pay for battery life and reasonable data usage.

A vague plan:

bbangert commented 9 years ago

Only thing I'd change, don't drop adaptive response delay, since that may still help some clients. But do check the user-agent, and send that special close code to known buggy clients.

ghost commented 9 years ago

> Only thing I'd change, don't drop adaptive response delay, since that may still help some clients.

:+1:

> But do check the user-agent, and send that special close code to known buggy clients.

Do you mean the device ID, or User-Agent header?

bbangert commented 9 years ago

The User-Agent header. Afaik we know the possible values for FxOS to only send them this close code.
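A rough sketch of what that User-Agent gate could look like. Everything here is an assumption for illustration: the function names are hypothetical, the regex assumes a `(Mobile; rv:NN.0)` style header, and the version set is a placeholder rather than a verified list of affected releases.

```python
import re

PING_LOOP_CLOSE_CODE = 4774  # close code named in this issue

# Placeholder values for illustration only, NOT a verified list of
# affected releases. A real list would have to come from the client team.
AFFECTED_GECKO_MAJOR_VERSIONS = frozenset({32, 34, 37})

# Assumed header shape: "Mozilla/5.0 (Mobile; rv:34.0) Gecko/34.0 Firefox/34.0"
_FXOS_UA = re.compile(r"\(Mobile;[^)]*\brv:(\d+)\.")


def is_known_buggy_client(user_agent):
    """Return True if the User-Agent matches an assumed-affected release."""
    if not user_agent:
        return False
    match = _FXOS_UA.search(user_agent)
    if match is None:
        return False
    return int(match.group(1)) in AFFECTED_GECKO_MAJOR_VERSIONS


def close_code_for(user_agent, in_ping_loop):
    """Return the close code to send, or None to leave the connection alone."""
    if in_ping_loop and is_known_buggy_client(user_agent):
        return PING_LOOP_CLOSE_CODE
    return None
```

Desktop, TV, and unreleased clients fall through to `None` here, so they would keep only the adaptive response delay.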

ghost commented 9 years ago

Sounds good. It's a bit unnerving that "the possible values for FxOS to only send them this close code" means "everything released in the past year, including the current one." :wink: But looks like the client patch has been uplifted. Now, how many phones will receive that update...

bbangert commented 9 years ago

Well yea..... there is that. :grimacing: But I mean to avoid sending it to new unreleased clients.... or desktop, TV, etc. until we know they need it.

ghost commented 9 years ago

> Remove the adaptive response delay, and disconnect clients that ping too frequently. [...] I think this calls for a weighted moving average, to minimize the impact on well-behaved clients that happen to be on spotty networks.

Actually, this is silly. I don't know why I suggested reinstating the disconnect, or storing the connection lifespan, since the problem is specifically with pings. Let's just keep the connection open, and send the close code if we detect a consistently low ping interval.
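One way the "consistently low ping interval" check could be sketched is with an exponentially weighted moving average over the gaps between pings. This is an illustration under assumptions, not the autopush implementation: the class name, the 60-second threshold, the smoothing factor, and the minimum sample count are all hypothetical.

```python
class PingIntervalTracker:
    """Tracks a weighted moving average of one connection's ping intervals."""

    def __init__(self, min_interval=60.0, alpha=0.3, min_samples=3):
        self.min_interval = min_interval  # seconds; hypothetical threshold
        self.alpha = alpha                # weight given to the newest interval
        self.min_samples = min_samples    # intervals needed before judging
        self.average = None
        self.samples = 0
        self._last_ping = None

    def record_ping(self, now):
        """Record a ping at time `now` (seconds).

        Returns True if the client looks stuck in a ping loop.
        """
        if self._last_ping is not None:
            interval = now - self._last_ping
            if self.average is None:
                self.average = interval
            else:
                self.average = (self.alpha * interval
                                + (1 - self.alpha) * self.average)
            self.samples += 1
        self._last_ping = now
        return (self.average is not None
                and self.samples >= self.min_samples
                and self.average < self.min_interval)
```

Because the average is weighted, a single quick ping from a well-behaved client on a spotty network barely moves it, while a client pinging every second crosses the threshold after a few samples. The server would keep the connection open until `record_ping` returns True, then send the close code.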