Sparkle clients polling frequently

herf commented 8 years ago

We use Sparkle to update f.lux (thanks!) Our appcast is hosted at https://justgetflux.com/mac/macflux.xml - only SSL.

A minority of clients (perhaps one in 100,000) seem to poll in a loop. So everyday we have a bunch of IPs we have to drop at the firewall, because otherwise they send 10 requests/sec all the time. (Our nginx server also has all the abuse controls dialed now.)

We are using Sparkle 1.14.0, and the issue happened on 0.6.1 also (upgraded before reporting). Latest versions of Mac OS affected (including 10.11.5 and 10.12 in my logs today). This issue has appeared in the last year or two.

Is there anything that would allow Sparkle to poll (or retry) at a rate like this?

Normally this could be spammers, but in a few cases, the logs suggest these are normal users (who've been checking for updates once a day for a long time), and all of a sudden they start the rapid-polling behavior.

zorgiepoo commented 8 years ago

Hey,

The main thing I can think of is our background scheduling code (in SUUpdater.m):

    // How long has it been since last we checked for an update?
    NSDate *lastCheckDate = [self lastUpdateCheckDate];
    if (!lastCheckDate) { lastCheckDate = [NSDate distantPast]; }
    NSTimeInterval intervalSinceCheck = [[NSDate date] timeIntervalSinceDate:lastCheckDate];

    // Now we want to figure out how long until we check again.
    NSTimeInterval delayUntilCheck, updateCheckInterval = [self updateCheckInterval];
    if (updateCheckInterval < SUMinimumUpdateCheckInterval)
        updateCheckInterval = SUMinimumUpdateCheckInterval;
    if (intervalSinceCheck < updateCheckInterval)
        delayUntilCheck = (updateCheckInterval - intervalSinceCheck); // It hasn't been long enough.
    else
        delayUntilCheck = 0; // We're overdue! Run one now.
    self.checkTimer = [NSTimer scheduledTimerWithTimeInterval:delayUntilCheck target:self selector:@selector(checkForUpdatesInBackground) userInfo:nil repeats:NO];

But it's not quite obvious to me how it can come from here. The update check interval is constrained by SUMinimumUpdateCheckInterval which means the info.plist key or user default read there can't be the reason. The last update check date is read from the user defaults and set before poking at the appcast.

We also make use of the reachability API to test if the feed's hostname is available before making a connection. After taking a second look at it, I'm thinking that code should be removed (#836) but I can't say for sure if that's a culprit. Were they requests to your feed that were occurring often?

Lastly we make use of NSURLDownload & NSURLRequest for downloading the appcast feed and update. A download for the update is only re-tried by us if the first download was a delta update and it failed to apply it, so the full archive will be attempted.

Hope any of this may help..

herf commented 8 years ago

Yes, it's only making requests for the appcast (and the server is sending the full XML "200" response - no obvious failure in SSL negotiation or anything).

I tried to mess it up, and none of these things failed:

writing a date from 2015 in the last update pref
writing a date from 2017 in the last update pref
writing junk text in the last update pref

I guess if the pref write failed somehow that would be another case? Uh, could test user switching + filevault or something crazy like that.

Reachability seems like a much better theory.

zorgiepoo commented 8 years ago

So the reachability code will be removed in the next version (since we removed it in master), although I'm not too confident that is the cause.

kornelski commented 7 years ago

It's been a while. Please test with the latest version and reopen if it's still an issue.

jkbullard commented 7 years ago

@herf, did you ever find out what caused this problem, or resolve it some other way?

I have a similar problem in my application that has a heavily patched version of Sparkle 1.15. Starting a couple of weeks ago, approximately 1 per 100,000 instances of my application has started checking for updates 10-20 times per second. Why now is a mystery -- the version of Sparkle in my app (Tunnelblick) hasn't changed in months?

In my case, this could be some kind of (tiny) manual DOS attack, but such an attack hardly seems worthwhile. When I block one IP address that's sending lots of requests, within a few minutes or hours a different IP address starts doing the same thing, fetching the appcast. All of the IP addresses resolve back to verizon.net, or to rr.com, although one address can't be resolved to a domain. I realize that these IP addresses may be aggregating traffic for many machines that have private IP addresses, but that doesn't explain the sudden explosion in fetches (from 150/day to 500,000/day from one IP address, for example).

But if it isn't an attacker, one way I can see that the code could be failing is if the defaults data wasn't getting written, or the new value was being ignored. Then lastUpdateCheckDate would not get updated, and would always be in the past, and so the check would always be scheduled immediately.

I suppose that could happen if the preferences file was corrupt, but may application checks that it is OK at launch before trying to use NSUserDefaults, so that seems unlikely.

Alternatively, the NSUserDefault setObject:forKey method that's used to set lastUpdateCheckDate (in SUHost.m's setObject:forUserDefaultsKey: method) doesn't return an error indication, but the writeup for it includes the following comment:

Setting a default has no effect on the value returned by the objectForKey: method if the same key exists in a domain that precedes the application domain in the search list.

I don't understand that comment enough to know how that could happen, but maybe that's the problem. That situation could be tested for by getting the value back after setting it in setObject:forUserDefaultsKey:.

(I wasn't sure if I should reopen this Issue, so I didn't.)

herf commented 7 years ago

It is has not resolved for me (I am still on 1.16). I automated the firewall drop.

Trying to understand those docs - perhaps if someone was running Sparkle in the global domain, then my (non-global) one could have some failures? Like removing a key might fail?

Also it could be that someone is starting an app with "-default" and then there is apparently an "argument domain", which I didn't know about:

https://developer.apple.com/documentation/foundation/userdefaults/1410665-argumentdomain

jkbullard commented 7 years ago

I just saw #1061 and have made a comment there since that Issue is still open. I'll stop posting to this old Issue.

sparkle-project / Sparkle

Sparkle clients polling frequently #835