dylex / slack-libpurple

Slack module for libpurple
GNU General Public License v2.0

Rate limited during paged requests #27

Open JustinHop opened 6 years ago

JustinHop commented 6 years ago

Thanks a bunch for your help!

It looks like the paging of requests was working, but then the next request got ratelimited. Here is the debug output, starting after it had looped through getting paged user info successfully a few times.

Also, here is some info on rate limits for the Slack API: https://api.slack.com/docs/rate-limits

Thanks again

(23:39:30) slack: api call: https://slack.com/api/users.list?token=REMOVED&presence=false&limit=100&cursor=REMOVED
(23:39:30) util: requesting to fetch a URL
(23:39:30) dnsquery: Performing DNS lookup for slack.com
(23:39:30) dns: Successfully sent DNS request to child 22416
(23:39:30) dns: Got response for 'slack.com'
(23:39:30) dnsquery: IP resolved for slack.com
(23:39:30) proxy: Attempting connection to 13.33.224.10
(23:39:30) proxy: Connecting to slack.com:443 with no proxy
(23:39:30) proxy: Connection in progress
(23:39:30) proxy: Connecting to slack.com:443.
(23:39:30) proxy: Connected to slack.com:443.
(23:39:30) nss: SSL version 3.3 using 128-bit AES-GCM with 128-bit AEAD MAC Server Auth: 2048-bit RSA, Key Exchange: 256-bit ECDHE, Compression: NULL Cipher Suite Name: TLS_ECDHE_RSA_WITH_AES_128_GCM_SHA256
(23:39:30) nss: subject=CN=slack.com,O="Slack Technologies, Inc.",L=San Francisco,ST=CA,C=US issuer=CN=DigiCert SHA2 Secure Server CA,O=DigiCert Inc,C=US
(23:39:30) nss: subject=CN=DigiCert SHA2 Secure Server CA,O=DigiCert Inc,C=US issuer=CN=DigiCert Global Root CA,OU=www.digicert.com,O=DigiCert Inc,C=US
(23:39:30) nss: subject=CN=DigiCert Global Root CA,OU=www.digicert.com,O=DigiCert Inc,C=US issuer=CN=DigiCert Global Root CA,OU=www.digicert.com,O=DigiCert Inc,C=US
(23:39:30) certificate/x509/tls_cached: Starting verify for slack.com
(23:39:30) certificate/x509/tls_cached: Checking for cached cert...
(23:39:30) certificate/x509/tls_cached: ...Found cached cert
(23:39:30) nss/x509: Loading certificate from /home/justin/.purple/certificates/x509/tls_peers/slack.com
(23:39:30) certificate/x509/tls_cached: Peer cert matched cached
(23:39:30) nss/x509: Exporting certificate to /home/justin/.purple/certificates/x509/tls_peers/slack.com
(23:39:30) util: Writing file /home/justin/.purple/certificates/x509/tls_peers/slack.com
(23:39:30) nss: Trusting CN=slack.com,O="Slack Technologies, Inc.",L=San Francisco,ST=CA,C=US
(23:39:30) certificate: Successfully verified certificate for slack.com
(23:39:30) util: request constructed
(23:39:30) util: Response headers: 'HTTP/1.1 429 Too Many Requests
Content-Type: application/json; charset=utf-8
Content-Length: 34
Connection: close
Access-Control-Allow-Origin: *
Cache-Control: private, no-cache, no-store, must-revalidate
Date: Tue, 20 Feb 2018 07:39:30 GMT
Expires: Mon, 26 Jul 1997 05:00:00 GMT
Pragma: no-cache
Referrer-Policy: no-referrer
Retry-After: 7
Server: Apache
Strict-Transport-Security: max-age=31536000; includeSubDomains; preload
Vary: Accept-Encoding
X-Content-Type-Options: nosniff
X-OAuth-Scopes: read,client,identify,post,apps
X-Slack-Backend: h
X-Slack-Req-Id: 4f04a68f-4967-4fbc-92ca-468dec095bb4
X-XSS-Protection: 0
X-Cache: Error from cloudfront
Via: 1.1 6ed623541a1487ecd1bc71b49417e87c.cloudfront.net (CloudFront)
X-Amz-Cf-Id: 9bQmr5sX9mbCCiJ8ySBHhncjpDCa8UWzkte5bnC-4yl8kFyEatGHdA==

'
(23:39:30) util: parsed 34
(23:39:30) slack: api response: {"ok":false,"error":"ratelimited"}
(23:39:30) connection: Connection error on 0x5616fdb0e940 (reason: 0 description: ratelimited)
(23:39:30) account: Disconnecting account justin.hoppensteadt@slack.com (0x5616fb8499e0)
(23:39:30) connection: Disconnecting connection 0x5616fdb0e940
(23:39:30) websocket: removing input 0

dylex commented 6 years ago

This seems conceptually simpler -- just wait for the given Retry-After header -- but is unfortunately technically more work, because it means we need to parse headers (and chunking) ourselves, rather than letting libpurple do it. (Or, hackily, we could just hard-code a 10-second sleep whenever we get ratelimited.)
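
For illustration only, here is a rough sketch of what "parse the header ourselves" could look like. It is written in Python rather than the plugin's C, and `parse_retry_after` is a hypothetical helper, not anything from slack-libpurple or libpurple. It assumes the raw header block is available as a string and that Retry-After carries a plain number of seconds (the HTTP-date form is not handled), falling back to the hard-coded 10 seconds otherwise.

```python
def parse_retry_after(raw_headers, default=10):
    """Return the Retry-After value in seconds from a raw HTTP header block.

    Hypothetical helper: falls back to `default` when the header is missing
    or is not a plain integer (the HTTP-date form is not handled here).
    """
    for line in raw_headers.splitlines()[1:]:  # skip the status line
        name, sep, value = line.partition(":")
        if sep and name.strip().lower() == "retry-after":
            value = value.strip()
            if value.isdigit():
                return int(value)
            break  # header present but not numeric: use the default
    return default


# A trimmed copy of the 429 response headers from the log above.
raw = (
    "HTTP/1.1 429 Too Many Requests\r\n"
    "Content-Type: application/json; charset=utf-8\r\n"
    "Retry-After: 7\r\n"
    "Server: Apache\r\n"
    "\r\n"
)
print(parse_retry_after(raw))  # 7
```

Header names are matched case-insensitively, since HTTP does not guarantee the capitalization shown in the log.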

(I vaguely wonder if this whole approach of loading all the users upfront is wrong, and we should just load the active ims and joined channels, but we also need this list to decode mentions and members...)

dylex commented 6 years ago

I decided to do the hacky thing, at least for now. If it gets "ratelimited" it just waits 10 seconds. See if that gets you farther, and then we can try to improve on this. There's also no message to the user, so if it gets ratelimited a lot, there may be a long delay.
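
The idea, roughly, as a Python sketch rather than the actual C change: keep paging with the cursor, and when a page comes back as `{"ok":false,"error":"ratelimited"}`, wait a fixed delay and retry the same page. `fetch_all_users`, `call_api`, and `max_retries` are hypothetical names for this illustration, and the sketch assumes the next cursor arrives under `response_metadata.next_cursor` as in Slack's cursor-based pagination.

```python
import time


def fetch_all_users(call_api, delay=10, sleep=time.sleep, max_retries=5):
    """Collect all members from a cursor-paged users.list style call.

    `call_api(cursor)` is a hypothetical callable that returns the decoded
    JSON response as a dict. On "ratelimited" the same page is retried after
    a fixed `delay`, up to `max_retries` times in a row.
    """
    members = []
    cursor = ""
    retries = 0
    while True:
        resp = call_api(cursor)
        if not resp.get("ok"):
            if resp.get("error") == "ratelimited" and retries < max_retries:
                retries += 1
                sleep(delay)  # fixed wait, then retry the same page
                continue
            raise RuntimeError(resp.get("error", "unknown error"))
        retries = 0
        members.extend(resp.get("members", []))
        cursor = resp.get("response_metadata", {}).get("next_cursor", "")
        if not cursor:  # an empty cursor means this was the last page
            return members
```

Passing `sleep` in as a parameter is only there so the wait can be stubbed out when trying this without a real connection.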

JustinHop commented 6 years ago

I built and installed it. Still timing out, but it looks like there is enough code there for me to figure out the rest. I'll report back with what timeout setting works.

Thanks a bunch!

JustinHop commented 6 years ago

I'm getting consistently good results with a 30 second timeout. 20 fails sometimes and 10 fails all the time. The only problem with 30 is that it takes a really long time to sign in.

And thank you. This has really improved my workflow.

dylex commented 6 years ago

When you say 10 fails all the time, how does it fail exactly? I would've expected if we resent the request too soon after a ratelimit, we'd just get another ratelimited response, but perhaps it's returning a different error? If you happen to have debug logs of this, it might be useful in figuring out a more solid solution. If it's just a matter of grabbing the right Retry-After time, that should be doable at some point.

Thanks for trying things out. I'll leave this open pending a more robust solution.

JustinHop commented 6 years ago

So in the response header they give you a cooloff time. It varies, but the most common number they send back looks like 9, and I have not seen one over 20 (granted, I have a small sample size). With a setting of 10 seconds and a header value of 9, it still gets a response from the Slack API saying it didn't wait long enough, and it disconnects with the same error as if it had not waited at all. There may be a bug on their end in that regard, but there's not much we can do about that.
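
To put numbers on that, a small Python sketch of "header value plus a safety margin". `ratelimit_wait` is a hypothetical helper; the 3-second margin and the 30-second fallback are assumptions drawn only from the small sample in this thread, not documented Slack behavior.

```python
def ratelimit_wait(retry_after, margin=3, fallback=30):
    """Seconds to wait before retrying a ratelimited request.

    `retry_after` is the cooloff value from the response header, or None if
    it could not be read. `margin` and `fallback` are guesses based on the
    observations in this thread.
    """
    if retry_after is None:
        return fallback
    return retry_after + margin


print(ratelimit_wait(9))     # 12
print(ratelimit_wait(None))  # 30
```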

I'll start running it with debug logging for my daily use and report back if I get anything useful

JustinHop commented 6 years ago

I've been running with 12 seconds since my last comment and have had no issues. I'll pull and recompile.

dylex commented 6 years ago

I've also made this configurable now, so you can change without recompiling.

JustinHop commented 6 years ago

Awesome. Thanks

dylex commented 6 years ago

Just a reminder to myself that the roomlist interface also needs to be switched to the paged calls (and maybe could pre-populate from the cached lists?)

kendrak24 commented 6 years ago

> (I vaguely wonder if this whole approach of loading all the users upfront is wrong, and we should just load the active ims and joined channels, but we also need this list to decode mentions and members...)

Well, we have ~2500 people in our slack, so it takes forever to connect, as expected. I know that's probably a bit of an edge case, but some form of lazy loading of users would be nice.
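
Something like the following is what lazy loading could mean here, as a Python sketch and not a proposal for the actual C code: look a user up only the first time their ID is needed (for example when decoding a `<@U...>` mention) and cache the result. `LazyUserCache`, `decode_mentions`, and the `fetch_user` callable are hypothetical names; in practice `fetch_user` would presumably be a single-user API call.

```python
import re

# Slack encodes user mentions in message text as <@USERID>.
MENTION = re.compile(r"<@([A-Z0-9]+)>")


class LazyUserCache:
    """Fetch user records on first use instead of loading everyone upfront."""

    def __init__(self, fetch_user):
        # fetch_user(user_id) is a hypothetical callable returning a dict
        # with at least a "name" key.
        self._fetch_user = fetch_user
        self._users = {}

    def get(self, user_id):
        if user_id not in self._users:
            self._users[user_id] = self._fetch_user(user_id)
        return self._users[user_id]


def decode_mentions(text, cache):
    """Replace <@USERID> mentions with @name, fetching unknown users lazily."""
    return MENTION.sub(lambda m: "@" + cache.get(m.group(1))["name"], text)
```

The trade-off is that the first message mentioning an unknown user costs one extra request, in exchange for not paging through the whole user list at sign-in.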

benklop commented 6 years ago

I've also got multiple thousands of users in slack, and at least using bitlbee it seems to time out then crash:

[10:40:29] <benklop> account slack on
[10:40:29] <root> slack - Logging in: Requesting RTM
[10:40:30] <root> slack - Logging in: Connecting to RTM
[10:40:30] <root> slack - Logging in: RTM Connected
[10:40:30] <root> slack - Logging in: Loading Users
[10:42:29] <root> slack - Login error: Connection timeout
[10:42:29] <root> slack - Logging in: Signing off..
[10:42:29] <root> slack - Logging in: Reconnecting in 5 seconds..
[10:42:34] <root> slack - Logging in: Requesting RTM
[10:42:35] <root> slack - Logging in: Connecting to RTM
[10:42:35] <root> slack - Logging in: RTM Connected
[10:42:35] <root> slack - Logging in: Loading Users
[10:44:34] <root> slack - Login error: Connection timeout
[10:44:34] <root> slack - Logging in: Signing off..
[10:44:34] <root> slack - Logging in: Reconnecting in 15 seconds..
[10:44:37] * Topic for &bitlbee is "Welcome to the control channel. Type help for help information."
[10:44:37] * Topic set by root!root@localhost on 2018-04-10 16:44:37 UTC
[10:44:37] <root> Welcome to the BitlBee gateway!
[10:44:37] <root>  
[10:44:37] <root> Running BitlBee-LIBPURPLE 3.4.2-0ubuntu1

dylex commented 6 years ago

I take it the crash is happening when trying to reconnect the second time? Or maybe when a late response comes back? I wonder what exactly is triggering the timeout. Have you tried increasing the ratelimit_delay option (though the default is already quite large)? If you can get debug logs somehow, it would be helpful. The crash seems like a separate issue worth investigating.