QuantifiedSelfless / gulper

data ingestion

Sometimes gulper just hangs #8

Closed: wannabeCitizen closed this issue 8 years ago

wannabeCitizen commented 8 years ago

When testing scrapers, I've noticed that gulper sometimes reaches a certain user and then just hangs and never comes back, with no errors or any further log output. Here's an example:

Loading config with mode: dev
Registering process handlers:
        /sample_processor/num_characters
Listening on port: 6060
[I 160404 17:23:24 web:1946] 200 GET /api/showtime/process?showtime_id=4e329fa3-9d76-4616-a64f-2de8418b417b&passphrase=4bfa0188ebf304a98374c5f022203338 (67.190.84.47) 2983.96ms
[I 160404 17:23:24 connectionpool:758] Starting new HTTPS connection (1): api.tumblr.com
[I 160404 17:23:24 connectionpool:758] Starting new HTTPS connection (1): api.tumblr.com
[I 160404 17:23:26 connectionpool:758] Starting new HTTPS connection (1): api.tumblr.com
[I 160404 17:23:26 connectionpool:758] Starting new HTTPS connection (1): api.tumblr.com
Processing user:  42b11090-1bff-4d7b-a352-a36dc358069e
Processing user:  792ee54d-b949-4e65-a3ce-9f7db2bfeb3f
Processing user:  a4c991bd-9ceb-457f-872d-70bf615ba544
[I 160404 17:23:27 connectionpool:758] Starting new HTTPS connection (1): api.tumblr.com
[I 160404 17:23:27 connectionpool:758] Starting new HTTPS connection (1): api.tumblr.com
[I 160404 17:23:29 connectionpool:758] Starting new HTTPS connection (1): api.tumblr.com
[I 160404 17:23:30 connectionpool:758] Starting new HTTPS connection (1): api.tumblr.com
[I 160404 17:23:31 connectionpool:758] Starting new HTTPS connection (1): api.tumblr.com
[I 160404 17:23:31 connectionpool:758] Starting new HTTPS connection (1): api.tumblr.com
[I 160404 17:23:32 connectionpool:758] Starting new HTTPS connection (1): api.tumblr.com
[I 160404 17:23:33 connectionpool:758] Starting new HTTPS connection (1): api.tumblr.com
Processing user:  b9bef55d-e1c2-418b-979d-62762902ee38

mynameisfiber commented 8 years ago

Steps to reproduce would be useful... I tried on both my machine and the dev machine and couldn't get it to hang.

wannabeCitizen commented 8 years ago

I think we decided this was the backoff for rate limiting? I'm going to close for now, and if that turns out to be wrong when we start running larger tests, I'll reopen.
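
If the pause really is rate-limit backoff, logging each wait would make it distinguishable from a true hang. What follows is only a rough sketch of that idea, not gulper's actual code: `RateLimited`, `backoff_delays`, and `call_with_backoff` are hypothetical names, and the assumption is that a scraper raises some exception when the API reports a rate limit.

```python
import logging
import time

logger = logging.getLogger("backoff_sketch")


class RateLimited(Exception):
    """Hypothetical error a scraper raises when the API reports a rate limit."""


def backoff_delays(base=1.0, factor=2.0, cap=60.0, retries=5):
    """Return the wait in seconds before each retry, capped at `cap`."""
    return [min(cap, base * factor ** attempt) for attempt in range(retries)]


def call_with_backoff(fetch, retries=5, base=1.0, factor=2.0, cap=60.0,
                      sleep=time.sleep):
    """Call `fetch()`, retrying on RateLimited and logging every wait."""
    delays = backoff_delays(base, factor, cap, retries)
    for attempt, delay in enumerate(delays, start=1):
        try:
            return fetch()
        except RateLimited:
            # Logging the wait is the point: a long sleep no longer looks like a hang.
            logger.warning("Rate limited; retry %d/%d in %.1fs",
                           attempt, retries, delay)
            sleep(delay)
    # Final attempt: let RateLimited propagate so the failure is visible.
    return fetch()
```

With something like this wrapped around each API call, the log above would show a "Rate limited; retry ..." line after the last "Processing user" entry instead of going silent.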