I have a cron job that pulls stats from NS1 every minute. Right now I don't have any locking preventing more than one copy of this cron from running. Instances of this cron have started piling up.
$ ps -ef | grep 'dns_graphite.py' | wc -l
183
If I check what they are doing, I find them stuck reading from a socket.
bpitts@prod-iman-www3:~$ sudo strace -p 859
Process 859 attached - interrupt to quit
read(3, ^C <unfinished ...>
Process 859 detached
bpitts@prod-iman-www3:~$ sudo ls -l /proc/859/fd
total 0
lr-x------ 1 root root 64 Mar 16 07:52 0 -> pipe:[741839769]
lrwx------ 1 root root 64 Mar 16 07:52 1 -> /tmp/tmpf6sjXC6 (deleted)
lrwx------ 1 root root 64 Mar 16 07:52 2 -> /tmp/tmpf6sjXC6 (deleted)
lrwx------ 1 root root 64 Mar 16 07:52 3 -> socket:[741845547]
lr-x------ 1 root root 64 Mar 16 07:52 8 -> pipe:[741839777]
l-wx------ 1 root root 64 Mar 16 07:52 9 -> pipe:[741839777]
bpitts@prod-iman-www3:~$ sudo lsof | grep 741845547
python 859 root 3u IPv4 741845547 0t0 TCP prod-iman-www3.aws-us-east-1.evbops.com:31146->104.20.86.93:https (ESTABLISHED)
The other end of that socket is NS1.
coconut:~ bpitts$ host api.nsone.net
api.nsone.net has address 104.20.85.93
api.nsone.net has address 104.20.86.93
Regardless of the status of those servers, requests to them should not wait indefinitely. Currently they do because you do not set any timeouts: https://github.com/ns1/nsone-python/search?utf8=%E2%9C%93&q=timeout&type=Code
I think all of your transports support timeouts; you just need to set them, e.g. http://docs.python-requests.org/en/latest/user/advanced/#timeouts
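To illustrate the behavior I would expect, here is a minimal sketch using only the standard library `socket` module, not your transports. It connects to a local listener that never sends anything, which mimics the stuck `read(3, ...)` above, and returns once the timeout expires instead of blocking forever. The helper names `start_silent_server` and `read_with_timeout` are hypothetical and exist only for this example.

```python
import socket


def start_silent_server():
    """Open a local listening socket that never accepts or replies.

    The kernel completes the TCP handshake from the listen backlog, so a
    client sees an ESTABLISHED connection on which no data ever arrives.
    """
    srv = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    srv.bind(("127.0.0.1", 0))  # port 0: let the OS pick a free port
    srv.listen(5)
    return srv, srv.getsockname()[1]


def read_with_timeout(host, port, timeout):
    """Read from host:port, returning None if nothing arrives in time."""
    # The timeout passed here applies to the connect and to later reads.
    with socket.create_connection((host, port), timeout=timeout) as sock:
        try:
            return sock.recv(4096)
        except socket.timeout:
            # Without a timeout, recv() would block here indefinitely.
            return None


if __name__ == "__main__":
    srv, port = start_silent_server()
    try:
        result = read_with_timeout("127.0.0.1", port, timeout=1.0)
        print("timed out" if result is None else result)
    finally:
        srv.close()
```

With python-requests the equivalent is the `timeout` argument, for example `requests.get(url, timeout=(3.05, 27))` for separate connect and read timeouts. Where that argument belongs inside each transport is something I have not checked.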