Hmm, good spot. The answer is that I don't know; I'll try to dig into this over the weekend.
Might I ask a couple of questions?
1.) Why the initial probe request at all?
2.) Why close the first response and open a second?
3.) Why Response#raw.read rather than Response#iter_content?
4.) Does reproducing this require going through a proxy?
5.) How far does the second request get before it hangs?
1.) No good reason, but it's not explicitly wrong.
2.) See 1.
3.) The code I was using this from expects a 'read' interface. No really good reason, but again, that's not explicitly wrong. Surely Response#read is supported; I suspect it wouldn't be any different.
4.) It does, but you can just try it and see. On an Ubuntu instance, KVM guest, or LXC container somewhere, just 'apt-get install squid3' and then use 'http_proxy=http://localhost:3128/'.
5.) I think it ends up getting one read in the second.
So overall, yeah, I could change the code to do things a bunch of different ways, and as I said when I opened the bug, the two opens are silly. But it doesn't seem explicitly wrong.
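To expand on 4.), pointing requests at that squid explicitly just takes the proxies argument; a minimal sketch, with a placeholder URL:

import requests

url = "http://example.com/disk.img"  # placeholder URL
# squid3 listens on port 3128 by default; requests also honours
# the http_proxy environment variable when proxies= is not given.
proxies = {"http": "http://localhost:3128/"}
resp = requests.get(url, stream=True, proxies=proxies)
resp.raw.read(1024)
resp.close()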
OK, so I can't reproduce this using mitmproxy on my box. I wonder if squid doesn't like something we're doing.
@Lukasa, if you're interested in debugging, I set up a saucy instance in Azure. You should be able to get in with your GitHub keys at 'smoser@smoser1029s.cloudapp.net'.
There I've done 'apt-get install python-requests squid3' and put the example test case above in 'issue-1686.py'; I can reproduce the hang there.
I'll leave that machine up till Thursday or Friday. If I don't hear back from you by then, I'll kill it.
Thanks for looking into it.
How big is the object? You might be triggering read-ahead or something, such that the second request blocks until the first is complete.
The open/close/open idiom is unsafe in general on HTTP as there is no guarantee the second request will go to the same server with the same content.
Also, it's super slow on high-latency links, so I highly recommend one open, stream, close.
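Roughly this shape, as a sketch (argument handling and chunk size mirror the test script below, with iter_content standing in for raw.read):

import requests, sys

CHUNK = 1024 * 10
url, dest = sys.argv[1], sys.argv[2]

resp = requests.get(url, stream=True)
with open(dest, "wb") as wfp:
    # iter_content yields successive chunks until the body is
    # exhausted, so there is no probe request and no manual
    # length check.
    for chunk in resp.iter_content(CHUNK):
        wfp.write(chunk)
resp.close()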
Oh, also the output code is not safe: you should write to a temporary file and rename it into place.
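In sketch form, assuming a helper named write_atomically (an illustrative name, not an existing API):

import os, tempfile

def write_atomically(dest, chunks):
    # Create the temp file in the destination directory so the final
    # rename stays on one filesystem and is atomic on POSIX.
    fd, tmp = tempfile.mkstemp(dir=os.path.dirname(dest) or ".")
    try:
        with os.fdopen(fd, "wb") as wfp:
            for chunk in chunks:
                wfp.write(chunk)
        os.rename(tmp, dest)
    except BaseException:
        os.unlink(tmp)
        raise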
I've added some print statements:
cat t.py
import requests, sys

BUFLEN = 1024 * 10
if len(sys.argv) > 3:
    BUFLEN = int(sys.argv[3])

print "probing"
cq = requests.get(sys.argv[1], stream=True)
cq.raw.read(1024)
cq.close()

print "getting"
req = requests.get(sys.argv[1], stream=True)
with open(sys.argv[2], "wb") as wfp:
    while True:
        print "read"
        buf = req.raw.read(BUFLEN)
        print "write"
        wfp.write(buf)
        if len(buf) != BUFLEN:
            break
req.close()
Output:
python ./t.py $url out.img
probing
getting
read
write
read
write
read
write
read
write
read
write
read
write
read
write
read
write
read
write
read
write
read
write
read
write
read
^CTraceback (most recent call last):
File "./t.py", line 18, in
@rbtcollins Just to confirm, are you testing with a proxy?
So it seems like requests and/or urllib3 isn't actually closing the connection on close(), which may be by design. The same problem occurs with urllib2 if you do not close the urllib2.urlopen handle. I opened a squid bug at http://bugs.squid-cache.org/show_bug.cgi?id=3949 .
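For comparison, the urllib2 shape of the same probe; a sketch with a placeholder URL, showing the close() whose absence keeps the proxy-side request alive:

import urllib2

url = "http://example.com/disk.img"  # placeholder URL
fp = urllib2.urlopen(url)
fp.read(1024)
# Without this close(), the handle keeps the proxy-side request
# alive, and a second urlopen of the same URL through squid stalls.
fp.close()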
@Lukasa I misinterpreted, and yes, it was hanging.
The cause is, as smoser says, a squid bug: it triggers when there is a concurrent request for one URL while the first reader is stalled, neither closing nor reading.
This was caused because

cq = requests.get(sys.argv[1], stream=True)
cq.raw.read(1024)
cq.close()

doesn't actually close the socket for cq until you do 'del cq'. That's a bug IMO.
@smoser Good spot. We very deliberately do not close the connection. We provide connection-pooling via urllib3, because opening a new connection each time is nuts. =D
This is absolutely not a bug in requests. As @Lukasa rightly states, this is a very intentional design decision.
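That said, a caller that really wants the socket torn down after a single request can opt out of keep-alive itself; a hedged sketch with a placeholder URL, not something requests needs to change:

import requests

url = "http://example.com/disk.img"  # placeholder URL
# A 'Connection: close' request header opts this single request out
# of keep-alive, so the pooled socket is not held open afterwards.
resp = requests.get(url, stream=True,
                    headers={"Connection": "close"})
resp.raw.read(1024)
resp.close()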
My reading of this issue is that we've resolved it as an awkward behaviour in squid. I'm closing this issue unless anyone disagrees.
We originally hit this under Ubuntu bug http://pad.lv/1240652 . Running 'go-requests' (see below) results in a hang and eventually a timeout. The code does an open, then a close, then an open again of the same URL, which seems to trigger the bug. Clearly that is a silly thing to do, but it should not result in a hang.
I'm not certain if this is really a bug in urllib3 or in requests' use of it.