dennisfire / supercrawler

Automatically exported from code.google.com/p/supercrawler

HTTP Request timeout setup #4

Open GoogleCodeExporter opened 8 years ago

GoogleCodeExporter commented 8 years ago
I just found that Python's urllib2 API does not support a timeout when
reading from an HTTPResponse. I dug into the Python library code and found
two possible solutions:

1. Modify the Python library code. This is the easiest way, but it means
changing the standard library. Currently the library reads 8K at a time and
blocks indefinitely on each read. I can set a timeout on that read. For
example, with a timeout of 2s, each 8K read must finish within 2s, which
means the download speed must be at least 4K/s.

2. Implement another HTTP handler.
This is somewhat more complex, but doable.
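
To make the per-read limit concrete, here is a minimal sketch in Python 3 (not the urllib2 code itself). It reads any file-like object in 8K chunks, times each read, and gives up when one read exceeds the limit. The names `read_with_timeout`, `ReadTooSlow` and `min_throughput` are hypothetical, made up for illustration. The check runs only after a read returns, so it cannot interrupt a read that blocks forever; a real socket-level timeout would still be needed for that.

```python
import io
import time

CHUNK_SIZE = 8 * 1024  # the 8K read size mentioned in this issue


class ReadTooSlow(Exception):
    """Hypothetical error: one chunk read took longer than allowed."""


def read_with_timeout(response, per_read_timeout=2.0,
                      chunk_size=CHUNK_SIZE, clock=time.monotonic):
    """Read a file-like object chunk by chunk, timing every read.

    Assumption: `response` only needs a blocking read(n) method.
    The elapsed time is checked after each read returns, so this
    detects slow reads but does not interrupt a hung one.
    """
    chunks = []
    while True:
        started = clock()
        chunk = response.read(chunk_size)
        elapsed = clock() - started
        if elapsed > per_read_timeout:
            raise ReadTooSlow(
                "read took %.2fs, limit is %.2fs" % (elapsed, per_read_timeout)
            )
        if not chunk:  # empty bytes means end of stream
            break
        chunks.append(chunk)
    return b"".join(chunks)


def min_throughput(chunk_size=CHUNK_SIZE, per_read_timeout=2.0):
    """Lowest speed, in bytes per second, at which a full chunk fits the limit."""
    return chunk_size / per_read_timeout


if __name__ == "__main__":
    body = read_with_timeout(io.BytesIO(b"x" * 20000))
    print(len(body))
    print(min_throughput())
```

With the numbers from option 1 (8K chunks, 2s limit), `min_throughput()` gives 4096 bytes per second, which is the 4K/s figure above.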

Original issue reported on code.google.com by zhangyunqiao@gmail.com on 5 Jan 2009 at 2:03