lnln1111 / gappproxy

Automatically exported from code.google.com/p/gappproxy
GNU General Public License v3.0

Unable to handle HTTP redirect responses such as 302 #3

Closed GoogleCodeExporter closed 9 years ago

GoogleCodeExporter commented 9 years ago
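HTTP redirect responses such as 302 cannot be handled. The reason is that, starting with version 1.1 (the version currently running), GAE's urlfetch no longer passes redirect responses back to the caller, and so far there is no way to work around this.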

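I think this could be handled by checking the HTTP status returned by the target: when a 301 or 302 status code is detected, read the location key in the headers to get the URL to redirect to, then fetch that URL. Below is some code I wrote a while ago; I hope it helps.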

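import httplib
import socket
import urlparse


class BotError(Exception):
    '''Raised when the url cannot be fetched or analyzed.'''


def do_analyze_header(u='https://www.x-ways.net/winhex.zip', max_redirects=5):
    '''
    in this version only check http status code redirect, like 302, 301
    @param u: http or https url
    @param max_redirects: redirects still allowed before giving up
    @return: (response, final url)
    '''
    parts = urlparse.urlsplit(u)
    protocol = parts.scheme.lower()
    # Derive the host and the request path (including any query string).
    _i_host = parts.netloc
    _i_path = parts.path or '/'
    if parts.query:
        _i_path += '?' + parts.query
    if protocol == 'http':
        conn = httplib.HTTPConnection(_i_host)
    elif protocol == 'https':
        conn = httplib.HTTPSConnection(_i_host)
    else:
        raise BotError('Unknown protocol')
    my_request_header = {'User-Agent': 'Mozilla/5.0'}
    try:
        conn.request('GET', _i_path, headers=my_request_header)
        r1 = conn.getresponse()
    except socket.error:
        raise BotError('Connect Error')
    if r1.status == 200:
        conn.close()
        return r1, u
    elif r1.status in (301, 302):
        ru = r1.getheader('location')
        conn.close()
        if ru is None or max_redirects <= 0:
            raise BotError('Missing Location header or too many redirects')
        # Location may be relative, so resolve it against the current url.
        return do_analyze_header(urlparse.urljoin(u, ru), max_redirects - 1)
    else:
        conn.close()
        raise BotError('Read http header error')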

Original issue reported on code.google.com by fla.sam on 1 Jul 2008 at 3:51

GoogleCodeExporter commented 9 years ago
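Thanks, but httplib/urllib cannot be used in the GAE runtime environment. Only the urlfetch API provided by GAE is available, and that API function does not pass back 302 and similar responses; it does not even pass back the Location of the redirect.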

Original comment by dug...@188.com on 2 Jul 2008 at 4:29

GoogleCodeExporter commented 9 years ago
http://code.google.com/appengine/docs/urlfetch/responseobjects.html
The response object described on this page gives access to the returned status code and headers:
status_code
    The HTTP status code.
headers
    The HTTP response headers, as a mapping of names to values.
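
As a minimal sketch of the check proposed here: given any object that exposes `status_code` and `headers` as quoted above, a redirect could be detected like this. The helper `redirect_target` and the `FakeResponse` class are hypothetical names used only for illustration, not part of the GAE API, and, as the replies below point out, the check only helps if the redirect response actually reaches the caller.

```python
try:
    from urllib.parse import urljoin  # Python 3
except ImportError:
    from urlparse import urljoin  # Python 2


def redirect_target(response, request_url):
    """Return the absolute redirect URL, or None if this is not a redirect.

    `response` is any object with `status_code` and `headers` attributes.
    """
    if response.status_code not in (301, 302):
        return None
    # Header names are case-insensitive, so do not rely on exact casing.
    for name, value in response.headers.items():
        if name.lower() == 'location':
            # Location may be relative; resolve it against the request URL.
            return urljoin(request_url, value)
    return None


class FakeResponse(object):
    """Hypothetical stand-in for a response object (assumption, not GAE)."""

    def __init__(self, status_code, headers):
        self.status_code = status_code
        self.headers = headers


moved = FakeResponse(302, {'location': '/winhex.zip'})
print(redirect_target(moved, 'https://www.x-ways.net/old'))
```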

Original comment by fla.sam on 2 Jul 2008 at 4:00

GoogleCodeExporter commented 9 years ago
Flip back one page and you will see it :)

http://code.google.com/appengine/docs/urlfetch/overview.html

Especially this sentence: fetch() follows HTTP redirects up to 5 times, and returns the final resource.
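
To make the consequence of that sentence concrete, here is a small simulation rather than GAE code. `fetch_once` is a hypothetical single-hop fetcher, `Response` and `SITE` are made-up stand-ins, and `fetch_following_redirects` only imitates the quoted behaviour (follow up to 5 redirects, return the final resource). What the real API does when the limit is exceeded is not stated in the quoted sentence; the sketch simply stops and returns the last response it has.

```python
from collections import namedtuple

# Hypothetical response type; it models only status_code and headers.
Response = namedtuple('Response', ['status_code', 'headers'])

REDIRECT_CODES = (301, 302)
MAX_REDIRECTS = 5  # the limit quoted from the urlfetch overview


def fetch_following_redirects(fetch_once, url, max_redirects=MAX_REDIRECTS):
    """Follow redirects internally and hand back only the final response."""
    response = fetch_once(url)
    hops = 0
    while response.status_code in REDIRECT_CODES and hops < max_redirects:
        location = response.headers.get('Location')
        if not location:
            break  # malformed redirect: nothing to follow
        # For simplicity the sketch assumes absolute Location values.
        response = fetch_once(location)
        hops += 1
    return response


# Made-up two-hop site: /a redirects to /b, which redirects to /c.
SITE = {
    'http://example.com/a': Response(302, {'Location': 'http://example.com/b'}),
    'http://example.com/b': Response(301, {'Location': 'http://example.com/c'}),
    'http://example.com/c': Response(200, {}),
}

final = fetch_following_redirects(SITE.__getitem__, 'http://example.com/a')
print(final.status_code)  # the caller sees 200, never the 302 or 301
```

Because the intermediate responses are consumed inside the loop, the caller has no status code or Location header to forward to the browser, which is the limitation this issue describes.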

Original comment by dug...@188.com on 3 Jul 2008 at 9:44

GoogleCodeExporter commented 9 years ago
status_code never contains redirect responses such as 302; the overview already makes this clear.

Original comment by dug...@188.com on 3 Jul 2008 at 9:46

GoogleCodeExporter commented 9 years ago
http://groups.google.com/group/gappproxy/browse_thread/thread/2dffd6b1f23cf786

Original comment by lovelywcm on 14 Jan 2009 at 6:08