lzjun567 / python_scripts

Some Python-related demo code
Apache License 2.0
717 stars 528 forks

list index out of range #31

Open hanyunxuan opened 6 years ago

hanyunxuan commented 6 years ago

Traceback (most recent call last):
  File "crawler.py", line 163, in <module>
    crawler.run()
  File "crawler.py", line 90, in run
    for index, url in enumerate(self.parse_menu(self.request(self.start_url))):
  File "crawler.py", line 116, in parse_menu
    menu_tag = soup.find_all(class_="uk-nav uk-nav-side")[1]

wzming commented 6 years ago

I hit the same out-of-range error:

Traceback (most recent call last):
  File "crawler.py", line 163, in <module>
    crawler.run()
  File "crawler.py", line 90, in run
    for index, url in enumerate(self.parse_menu(self.request(self.start_url))):
  File "crawler.py", line 116, in parse_menu
    menu_tag = soup.find_all(class_="uk-nav uk-nav-side")[1]
IndexError: list index out of range

daolanfler commented 6 years ago

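Set a breakpoint at `return response` in the request function. At that point response.content is ...503 Service Temporarily Unavailable..., which suggests the site is receiving too much traffic, so the list returned by find_all is empty. That is my understanding, anyway. However, after I downloaded the source locally, soup.find_all(class_="uk-nav uk-nav-side")[1] still raised the error, and that part I don't understand.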
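One way to make this failure easier to diagnose is to check the response status before parsing, and to check how many elements matched before indexing. The following is a minimal sketch, not the code in crawler.py: `ensure_ok` and `pick_match` are hypothetical helper names, and it assumes the response exposes a `status_code` attribute, as `requests` responses do.

```python
def ensure_ok(status_code, url):
    """Raise a descriptive error instead of parsing an error page."""
    if status_code != 200:
        raise RuntimeError(
            f"GET {url} returned HTTP {status_code}; not parsing the body"
        )


def pick_match(matches, index):
    """Return matches[index], or raise a readable error if too few matched."""
    if len(matches) <= index:
        raise LookupError(
            f"expected at least {index + 1} matching elements, "
            f"found {len(matches)}"
        )
    return matches[index]


# Hypothetical use inside parse_menu:
#   menu_tag = pick_match(soup.find_all(class_="uk-nav uk-nav-side"), 1)
```

With these guards, a 503 page would surface as an explicit HTTP error rather than as an IndexError further down the call chain.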

afetmin commented 6 years ago

Mr. Liao's website has anti-scraping measures; send too many requests and it responds with a 503.
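If the 503 is caused by request volume, slowing down and retrying may help. The sketch below is one possible approach under that assumption, not something the crawler currently does: `fetch_with_backoff` is a hypothetical helper, and `fetch` stands for any callable (for example `requests.get`) that returns an object with a `status_code` attribute.

```python
import time


def fetch_with_backoff(fetch, url, retries=3, base_delay=1.0, sleep=time.sleep):
    """Call fetch(url); on HTTP 503, wait and retry with a doubling delay."""
    delay = base_delay
    for attempt in range(retries + 1):
        response = fetch(url)
        if response.status_code != 503:
            return response
        if attempt < retries:
            sleep(delay)  # back off before the next attempt
            delay *= 2
    # Still 503 after all retries; the caller decides what to do.
    return response
```

The `sleep` parameter is injectable only so the delay can be replaced in tests; in normal use the default `time.sleep` applies.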

fw6669998 commented 5 years ago

Mr. Liao's website has anti-scraping measures; send too many requests and it responds with a 503.

Adding a request header where the request is sent fixes it:

headers = {
    'User-Agent': 'Mozilla/5.0 (Windows NT 10.0; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/70.0.3538.25 Safari/537.36 Core/1.70.3704.400 QQBrowser/10.4.3588.400'
}
response = requests.get(url, headers=headers, **kwargs)
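One caveat with passing both `headers=headers` and `**kwargs`: if a caller already supplies `headers` inside `kwargs`, Python raises a TypeError for the duplicate keyword argument. A minimal sketch of merging the two instead, where `with_default_headers` is a hypothetical helper and the User-Agent string is a shortened placeholder:

```python
# Placeholder value; any realistic browser User-Agent string could go here.
DEFAULT_HEADERS = {"User-Agent": "Mozilla/5.0 (Windows NT 10.0; WOW64)"}


def with_default_headers(kwargs):
    """Return a copy of kwargs whose 'headers' include the defaults.

    Headers supplied by the caller take precedence over the defaults.
    """
    merged = dict(kwargs)
    merged["headers"] = {**DEFAULT_HEADERS, **(kwargs.get("headers") or {})}
    return merged


# Hypothetical use in the request function:
#   response = requests.get(url, **with_default_headers(kwargs))
```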