dataabc / weibo-crawler

A Sina Weibo crawler: uses Python to crawl Sina Weibo data and download Weibo images and videos
3.33k stars 744 forks

Hello, a few Weibo posts raise errors during crawling. How can I fix this? #328

Open 12414123123123 opened 1 year ago

12414123123123 commented 1 year ago

Error 1: list index out of range
Traceback (most recent call last):
  File "D:\demo\weibo-crawler\weibo.py", line 1075, in get_one_page
    w = w.get("card_group",[0])[0] or w
IndexError: list index out of range

Error 2: string indices must be integers
Traceback (most recent call last):
  File "D:\demo\weibo-crawler\weibo.py", line 833, in get_one_weibo
    weibo = self.parse_weibo(weibo_info)
  File "D:\demo\weibo-crawler\weibo.py", line 732, in parse_weibo
    weibo["pics"] = self.get_pics(weibo_info)
  File "D:\demo\weibo-crawler\weibo.py", line 414, in get_pics
    pic_list = [pic["large"]["url"] for pic in pic_info]
  File "D:\demo\weibo-crawler\weibo.py", line 414, in <listcomp>
    pic_list = [pic["large"]["url"] for pic in pic_info]
TypeError: string indices must be integers
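For context on Error 2: Python raises this TypeError when a string is indexed with a string key. In the line from the traceback, that would happen if `pic_info` were a dict (iterating a dict yields its string keys) or if some entries were not dicts. The thread does not confirm which shape the response had, so the following is only a minimal sketch of a tolerant extraction. The function name `extract_large_pic_urls` is hypothetical and is not part of weibo.py.

```python
def extract_large_pic_urls(pic_info):
    """Collect pic["large"]["url"] values, skipping entries of unexpected shape.

    Assumption: pic_info is either a list of dicts or a dict whose values
    are such dicts. Any other shape yields an empty list.
    """
    if isinstance(pic_info, dict):
        candidates = list(pic_info.values())
    elif isinstance(pic_info, list):
        candidates = pic_info
    else:
        candidates = []

    urls = []
    for pic in candidates:
        if not isinstance(pic, dict):
            continue  # e.g. a bare string, which would trigger the TypeError
        large = pic.get("large")
        if isinstance(large, dict) and isinstance(large.get("url"), str):
            urls.append(large["url"])
    return urls
```

Silently skipping malformed entries keeps the crawl running, at the cost of hiding cases where the response is an error page caused by an invalid cookie, so logging the skipped entries may be worthwhile.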

dataabc commented 1 year ago

Did you forget to add a cookie, or has the cookie expired?

12414123123123 commented 1 year ago

> Did you forget to add a cookie, or has the cookie expired?

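Thank you! I just checked the cookie and found that it is no longer valid. I only added this cookie yesterday. Did it become invalid because I logged out of the account it belongs to?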

dataabc commented 1 year ago

I'm not sure about that either.

12414123123123 commented 1 year ago

> I'm not sure about that either.

After updating the cookie, I tried several different target accounts, and a few posts still raise errors, for example https://weibo.com/2616380702/Ljh9XsRJl and https://weibo.com/7556183438/M9u7BtfTj:
list index out of range
Traceback (most recent call last):
  File "D:\demo\weibo-crawler\weibo.py", line 1075, in get_one_page
    w = w.get("card_group",[0])[0] or w
IndexError: list index out of range
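A note on the expression in this traceback: the default `[0]` passed to `dict.get` is used only when the `"card_group"` key is missing. If the key is present but holds an empty list, `[][0]` raises exactly this IndexError. Whether that is what happened for these posts is not confirmed in the thread. The sketch below shows one way to keep the same fallback behavior while tolerating an empty list. The helper name `first_card_or_self` is hypothetical and is not part of weibo.py.

```python
def first_card_or_self(w):
    """Return the first item of w["card_group"] when it is a non-empty list.

    Falls back to w itself when the key is missing, the list is empty,
    or the first item is falsy, mirroring `w.get("card_group", [0])[0] or w`
    without raising IndexError on an empty list.
    """
    group = w.get("card_group")
    if isinstance(group, list) and group:
        return group[0] or w
    return w
```

With this helper, the line in `get_one_page` could read `w = first_card_or_self(w)`, assuming `w` is always a dict at that point.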

dataabc commented 1 year ago

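This part of the code was contributed by a community member and is used to check the cookie. If it raises an error, the cookie may be invalid. I can't easily debug it right now, so I'm not sure whether the problem is a bug in the code or an invalid cookie. If it's convenient, you could try your current cookie with the weibospider project. If that runs correctly, the cookie is fine.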

12414123123123 commented 1 year ago

Thanks for the reply. I switched to spider and it ran without any problems.