dataabc / weibo-crawler

A Sina Weibo crawler: uses Python to crawl Sina Weibo data and download Weibo images and videos

It has been reporting "banned" for a long time. #238

Closed: uuair closed this issue 2 years ago

uuair commented 2 years ago

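System: Ubuntu 21.10. Python: 3.9.7. weibo-crawler behaves the same with or without a cookie. Running python weibo.py immediately reports 被ban了 ("banned"). I run it from crontab about once a week, and it has been like this for two months. Log file: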

2021-11-01 02:00:03,518 - INFO - 被ban了
2021-10-30 17:28:12,540 - INFO - 被ban了
2021-10-30 17:28:35,874 - INFO - 被ban了
2021-10-30 17:32:02,933 - INFO - 被ban了
2021-10-30 17:32:31,563 - INFO - 被ban了
2021-10-30 17:32:36,264 - INFO - 被ban了

config.json is as follows:

{
    "user_id_list": ["user_id_list.txt"],
    "filter": 0,
    "remove_html_tag": 1,
    "since_date": "2021-05-01",
    "start_page": 1,
    "write_mode": ["csv"],
    "original_pic_download": 1,
    "retweet_pic_download": 0,
    "original_video_download": 1,
    "retweet_video_download": 0,
    "download_comment":1,
    "comment_max_download_count":100,
    "result_dir_name": 0,
    "cookie": "",
    "mysql_config": {
        "host": "localhost",
        "port": 3306,
        "user": "root",
        "password": "123456",
        "charset": "utf8mb4"
    }
}

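I have already changed my IP address, and it still reports that I am banned. Adding a cookie makes no difference. I also tried the weibo_spider program, which reports: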

list index out of range
Traceback (most recent call last):
  File "/home/weiboSpider/weibo_spider/parser/info_parser.py", line 39, in extract_user_info
    if self.selector.xpath(
IndexError: list index out of range
'NoneType' object has no attribute 'id'
Traceback (most recent call last):
  File "/home/weiboSpider/weibo_spider/parser/index_parser.py", line 36, in get_user
    self.user.id = user_id
AttributeError: 'NoneType' object has no attribute 'id'
None
****************************************************************************************************
'NoneType' object has no attribute 'nickname'
Traceback (most recent call last):
  File "/home/weiboSpider/weibo_spider/spider.py", line 226, in _get_filepath
    dir_name = self.user.nickname
AttributeError: 'NoneType' object has no attribute 'nickname'
expected str, bytes or os.PathLike object, not NoneType
Traceback (most recent call last):
  File "/home/weiboSpider/weibo_spider/writer/csv_writer.py", line 25, in __init__
    with open(self.file_path, 'a', encoding='utf-8-sig',
TypeError: expected str, bytes or os.PathLike object, not NoneType
'NoneType' object has no attribute 'nickname'
Traceback (most recent call last):
  File "/home/weiboSpider/weibo_spider/spider.py", line 226, in _get_filepath
    dir_name = self.user.nickname
AttributeError: 'NoneType' object has no attribute 'nickname'
'NoneType' object has no attribute 'nickname'
Traceback (most recent call last):
  File "/home/weiboSpider/weibo_spider/spider.py", line 226, in _get_filepath
    dir_name = self.user.nickname
AttributeError: 'NoneType' object has no attribute 'nickname'
'NoneType' object has no attribute 'nickname'
Traceback (most recent call last):
  File "/home/weiboSpider/weibo_spider/spider.py", line 226, in _get_filepath
    dir_name = self.user.nickname
AttributeError: 'NoneType' object has no attribute 'nickname'
'NoneType' object has no attribute 'nickname'
Traceback (most recent call last):
  File "/home/weiboSpider/weibo_spider/spider.py", line 226, in _get_filepath
    dir_name = self.user.nickname
AttributeError: 'NoneType' object has no attribute 'nickname'
'NoneType' object has no attribute '__dict__'
Traceback (most recent call last):
  File "/home/weiboSpider/weibo_spider/spider.py", line 313, in get_one_user
    self.write_user(self.user)
  File "/home/weiboSpider/weibo_spider/spider.py", line 137, in write_user
    writer.write_user(user)
  File "/home/weiboSpider/weibo_spider/writer/txt_writer.py", line 29, in write_user
    [v + ':' + str(self.user.__dict__[k]) for k, v in self.user_desc])
  File "/home/weiboSpider/weibo_spider/writer/txt_writer.py", line 29, in <listcomp>
    [v + ':' + str(self.user.__dict__[k]) for k, v in self.user_desc])
AttributeError: 'NoneType' object has no attribute '__dict__'
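
The tracebacks above appear to share one root cause: the first IndexError leaves the user object as None, and every later step that touches self.user then fails in turn. Below is a minimal sketch of a guard that stops once with a clear message instead. The names User, parse_user and crawl_one_user are hypothetical stand-ins for illustration and are not weiboSpider's actual API.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class User:
    """Hypothetical stand-in for the crawler's user record."""
    id: str
    nickname: str


def parse_user(fields: List[str]) -> Optional[User]:
    """Return a User, or None when the expected fields are missing.

    An empty list stands in for an XPath query that matched nothing,
    for example because the site returned a login or error page.
    """
    if len(fields) < 2:
        return None
    return User(id=fields[0], nickname=fields[1])


def crawl_one_user(fields: List[str]) -> str:
    """Check for None once, before any writer touches the user."""
    user = parse_user(fields)
    if user is None:
        return "skipped: user info could not be parsed"
    return f"ok: {user.nickname}"
```

With a check like this, an unparseable profile page would produce a single message rather than a chain of AttributeError traces.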
uuair commented 2 years ago

"user_id_list": "user_id_list.txt"

I have changed it to this form (a plain string instead of a list).
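
The two forms differ if the crawler treats a list as literal user IDs and a string as the path to a file of IDs. The sketch below illustrates that distinction under this assumption. The helper load_user_ids is hypothetical and is not taken from weibo-crawler's source.

```python
from typing import List, Union


def load_user_ids(value: Union[str, List[str]]) -> List[str]:
    """Resolve a user_id_list config value into a list of user IDs.

    Assumption: a list holds IDs directly, while a string is the path
    to a text file where the first token on each line is an ID.
    """
    if isinstance(value, list):
        return [str(v) for v in value]
    ids = []
    with open(value, encoding="utf-8") as f:
        for line in f:
            tokens = line.split()
            if tokens:  # ignore blank lines
                ids.append(tokens[0])
    return ids
```

Under this reading, ["user_id_list.txt"] would be taken as one user whose ID is the literal text user_id_list.txt, while "user_id_list.txt" would be opened and read as a file.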

dataabc commented 2 years ago

A ban usually takes a few days to be lifted. Check again in a few days, or switch to a different machine and a different account.
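
For a crontab job, one way to follow the "wait a few days" advice is to skip runs until some time has passed since the last ban message in the log. This is a minimal sketch, assuming the log line format shown earlier in this thread. The three-day cooldown and the function names last_ban_time and should_run are illustrative assumptions, not part of weibo-crawler.

```python
from datetime import datetime, timedelta
from typing import Iterable, Optional

BAN_MARKER = "被ban了"  # message seen in the log excerpt in this thread


def last_ban_time(log_lines: Iterable[str]) -> Optional[datetime]:
    """Return the most recent timestamp of a ban message, if any."""
    latest = None
    for line in log_lines:
        if BAN_MARKER not in line:
            continue
        # Lines look like: 2021-11-01 02:00:03,518 - INFO - 被ban了
        stamp = line.split(" - ", 1)[0].strip()
        try:
            when = datetime.strptime(stamp, "%Y-%m-%d %H:%M:%S,%f")
        except ValueError:
            continue  # skip lines that do not start with a timestamp
        if latest is None or when > latest:
            latest = when
    return latest


def should_run(log_lines: Iterable[str], now: datetime,
               cooldown: timedelta = timedelta(days=3)) -> bool:
    """Allow a run only if no ban was logged within the cooldown."""
    last = last_ban_time(log_lines)
    return last is None or now - last >= cooldown
```

A wrapper script could call should_run on the log file's lines with datetime.now() and exit early when it returns False, so that scheduled runs do not keep hitting the site during a ban.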