dataabc / weibo-search

Fetches Weibo search result data; the search can be either a Weibo keyword search or a Weibo topic search
1.7k stars 372 forks

Crawl stops partway through and cannot continue #250

Open gudugudu-debug opened 2 years ago

gudugudu-debug commented 2 years ago

This is what it shows:

git:2022-08-02 23:57:40 [scrapy.core.scraper] ERROR: Spider error processing <GET https://s.weibo.com/weibo?q=%23%E6%98%93%E7%83%8A%E5%8D%83%E7%8E%BA%E8%80%83%E7%BC%96%23&region=custom:65:40&typeall=1&suball=1&timescope=custom:2022-07-07-17:2022-07-07-18&page=1> (referer: https://s.weibo.com/weibo?q=%23%E6%98%93%E7%83%8A%E5%8D%83%E7%8E%BA%E8%80%83%E7%BC%96%23&region=custom:65:1000&typeall=1&suball=1&timescope=custom:2022-07-07-17:2022-07-07-18&page=1)
Traceback (most recent call last):
  File "D:\Python 3.10\lib\site-packages\scrapy\utils\defer.py", line 132, in iter_errback
    yield next(it)
  File "D:\Python 3.10\lib\site-packages\scrapy\utils\python.py", line 354, in __next__
    return next(self.data)
  File "D:\Python 3.10\lib\site-packages\scrapy\utils\python.py", line 354, in __next__
    return next(self.data)
  File "D:\Python 3.10\lib\site-packages\scrapy\core\spidermw.py", line 66, in _evaluate_iterable
    for r in iterable:
  File "D:\Python 3.10\lib\site-packages\scrapy\spidermiddlewares\offsite.py", line 29, in process_spider_output
    for x in result:
  File "D:\Python 3.10\lib\site-packages\scrapy\core\spidermw.py", line 66, in _evaluate_iterable
    for r in iterable:
  File "D:\Python 3.10\lib\site-packages\scrapy\spidermiddlewares\referer.py", line 342, in <genexpr>
    return (_set_referer(r) for r in result or ())
  File "D:\Python 3.10\lib\site-packages\scrapy\core\spidermw.py", line 66, in _evaluate_iterable
    for r in iterable:
  File "D:\Python 3.10\lib\site-packages\scrapy\spidermiddlewares\urllength.py", line 40, in <genexpr>
    return (r for r in result or () if _filter(r))
  File "D:\Python 3.10\lib\site-packages\scrapy\core\spidermw.py", line 66, in _evaluate_iterable
    for r in iterable:
  File "D:\Python 3.10\lib\site-packages\scrapy\spidermiddlewares\depth.py", line 58, in <genexpr>
    return (r for r in result or () if _filter(r))
  File "D:\Python 3.10\lib\site-packages\scrapy\core\spidermw.py", line 66, in _evaluate_iterable
    for r in iterable:
  File "E:\▒ĵ▒\Python\weibo\weibo-search\weibo\spiders\search.py", line 276, in parse_page
    for weibo in self.parse_weibo(response):
  File "E:\▒ĵ▒\Python\weibo\weibo-search\weibo\spiders\search.py", line 520, in parse_weibo
    print(weibo)
UnicodeEncodeError: 'gbk' codec can't encode character '\U0001f644' in position 367: illegal multibyte sequence

dataabc commented 2 years ago

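Thanks for the feedback. I can't easily debug right now, but I'll take a look when I have time. It looks like an encoding problem, and it may also be caused by the cmd tool.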
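For readers hitting the same traceback, here is a minimal sketch of what the error means and one possible workaround. It is not a fix taken from the project. It assumes that standard output is using the gbk codec, as the error message suggests, and that the failing `print(weibo)` call stays in place. The helper name `make_console_tolerant` is hypothetical and only used for illustration.

```python
import sys

# The emoji U+1F644 from the traceback has no representation in GBK,
# so a strict encode raises UnicodeEncodeError, just as print() did.
text = "weibo text \U0001f644"
try:
    text.encode("gbk")
    strict_failed = False
except UnicodeEncodeError:
    strict_failed = True

# With errors="replace", the unencodable character becomes "?" instead
# of raising, so the rest of the text still gets through.
replaced = text.encode("gbk", errors="replace")


def make_console_tolerant():
    """Hypothetical helper: make print() replace unencodable characters.

    TextIOWrapper.reconfigure() exists on Python 3.7+. The hasattr check
    covers the case where sys.stdout was swapped for another object.
    """
    if hasattr(sys.stdout, "reconfigure"):
        sys.stdout.reconfigure(errors="replace")
```

Calling `make_console_tolerant()` once before the spider starts printing would be one way to keep the crawl running, at the cost of showing `?` in place of emoji on the console. Another commonly used option is to set the `PYTHONIOENCODING` environment variable to `utf-8` before launching the crawl, provided the terminal can display UTF-8.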

cloudy-sfu commented 7 months ago

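In that directory on your E drive, the folder shown as ?j? has a problematic name. Your command line uses gbk, so this is most likely a Chinese-language Windows system; in that case, keep folder names limited to Chinese and English characters.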