Open Thisisnotgoingpublished opened 9 months ago
First, install the emoji package:
pip install emoji
Then add import emoji at the top of .\weibo\spiders\search.py.
Then comment out the second-to-last line, print(weibo), and replace it with the following:
text_to_demj = weibo.get('text', '')
clean_text = emoji.demojize(text_to_demj)
print(clean_text)
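Or something like that. I don't know how to write it so that it still produces the original output, since I can't program. I hope the author takes a look at this. As a beginner, I think all text should be handled consistently as either UTF-8 or GBK; the current half-and-half state isn't great.
That didn't work. I'm out of ideas; it still raises the error.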
PS H:\weibo-search-master\weibo> scrapy crawl search -s JOBDIR=crawls/search >> ./a.txt
2023-12-13 11:27:43 [scrapy.core.scraper] ERROR: Spider error processing <GET https://s.weibo.com/weibo?q=<redacted>&typeall=1&suball=1&timescope=custom:2023-12-12-0:2023-12-13-0&page=1> (referer: https://s.weibo.com/weibo?q=<redacted>&typeall=1&suball=1&timescope=custom:2023-12-11-0:2023-12-14-0)
Traceback (most recent call last):
  File "C:\Users\<redacted>\AppData\Local\Programs\Python\Python39\lib\site-packages\scrapy\utils\defer.py", line 279, in iter_errback
    yield next(it)
  File "C:\Users\<redacted>\AppData\Local\Programs\Python\Python39\lib\site-packages\scrapy\utils\python.py", line 350, in next
    return next(self.data)
  File "C:\Users\<redacted>\AppData\Local\Programs\Python\Python39\lib\site-packages\scrapy\utils\python.py", line 350, in next
    return next(self.data)
  File "C:\Users\<redacted>\AppData\Local\Programs\Python\Python39\lib\site-packages\scrapy\core\spidermw.py", line 106, in process_sync
    for r in iterable:
  File "C:\Users\<redacted>\AppData\Local\Programs\Python\Python39\lib\site-packages\scrapy\spidermiddlewares\offsite.py", line 28, in
# print(weibo)
try:
    print(str(weibo))
except UnicodeEncodeError as e:
    # Printing fails when the console encoding (e.g. gbk) cannot represent a character
    print("Error occurred while encoding:", e)
yield {'weibo': weibo, 'keyword': keyword}
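For anyone who wants to keep the original print(weibo) output instead of only logging the error, here is a minimal sketch of one possible approach. It escapes any character the output encoding cannot represent rather than letting print() raise. Note that safe_text is a hypothetical helper name and the sample dict is made up for illustration; neither comes from weibo-search.

```python
import sys


def safe_text(value, encoding=None):
    """Return str(value) with characters the target encoding cannot
    represent replaced by backslash escapes (e.g. \\U0001f525).

    `safe_text` is a hypothetical helper, not part of weibo-search.
    """
    # Fall back to UTF-8 when stdout has no usable encoding attribute.
    encoding = encoding or getattr(sys.stdout, "encoding", None) or "utf-8"
    raw = str(value).encode(encoding, errors="backslashreplace")
    return raw.decode(encoding)


# Made-up sample item containing the fire emoji from the error report.
weibo = {"text": "hot topic \U0001f525"}

# What the text would look like on a GBK console: the emoji is escaped.
gbk_view = safe_text(weibo, encoding="gbk")

# Safe to print on whatever encoding the current stdout uses.
print(safe_text(weibo))
```

In the spider this would mean replacing print(weibo) with print(safe_text(weibo)). The yielded item is untouched, so the stored data should still contain the original emoji.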
Thanks for the helpful feedback. I'm not able to debug this right now, but I'll look into it when I have time. Thanks.
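Runtime error: UnicodeEncodeError: 'gbk' codec can't encode character '\U0001f525' in position 400: illegal multibyte sequence
This is a Unicode encoding problem hit while processing the text: the 'gbk' codec cannot represent the character. '\U0001f525' is the 🔥 (fire) emoji.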
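The diagnosis above can be reproduced outside Scrapy. The following is a small sketch under stated assumptions: it checks that the character really cannot be encoded as GBK, then shows one possible workaround, switching a text stream to UTF-8 with TextIOWrapper.reconfigure (available since Python 3.7). The names can_encode and force_utf8 are hypothetical helpers, and an in-memory stream stands in for the real console.

```python
import io

FIRE = "\U0001f525"  # the character named in the error message


def can_encode(text, encoding):
    """Return True if `text` can be represented in `encoding`."""
    try:
        text.encode(encoding)
        return True
    except UnicodeEncodeError:
        return False


def force_utf8(stream):
    """Switch a text stream to UTF-8 if it supports reconfigure()."""
    if hasattr(stream, "reconfigure"):
        stream.reconfigure(encoding="utf-8")
    return stream


gbk_ok = can_encode(FIRE, "gbk")      # GBK has no mapping for this emoji
utf8_ok = can_encode(FIRE, "utf-8")   # UTF-8 can represent it

# An in-memory stream that starts out as GBK, standing in for stdout
# on a Chinese-locale Windows machine.
raw = io.BytesIO()
demo_stdout = io.TextIOWrapper(raw, encoding="gbk")
force_utf8(demo_stdout)
demo_stdout.write(FIRE)               # no UnicodeEncodeError after the switch
demo_stdout.flush()
written = raw.getvalue()

print("gbk:", gbk_ok, "utf-8:", utf8_ok, "bytes written:", written)
```

In the spider, the equivalent would presumably be calling force_utf8(sys.stdout) once near the top of search.py, before anything is printed. Python also documents the PYTHONIOENCODING environment variable for setting the standard stream encoding. Neither option has been tested against this project.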