puppylpg / oddish

Crawl CS:GO skin info from `buff.163.com` and Steam, then find the most suitable items to buy on the former and sell on the latter.
https://puppylpg.github.io/2019/12/07/python-crawler-buff-optimaze/
GNU General Public License v3.0
321 stars 80 forks

Encoding problem when creating the steam price history cache file on Windows #30

Closed beng3er closed 3 years ago

beng3er commented 3 years ago

Problem description

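Item names contain special characters. On Windows, a newly created file defaults to the gbk encoding, so writing the already-decoded Unicode text to it raises an error.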

Steps to reproduce

Error message

2020-10-21 11:59:59,532 [INFO ] GET steam history price 3/3566 for (折刀(★) | 森林 DDPAT (破损不堪)): https://steamcommunity.com/market/pricehistory/?country=CN&currency=1&appid=730&market_hash_name=%E2%98%85%20Navaja%20Knife%20%7C%20Forest%20DDPAT%20%28Well-Worn%29
2020-10-21 11:59:59,533 [INFO ] sleep 4s at 2020-10-21 11:59:59.533948
2020-10-21 12:00:04,269 [ERROR] Traceback (most recent call last):
  File "c:\Users\Lin\Desktop\oddish\src\crawl\history_price_crawler.py", line 43, in crawl_history_price
    crawl_item_history_price(index, item, total_price_number)
  File "c:\Users\Lin\Desktop\oddish\src\crawl\history_price_crawler.py", line 15, in crawl_item_history_price
    steam_history_prices = get_json_dict(steam_price_url, steam_cookies, True)
  File "c:\Users\Lin\Desktop\oddish\src\util\requester.py", line 64, in get_json_dict
    store(url,json_data)
  File "c:\Users\Lin\Desktop\oddish\src\util\cache.py", line 33, in store
    f.write(data)
UnicodeEncodeError: 'gbk' codec can't encode character '\xa5' in position 32: illegal multibyte sequence
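The failure at the end of the traceback can be reproduced on any platform without touching a file. This is only a minimal sketch: it encodes the single character named in the error ('\xa5', the yen sign) with the two codecs involved. The variable names are made up for illustration.

```python
# U+00A5 YEN SIGN, the character reported in the UnicodeEncodeError above.
yen = "\xa5"

# Python's gbk codec has no mapping for U+00A5, so encoding fails.
try:
    yen.encode("gbk")
    gbk_ok = True
except UnicodeEncodeError:
    gbk_ok = False

# UTF-8 can encode every Unicode code point, including this one.
utf8_bytes = yen.encode("utf-8")

print(gbk_ok, utf8_bytes)  # False b'\xc2\xa5'
```

A text-mode file opened without `encoding=` on a gbk-locale Windows machine performs the same gbk encoding step on every `f.write`, which is where the error in the log is raised.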

Changing line 32 of src\util\cache.py to f = open(os.path.join(cache_root,urlid), "w",encoding='utf-8') makes it work.
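As a rough sketch of that change in context: the function below is not the project's actual `store` (the traceback shows it being called as `store(url,json_data)`). Its signature, the `with` block, the temporary directory, and the sample payload are all assumptions made so the example runs on its own.

```python
import os
import tempfile


def store(cache_root: str, urlid: str, data: str) -> str:
    """Hypothetical stand-in for the cache writer: always writes UTF-8."""
    path = os.path.join(cache_root, urlid)
    # encoding='utf-8' is the fix from the issue; 'with' also closes the file.
    with open(path, "w", encoding="utf-8") as f:
        f.write(data)
    return path


# Made-up payload containing the character that gbk could not encode.
sample = '{"note": "\xa5"}'

with tempfile.TemporaryDirectory() as cache_root:
    path = store(cache_root, "example", sample)
    # Read back with the same explicit encoding so the round trip is exact.
    with open(path, encoding="utf-8") as f:
        restored = f.read()

print(restored == sample)  # True
```

Whatever code reads the cache back needs the same `encoding='utf-8'`, otherwise a gbk-locale machine would decode the UTF-8 bytes with the wrong codec.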

Related screenshots (preferably attach some)

Software information

Please complete the following information:

ccinv commented 3 years ago

Indeed, the tests were only ever run on Linux. I didn't expect Windows to have this kind of encoding problem.

puppylpg commented 3 years ago

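Indeed. File reads and writes should not rely on the system default encoding, since that introduces nondeterminism. Specify utf8 explicitly everywhere.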
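To illustrate where the nondeterminism comes from, here is a small probe, offered as a sketch rather than as part of the project: it opens a scratch file once without `encoding=` and once with it, and records which encoding each file object ends up using. The file name and variable names are invented for the example.

```python
import os
import tempfile

with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "probe.txt")

    # No encoding given: Python falls back to a locale-dependent default,
    # so this value differs between machines.
    with open(path, "w") as f:
        implicit_encoding = f.encoding

    # Encoding given: the same value on every machine.
    with open(path, "w", encoding="utf-8") as f:
        explicit_encoding = f.encoding

print(implicit_encoding, explicit_encoding)
```

On a Chinese-locale Windows machine the first value would typically be a gbk-family code page, while on most Linux setups it is UTF-8, which is consistent with the tests passing on Linux only.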