super-l / superl-url

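A program that takes a keyword and collects the URLs returned in search engine results. It can automatically gather the real addresses, titles, and other details of matching sites from multiple search engines, save them to a file, and remove duplicate URLs. It also supports a custom list of domains to ignore.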
http://www.msray.net/doc
618 stars, 145 forks

Error when running on Win10 with py3 #8

Closed Eisx11o closed 5 years ago

Eisx11o commented 5 years ago

please input keyword:资讯
Search Number of pages:20
[]Search Engine [baidu] starting!The number of display bars per page is 50
[]Search Engine [sougou] starting!The number of display bars per page is 50
[]Search Engine [so] starting!The number of display bars per page is 10
[]Search Engine [sougou],Page [1] Start collecting.
Process Process-2:
Traceback (most recent call last):
  File "C:\Python37\lib\multiprocessing\process.py", line 297, in _bootstrap
    self.run()
  File "C:\Python37\lib\multiprocessing\process.py", line 99, in run
    self._target(*self._args, **self._kwargs)
  File "C:\Users\eLisx\Desktop\superl-url\core\collect.py", line 58, in __init__
    self.collection()
  File "C:\Users\eLisx\Desktop\superl-url\core\collect.py", line 78, in collection
    my_sougou = Sougou(self.outfile)
  File "C:\Users\eLisx\Desktop\superl-url\module\sougou\sougou.py", line 37, in __init__
    Engine.__init__(self, search_name, outfile)
  File "C:\Users\eLisx\Desktop\superl-url\module\engine.py", line 37, in __init__
    self.filter = Filter()
  File "C:\Users\eLisx\Desktop\superl-url\core\filter.py", line 43, in __init__
    self.filterTitleList = self.get_filtertitle()
  File "C:\Users\eLisx\Desktop\superl-url\core\filter.py", line 89, in get_filtertitle
    file_context = file_object.read().decode("utf-8")
UnicodeDecodeError: 'gbk' codec can't decode byte 0xae in position 18: illegal multibyte sequence
[]Search Engine [baidu],Page [1] Start collecting.
Process Process-1:
Traceback (most recent call last):
  File "C:\Python37\lib\multiprocessing\process.py", line 297, in _bootstrap
    self.run()
  File "C:\Python37\lib\multiprocessing\process.py", line 99, in run
    self._target(*self._args, **self._kwargs)
  File "C:\Users\eLisx\Desktop\superl-url\core\collect.py", line 58, in __init__
    self.collection()
  File "C:\Users\eLisx\Desktop\superl-url\core\collect.py", line 70, in collection
    my_baidu = Baidu(self.outfile)
  File "C:\Users\eLisx\Desktop\superl-url\module\baidu\baidu.py", line 39, in __init__
    Engine.__init__(self, search_name, outfile)
  File "C:\Users\eLisx\Desktop\superl-url\module\engine.py", line 37, in __init__
    self.filter = Filter()
  File "C:\Users\eLisx\Desktop\superl-url\core\filter.py", line 43, in __init__
    self.filterTitleList = self.get_filtertitle()
  File "C:\Users\eLisx\Desktop\superl-url\core\filter.py", line 89, in get_filtertitle
    file_context = file_object.read().decode("utf-8")
UnicodeDecodeError: 'gbk' codec can't decode byte 0xae in position 18: illegal multibyte sequence
[]Search Engine [so],Page [1] Start collecting.
Process Process-3:
Traceback (most recent call last):
  File "C:\Python37\lib\multiprocessing\process.py", line 297, in _bootstrap
    self.run()
  File "C:\Python37\lib\multiprocessing\process.py", line 99, in run
    self._target(*self._args, **self._kwargs)
  File "C:\Users\eLisx\Desktop\superl-url\core\collect.py", line 58, in __init__
    self.collection()
  File "C:\Users\eLisx\Desktop\superl-url\core\collect.py", line 74, in collection
    my_so = So(self.outfile)
  File "C:\Users\eLisx\Desktop\superl-url\module\so\so.py", line 37, in __init__
    Engine.__init__(self, search_name, outfile)
  File "C:\Users\eLisx\Desktop\superl-url\module\engine.py", line 37, in __init__
    self.filter = Filter()
  File "C:\Users\eLisx\Desktop\superl-url\core\filter.py", line 43, in __init__
    self.filterTitleList = self.get_filtertitle()
  File "C:\Users\eLisx\Desktop\superl-url\core\filter.py", line 89, in get_filtertitle
    file_context = file_object.read().decode("utf-8")
UnicodeDecodeError: 'gbk' codec can't decode byte 0xae in position 18: illegal multibyte sequence
[]The url collection task is complete! runs in 14 seconds

ghost commented 5 years ago

Open the file in Notepad, choose Save As, and select ANSI in the encoding option at the bottom.
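
An alternative to re-saving the file might be to read it with an explicit encoding, so the result no longer depends on the Windows locale default (GBK in the traceback above). The sketch below is only an illustration, not the project's actual code: `read_lines_utf8` and `write_sample` are hypothetical names, and it assumes the filter file is stored as UTF-8. In Python 3, `read()` on a text-mode file already returns `str`, so the `.decode("utf-8")` call shown at line 89 of the traceback would not be needed.

```python
import os
import tempfile


def read_lines_utf8(path):
    """Return the non-empty, stripped lines of a UTF-8 text file.

    Hypothetical helper: passing encoding explicitly stops Python 3 from
    falling back to the locale default (GBK on Chinese-language Windows).
    "utf-8-sig" also drops a leading BOM if an editor wrote one.
    """
    with open(path, "r", encoding="utf-8-sig") as file_object:
        # read() returns str in text mode, so no .decode() call follows
        file_context = file_object.read()
    return [line.strip() for line in file_context.splitlines() if line.strip()]


def write_sample(text):
    """Write text to a temporary UTF-8 file and return its path (demo only)."""
    fd, path = tempfile.mkstemp(suffix=".txt")
    with os.fdopen(fd, "w", encoding="utf-8") as handle:
        handle.write(text)
    return path


if __name__ == "__main__":
    sample = write_sample("资讯\n新闻\n")
    try:
        print(read_lines_utf8(sample))
    finally:
        os.remove(sample)
```

Unlike the Notepad workaround, which makes the file match whatever the local ANSI code page happens to be, this approach keeps the file in one encoding across machines.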