Closed jzl543098871 closed 8 months ago
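page is an iterable object, so you can simply loop over it with for:
def search():
    # Site-wide search uses main_tag=0.
    # Search the first page.
    page: JmSearchPage = client.search_site(author, page=1)
    # Just iterate over the page with a for loop
    for aid, atitle, atags in page.iter_id_title_tag():
        print(aid, atitle, atags, sep=',')
    # Return the ids of every album on this page
    return list(page.iter_id())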
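If the goal is to look at the search results without downloading anything, one option is to keep the printing and the id collection in one helper and simply not hand its return value to the download call. A minimal sketch follows. `preview_page` and `FakePage` are hypothetical names; `FakePage` is only a stand-in so the snippet runs without network access, and the only library method assumed is `iter_id_title_tag()` as used in the snippet above.

```python
from typing import List


def preview_page(page, show: bool = True) -> List[str]:
    """Optionally print each search hit, then return the album ids.

    `page` is any object exposing iter_id_title_tag(), as in search() above.
    """
    ids = []
    for aid, atitle, atags in page.iter_id_title_tag():
        if show:
            print(aid, atitle, atags, sep=',')
        ids.append(aid)
    return ids


class FakePage:
    """Hypothetical stand-in for a search result page (offline testing only)."""

    def __init__(self, rows):
        self._rows = list(rows)

    def iter_id_title_tag(self):
        return iter(self._rows)


if __name__ == '__main__':
    page = FakePage([('1', 'title A', ['tag1']), ('2', 'title B', ['tag2', 'tag3'])])
    ids = preview_page(page)  # prints the hits
    print(ids)                # pass this list on only when you actually want to download
```

With a real page object, commenting out the download call and keeping only the `preview_page(...)` call should give a preview-only run.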
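Re "I also tried running the code with other keywords and got quite a few encoding errors": please send me the keywords that trigger the errors.
I asked a chatbot and it told me to put the following at the top of the script, but after adding it everything in the Run window below turned into garbled text: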
import sys
if sys.stdout.encoding != 'utf-8':
    sys.stdout = open(sys.stdout.fileno(), mode='w', encoding='utf-8', buffering=1)
Here is a small excerpt of the window; admittedly the error itself is gone:
python.exe C:\Users\子夜\PycharmProjects\JM\搜索作者.py
2023-12-03 17:34:56:銆恜lugin.invoke銆戣皟鐢ㄦ彃浠�: [login]
2023-12-03 17:34:56:銆恏tml銆慼ttps://18comic.vip/login
2023-12-03 17:34:56:銆恜lugin.login銆戠櫥褰曟垚鍔�
2023-12-03 17:34:56:銆恏tml銆慼ttps://18comic.vip/search/photos?main_tag=0&search_query=绋粯&page=1&o=mr&t=a
2023-12-03 17:34:58:銆恏tml銆慼ttps://18comic.vip/album/506940
2023-12-03 17:34:58:銆恏tml銆慼ttps://18comic.vip/album/431577
2023-12-03 17:34:58:銆恏tml銆慼ttps://18comic.vip/album/463953
2023-12-03 17:34:58:銆恆lbum.before銆戞湰瀛愯幏鍙栨垚鍔�: [444178], 浣滆��: [銇姐倠銇°兗銇玗, 绔犺妭鏁�: [1], 鎬婚〉鏁�: [47], 鏍囬: [[銇姐倠銇°兗銇玗鐖嗕钩鎸併仸浣欍仚娆叉眰涓嶆簚銇汉濡绘按娉炽偆銉炽偣銉堛儵銈偪銉笺仺鍗遍櫤鏃ョó浠樸亼銉堛儸銉笺儖銉炽偘[incomplete]], 鍏抽敭璇�: ['CG', '宸ㄤ钩', '浜哄', '涔充氦', '涓枃']
2023-12-03 17:34:58:銆恏tml銆慼ttps://18comic.vip/photo/444178
2023-12-03 17:34:58:銆恆lbum.before銆戞湰瀛愯幏鍙栨垚鍔�: [459602], 浣滆��: [銇°倱], 绔犺妭鏁�: [1], 鎬婚〉鏁�: [188], 鏍囬: [[銇°倱] 绋粯銇�! 銉椼儸銈� 銉椼儸銈� 銉椼儸銈� [DL鐗圿], 鍏抽敭璇�: ['宸ㄤ钩', '澶氭瘺', '鍌湢', '涓嚭', '寮锋毚', '閬庤啙瑗�', '鍑烘睏', '缇や氦', '鍠鏈�', '鏃ユ枃']
2023-12-03 17:34:58:銆恏tml銆慼ttps://18comic.vip/photo/459602
2023-12-03 17:34:59:銆恆lbum.before銆戞湰瀛愯幏鍙栨垚鍔�: [469447], 浣滆��: [鏈ㄩ埓銈偙銉玗, 绔犺妭鏁�: [1], 鎬婚〉鏁�: [26], 鏍囬: [鎾ó姝愬悏妗戠殑JK鐢熷皬瀛㏒EX [鏈ㄩ埓浜� (鏈ㄩ埓銈偙銉�)] 绋粯銇娿仒銇曘倱銇甁K瀛愪綔銈奡EX [涓浗缈昏ǔ] [鐒′慨姝 [DL鐗圿], 鍏抽敭璇�: ['鐒′慨姝�', '闃块粦椤�', '鎬у嫆绱�', '鍏斿コ閮�', '铏曞コ', '钘ョ墿', '涓嚭', '鍏ц。', '寮锋毚', '鏍℃湇', '閬庤啙瑗�', '闆欓Μ灏�', '閫忚', '涓枃']
2023-12-03 17:34:59:銆恏tml銆慼ttps://18comic.vip/photo/469447
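It looks roughly like this, and it just doesn't feel right.
Your console is probably using GBK while Python's standard output is now writing UTF-8. That is a problem with your local environment, so try adjusting the settings on your side.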
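For anyone who wants to see the mismatch in isolation, here is a small sketch (not part of the library; `view_utf8_as_gbk` is a hypothetical helper name). It encodes a log line as UTF-8 and then decodes those bytes as GBK, which is effectively what a GBK console does with UTF-8 output.

```python
def view_utf8_as_gbk(text: str) -> str:
    """Encode text as UTF-8, then decode the bytes as if they were GBK."""
    return text.encode('utf-8').decode('gbk', errors='replace')


if __name__ == '__main__':
    original = '【html】https://18comic.vip/login'
    garbled = view_utf8_as_gbk(original)
    print(original)
    print(garbled)  # resembles the garbled log lines in the excerpt above
```

The ASCII parts survive because they are encoded identically in both encodings; only the multi-byte characters get misread.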
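Re "please send me the keywords that trigger the errors": I am using exactly the code from the original post, just with the author name replaced by '種付'.
Without the import snippet mentioned above, the error is as follows: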
Exception in thread Thread-80 (<lambda>):
Traceback (most recent call last):
  File "C:\Python\lib\site-packages\jmcomic\api.py", line 48, in download_album
    dler.download_album(jm_album_id)
  File "C:\Python\lib\site-packages\jmcomic\jm_downloader.py", line 59, in download_album
    self.download_by_album_detail(album, client)
  File "C:\Python\lib\site-packages\jmcomic\jm_downloader.py", line 62, in download_by_album_detail
    self.before_album(album)
  File "C:\Python\lib\site-packages\jmcomic\jm_downloader.py", line 168, in before_album
    super().before_album(album)
  File "C:\Python\lib\site-packages\jmcomic\jm_downloader.py", line 8, in before_album
    jm_log('album.before',
  File "C:\Python\lib\site-packages\jmcomic\jm_config.py", line 279, in jm_log
    cls.executor_log(topic, msg)
  File "C:\Python\lib\site-packages\jmcomic\jm_config.py", line 6, in default_jm_logging
    print(f'{format_ts()}:【{topic}】{msg}')
UnicodeEncodeError: 'gbk' codec can't encode character '\u301c' in position 116: illegal multibyte sequence

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "C:\Python\lib\threading.py", line 1016, in _bootstrap_inner
    self.run()
  File "C:\Python\lib\threading.py", line 953, in run
    self._target(*self._args, **self._kwargs)
  File "C:\Python\lib\site-packages\jmcomic\api.py", line 29, in <lambda>
    apply_each_obj_func=lambda aid: download_api(aid, option, downloader),
  File "C:\Python\lib\site-packages\jmcomic\api.py", line 47, in download_album
    with new_downloader(option, downloader) as dler:
  File "C:\Python\lib\site-packages\jmcomic\jm_downloader.py", line 198, in __exit__
    jm_log('dler.exception',
  File "C:\Python\lib\site-packages\jmcomic\jm_config.py", line 279, in jm_log
    cls.executor_log(topic, msg)
  File "C:\Python\lib\site-packages\jmcomic\jm_config.py", line 6, in default_jm_logging
    print(f'{format_ts()}:【{topic}】{msg}')
UnicodeEncodeError: 'gbk' codec can't encode character '\u301c' in position 244: illegal multibyte sequence
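The failure in this traceback can be reproduced without the library at all: the character '\u301c' reported in the error has no GBK encoding, so a strict encode raises. The sketch below shows that, plus one possible workaround (replacing unencodable characters with '?'). `encode_for_console` is a hypothetical helper name, not something the library provides.

```python
def encode_for_console(text: str, encoding: str = 'gbk') -> bytes:
    """Encode text for a console, replacing unencodable characters with '?'."""
    return text.encode(encoding, errors='replace')


if __name__ == '__main__':
    title = 'A\u301cB'  # contains the character named in the traceback
    try:
        title.encode('gbk')  # strict mode, which is what print() used here
    except UnicodeEncodeError as exc:
        print('strict encoding fails:', exc.reason)
    print(encode_for_console(title))
```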
Re "your console is probably using GBK while Python's standard output is UTF-8": OK, this looks like a problem on my side, so I will debug it myself. Thanks for the help.
This is caused by Python's standard output stream using the wrong encoding. Try the following code and see what it prints:
import sys
print(sys.stdout.encoding)
If the output is not UTF-8, insert the following at the very top of your script:
import io
import sys
sys.stdout = io.TextIOWrapper(sys.stdout.buffer, encoding='utf-8')
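As an alternative sketch (my own suggestion, not something from the library), Python 3.7+ lets you adjust the existing stream with `reconfigure()` instead of replacing `sys.stdout`. Changing only the error handler keeps the console's own encoding, so output stays readable in a GBK console and unencodable characters are written as '?'. `make_stream_tolerant` is a hypothetical helper name.

```python
import sys


def make_stream_tolerant(stream, encoding=None):
    """Reconfigure a text stream so that writing never raises UnicodeEncodeError.

    With encoding=None the stream keeps its current encoding and only the
    error handler changes; unencodable characters are written as '?'.
    """
    if hasattr(stream, 'reconfigure'):  # io.TextIOWrapper, Python 3.7+
        if encoding is None:
            stream.reconfigure(errors='replace')
        else:
            stream.reconfigure(encoding=encoding, errors='replace')
    return stream


if __name__ == '__main__':
    make_stream_tolerant(sys.stdout)
    print('wave dash: \u301c')
```

Passing `encoding='utf-8'` instead forces UTF-8 output, which only displays correctly if the console is also set to UTF-8.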
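Console encoding
Thanks, it works now. One more question about the option settings: when using Clash as the proxy, what is wrong with the settings below? With Clash running and a proxy node selected, but the system proxy not enabled, I still cannot connect to 18comic.vip.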
client:
  cache: true
  domain: [ 18comic.vip ]
  impl: html
  postman:
    meta_data:
      headers: null
      impersonate: chrome110
      proxies: { clash }
client:
  domain: [ 18comic.vip ]
  impl: html
  postman:
    meta_data:
      proxies: clash # change this line
If your Clash has the system proxy enabled, the configuration can be simplified to:
client:
  domain: [ 18comic.vip ]
  impl: html
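When the system proxy is off, it can help to first confirm that the local proxy port is actually accepting connections before blaming the option file. The sketch below is a generic check and not part of the library. `proxy_port_open` is a hypothetical helper, and 127.0.0.1:7890 is only an assumed address; use whatever port your Clash instance is configured to listen on.

```python
import socket


def proxy_port_open(host: str = '127.0.0.1', port: int = 7890, timeout: float = 1.0) -> bool:
    """Return True if something accepts TCP connections on host:port."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


if __name__ == '__main__':
    # Assumed address; adjust to your own proxy settings.
    print(proxy_port_open())
```

A False result suggests the proxy is not listening where you expect; a True result only shows the port is open, not that the proxy can reach the site.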
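Got it, thanks for the detailed explanation. I just want to avoid turning on the system proxy. I normally don't use a global proxy either, otherwise I would have to set up separate rules for browsing NGA or Tieba.
Actually the author's work is already quite complete. Apart from looking up one specific album, everything is done by searching for an author name or a tag. It's just that the author's sample code dropped the if statement from the for loop of the tag search, which printed the search results before downloading. If I want that behaviour (mainly so I can comment out download and just look at the search results), do I add a for loop before download(search())?
I also tried running the code with other keywords and got quite a few encoding errors. It looks like a GBK vs. UTF-8 encoding problem. Google turns up lots of fixes for it, which left me unsure which one to write.
Here is one of the errors, picked at random: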