Open SharkPika opened 7 months ago
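This repo hasn't been maintained for four years, lol. If I had to maintain it, I'd probably rewrite it in Python (runs away).
Actually, I only found your repo because I was looking up how to write a crawler in Java.
Just switch to Python; the code would be at least half as long. As far as crawling goes, it's nothing more than building requests and downloading: build a request to get the image list -> iterate over the image list and download each one. If you dig through my source code you can find some Pixiv APIs I tracked down earlier, such as the following: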
https://www.pixiv.net/ajax/search/artworks/{kw}?word={kw}&mode=safe&p=1&type=all&lang=zh&s_mode=s_tag
https://www.pixiv.net/ajax/user/{uid}/profile/all?lang=zh
https://www.pixiv.net/ajax/user/{uid}/profile/illusts?ids[]={ids}&work_category=illustManga&is_first_page=1&lang=zh
https://www.pixiv.net/ajax/illust/{pid}/pages?lang=zh
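If you want to write a Pixiv crawler, these should come in handy. These APIs all return JSON; parse it and you'll find the data you want. Open the links below and it should be clear: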
https://www.pixiv.net/ajax/search/artworks/genshin?word=genshin&mode=r18&p=1&type=all&lang=zh&s_mode=s_type
https://www.pixiv.net/ajax/user/52542337/profile/all?lang=zh
https://www.pixiv.net/ajax/user/52542337/profile/illusts?ids[]=118201101&ids[]=117777980&work_category=illustManga&is_first_page=1&lang=zh
https://www.pixiv.net/ajax/illust/118201101/pages?lang=zh
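As a rough illustration of how the templates above could be filled in, here is a small Python sketch using only the standard library. The function names are made up for this example, and the query parameters are simply copied from the URLs in this thread; whether Pixiv still accepts them unchanged is not something this sketch verifies.

```python
from urllib.parse import quote, urlencode

BASE = "https://www.pixiv.net/ajax"


def search_url(kw, page=1, mode="safe"):
    """Search endpoint: the keyword appears both in the path and in `word`."""
    query = urlencode({
        "word": kw, "mode": mode, "p": page,
        "type": "all", "lang": "zh", "s_mode": "s_tag",
    })
    # quote() percent-encodes non-ASCII keywords for the path segment
    return f"{BASE}/search/artworks/{quote(kw, safe='')}?{query}"


def profile_all_url(uid):
    """All work IDs of one artist."""
    return f"{BASE}/user/{uid}/profile/all?lang=zh"


def profile_illusts_url(uid, ids):
    """Details for a batch of works; `ids[]` is repeated once per ID."""
    params = [("ids[]", i) for i in ids] + [
        ("work_category", "illustManga"),
        ("is_first_page", 1),
        ("lang", "zh"),
    ]
    # safe="[]" keeps the brackets literal, as in the URLs quoted above
    return f"{BASE}/user/{uid}/profile/illusts?{urlencode(params, safe='[]')}"


def pages_url(pid):
    """Per-page image URLs of one work."""
    return f"{BASE}/illust/{pid}/pages?lang=zh"
```

For example, profile_illusts_url(52542337, [118201101, 117777980]) reproduces the third sample link above.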
Just build the requests like this and fetch the JSON: get the image list by artist or keyword, then use that list to get each image's URL, then download.
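The flow described above (artist -> work IDs -> image URLs -> download) might look roughly like this in Python. This is only a sketch under assumptions: the JSON field names (body, illusts, urls, original) reflect what these endpoints appeared to return and may have changed, the Referer header is assumed to be needed for image downloads, and all function names are invented for the example.

```python
import json
import os
import shutil
import urllib.request

# Assumption: a browser-like User-Agent and a pixiv Referer are required.
HEADERS = {"User-Agent": "Mozilla/5.0", "Referer": "https://www.pixiv.net/"}


def fetch_json(url):
    """GET a URL and parse the response body as JSON."""
    req = urllib.request.Request(url, headers=HEADERS)
    with urllib.request.urlopen(req, timeout=30) as resp:
        return json.load(resp)


def artist_illust_ids(uid, fetch=fetch_json):
    """Step 1: list the work IDs of an artist."""
    data = fetch(f"https://www.pixiv.net/ajax/user/{uid}/profile/all?lang=zh")
    illusts = data["body"]["illusts"]
    # Assumed shape: a dict keyed by work ID, or an empty list if no works.
    return list(illusts) if isinstance(illusts, dict) else []


def original_urls(pid, fetch=fetch_json):
    """Step 2: list the original-size image URLs of one work."""
    data = fetch(f"https://www.pixiv.net/ajax/illust/{pid}/pages?lang=zh")
    return [page["urls"]["original"] for page in data["body"]]


def download(url, dest_dir="."):
    """Step 3: save one image, named after the last URL path segment."""
    path = os.path.join(dest_dir, url.rsplit("/", 1)[-1])
    req = urllib.request.Request(url, headers=HEADERS)
    with urllib.request.urlopen(req, timeout=30) as resp, open(path, "wb") as out:
        shutil.copyfileobj(resp, out)
    return path


def crawl_artist(uid, fetch=fetch_json, save=download):
    """Run all three steps; returns whatever `save` returned for each image."""
    saved = []
    for pid in artist_illust_ids(uid, fetch):
        for url in original_urls(pid, fetch):
            saved.append(save(url))
    return saved
```

Passing fetch and save as parameters is just a convenience so the flow can be tried with stub functions before touching the network; a real run would be something like crawl_artist(52542337), and restricted content would presumably also need a logged-in cookie.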
A very good crawler, but using it four years later still runs into a bug. The log just keeps repeating the following (重新尝试获取 means "retrying fetch"):
unexpected end of stream on null
重新尝试获取 https://pixiv.net/ajax/search/artworks/genshin?word=genshin&mode=r18&p=1&type=all&lang=zh&s_mode=s_type
unexpected end of stream on null
重新尝试获取 https://pixiv.net/ajax/search/artworks/genshin?word=genshin&mode=r18&p=1&type=all&lang=zh&s_mode=s_type
unexpected end of stream on null
重新尝试获取 https://pixiv.net/ajax/search/artworks/genshin?word=genshin&mode=r18&p=1&type=all&lang=zh&s_mode=s_type
unexpected end of stream on null
重新尝试获取 https://pixiv.net/ajax/search/artworks/genshin?word=genshin&mode=r18&p=1&type=all&lang=zh&s_mode=s_type