itstyren / CNKI-download

:frog: Crawler for downloading and quickly previewing CNKI (知网) literature
MIT License
510 stars 148 forks

CNKI anti-scraping #48

Open Topofme opened 1 year ago

Topofme commented 1 year ago

CNKI changed its page source and hid the part of the page that holds the content after a search, so the scraped page source contains no search results. The error:

Traceback (most recent call last):
  File "D:\code\CNKI-download-master (1)\CNKI-download-master\main.py", line 263, in <module>
    main()
  File "D:\code\CNKI-download-master (1)\CNKI-download-master\main.py", line 257, in main
    search.search_reference(get_uesr_inpt())
  File "D:\code\CNKI-download-master (1)\CNKI-download-master\main.py", line 100, in search_reference
    self.pre_parse_page(second_get_res.text), second_get_res.text)
  File "D:\code\CNKI-download-master (1)\CNKI-download-master\main.py", line 110, in pre_parse_page
    reference_num = re.search(reference_num_pattern_compile,
AttributeError: 'NoneType' object has no attribute 'group'

The regular expression finds no match, so re.search returns None and the call to group() raises the error. CNKI changed
MajexH commented 9 months ago

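A small tweak gets past this spot: replace the middle part of the regex with r'找到(.*?)条结果' (the page text reads "found N results"). The code after that no longer works either, though. CNKI has completely changed the structure of its search queries, so that part needs a rewrite.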
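For anyone patching this locally, here is a minimal sketch of that regex change combined with a guard against the `None` case that caused the original `AttributeError`. The function name `parse_result_count` and the sample HTML are hypothetical, made up for illustration. The real CNKI markup may differ, and this does not fix the later parts of the script that also broke.

```python
import re
from typing import Optional

# Pattern suggested above: the page text "找到 N 条结果" means "found N results".
RESULT_COUNT_PATTERN = re.compile(r'找到(.*?)条结果')


def parse_result_count(page_html: str) -> Optional[int]:
    """Return the result count found in page_html, or None if it is absent.

    Hypothetical helper: it assumes the count appears as digits (possibly
    with thousands separators or whitespace) between the two text markers.
    """
    match = RESULT_COUNT_PATTERN.search(page_html)
    if match is None:
        # re.search returns None when nothing matches; calling .group()
        # on it is what raised the AttributeError in the traceback.
        return None
    # Keep only the digits from the captured text.
    digits = re.sub(r'\D', '', match.group(1))
    return int(digits) if digits else None


if __name__ == "__main__":
    sample = '<span>找到 1,234 条结果</span>'  # made-up sample, not real CNKI output
    print(parse_result_count(sample))
    print(parse_result_count('<html></html>'))
```

Returning `None` instead of raising lets the caller decide whether an empty page means "no results" or "the page layout changed again".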