jklincn / PaperDownloader

Batch-download papers from the CNKI and Wanfang databases using selenium
MIT License
19 stars 5 forks

Crawling degree theses currently seems to raise an error #6

Closed Jarvis636431 closed 3 months ago

Jarvis636431 commented 3 months ago

On CNKI, when I use advanced search and select degree theses (master's or doctoral), the crawl appears to fail with the following error:

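Welcome to the batch paper downloader PaperDownloader! Current version: v2.1.0

Please choose a browser: 1: Google Chrome 2: Microsoft Edge

Please enter an integer: 2
Using browser: Microsoft Edge

Looking for the executable......succeeded

Looking for WebDriver......failed
Trying to download WebDriver automatically...succeeded

Opening the browser in......321

Please log in to the database website and run your search, tick the box in front of each paper you want to download, then press Enter to start downloading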

Taking over browser control, please do not operate it.
CNKI search page detected, starting download......
Traceback (most recent call last):
  File "C:\Users\86151\PaperDownloader\main.py", line 59, in <module>
    core.download(browse.driver(), browse.name)
  File "C:\Users\86151\PaperDownloader\core.py", line 113, in download
    cnki(driver, browse_name)
  File "C:\Users\86151\PaperDownloader\core.py", line 155, in cnki
    WebDriverWait(driver, int(config.WaitTime)).until(
  File "C:\Users\86151\AppData\Roaming\Python\Python310\site-packages\selenium\webdriver\support\wait.py", line 105, in until
    raise TimeoutException(message, screen, stacktrace)
selenium.common.exceptions.TimeoutException: Message:
Stacktrace:
GetHandleVerifier [0x00007FF75EE68132+13538]
Microsoft::Applications::Events::EventProperty::~EventProperty [0x00007FF75EDF1DE9+595465]
(No symbol) [0x00007FF75EC0E6CF]
(No symbol) [0x00007FF75EC52960]
(No symbol) [0x00007FF75EC52A1F]
(No symbol) [0x00007FF75EC8D627]
(No symbol) [0x00007FF75EC7203F]
(No symbol) [0x00007FF75EC48147]
(No symbol) [0x00007FF75EC8B1EE]
(No symbol) [0x00007FF75EC71C63]
(No symbol) [0x00007FF75EC4766E]
(No symbol) [0x00007FF75EC4683C]
(No symbol) [0x00007FF75EC47221]
Microsoft::Applications::Events::EventProperty::to_string [0x00007FF75F0296D4+1099860]
Microsoft::Applications::Events::EventProperty::~EventProperty [0x00007FF75ED6D8FC+53532]
Microsoft::Applications::Events::EventProperty::~EventProperty [0x00007FF75ED60E25+1605]
Microsoft::Applications::Events::EventProperty::to_string [0x00007FF75F028665+1095653]
Microsoft::Applications::Events::ILogConfiguration::operator [0x00007FF75EDFC961+27777]
Microsoft::Applications::Events::ILogConfiguration::operator [0x00007FF75EDF6CE4+4100]
Microsoft::Applications::Events::ILogConfiguration::operator* [0x00007FF75EDF6E1B+4411]
Microsoft::Applications::Events::EventProperty::~EventProperty [0x00007FF75EDECFA0+575424]
BaseThreadInitThunk [0x00007FFA865F257D+29]
RtlUserThreadStart [0x00007FFA86A8AF28+40]

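The program raised an exception; the exception information is shown above

Version information:
PaperDownloader: v2.1.0
Python: 3.10.14
selenium: 4.23.0
Microsoft Edge: 126.0.2592.113

When reporting a problem, please be sure to provide all of the exception information above; it helps with analysing the issue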

Jarvis636431 commented 3 months ago

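This is solved now. For results from the degree thesis search, opening each paper's detail page shows that the button used for the PDF download is named "cajdown", while the code looks for "pdfdown" on the page. Nothing matches, so the wait times out and the error above is raised.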
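The mismatch described above can be illustrated with a minimal sketch that waits for either button instead of only one. This is not the project's actual code: the helper names, the `timeout` default, and the assumption that "pdfdown" and "cajdown" are element ids (rather than class names or link text) are all hypothetical and would need checking against the real page.

```python
from typing import Iterable

# Assumption: the two names quoted in this thread are element ids on the detail page.
DOWNLOAD_BUTTON_IDS = ("pdfdown", "cajdown")


def build_download_selector(ids: Iterable[str] = DOWNLOAD_BUTTON_IDS) -> str:
    """Return a CSS selector group that matches an element with any of the ids."""
    return ", ".join(f"#{name}" for name in ids)


def wait_for_download_button(driver, timeout: int = 10):
    """Wait until one of the candidate buttons is clickable and return it."""
    # Imported lazily so the selector helper works without selenium installed.
    from selenium.webdriver.common.by import By
    from selenium.webdriver.support import expected_conditions as EC
    from selenium.webdriver.support.ui import WebDriverWait

    locator = (By.CSS_SELECTOR, build_download_selector())
    return WebDriverWait(driver, timeout).until(
        EC.element_to_be_clickable(locator),
        # A non-empty message makes a TimeoutException easier to diagnose.
        message="no pdfdown/cajdown button found on the detail page",
    )
```

Using one selector group keeps a single wait, so a journal article page and a thesis page would go through the same code path, and the explicit `message` avoids the empty `Message:` seen in the traceback.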

jklincn commented 3 months ago

Hi, this problem does exist. I have adjusted the matching logic. commit 0bb55f4e

Jarvis636431 commented 3 months ago

Thank you very much!!!