jklincn / PaperDownloader

Batch download papers from the CNKI / Wanfang databases using selenium
MIT License
19 stars 5 forks

Timeoutexception #7

Closed yzl256 closed 2 weeks ago

yzl256 commented 2 weeks ago

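=======================================
Welcome to the batch paper downloader PaperDownloader
Current version: 2.2.0

Please select a browser:
1: Google Chrome
2: Microsoft Edge

Enter an integer: 1
Using browser: Google Chrome

Locating the executable......success

Locating WebDriver......success

About to open the browser......321

Log in to the database website and run your search, tick the checkbox in front of each paper you want to download, then press Enter to start downloading

Taking over browser control; please do not touch the browser.
CNKI search page detected, starting download......
CNKI search page detected, starting download......
The program raised an exception; the exception information follows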

Traceback (most recent call last):
  File "D:\code\PaperDownloader-main\main.py", line 62, in <module>
    core.download(browse.driver())
  File "D:\code\PaperDownloader-main\core.py", line 115, in download
    cnki(driver)
  File "D:\code\PaperDownloader-main\core.py", line 153, in cnki
    WebDriverWait(driver, int(config.WaitTime)).until(
  File "C:\Users\admin.conda\envs\py39\lib\site-packages\selenium\webdriver\support\wait.py", line 105, in until
    raise TimeoutException(message, screen, stacktrace)
selenium.common.exceptions.TimeoutException: Message:

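Version information:
PaperDownloader: 2.2.0
Python: 3.9.20
selenium: 4.25.0
ddddocr: 1.5.5
Google Chrome: 129.0.6668.101

When reporting a problem, please include all of the exception information above; it helps with diagnosis

Process finished with exit code 0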

yzl256 commented 2 weeks ago

Why does this happen? I don't think there is anything wrong with the author's code, so how do I fix this error?
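The traceback above ends in a bare `Message:` because selenium's `WebDriverWait.until(method, message="")` was called without a message, so the `TimeoutException` does not say which condition never became true. A minimal sketch of a wrapper that labels each wait is shown below; `until_described` and its `what` parameter are hypothetical names for illustration and are not part of PaperDownloader or selenium.

```python
def until_described(wait, condition, what):
    """Run wait.until(condition), labelling any timeout with `what`.

    Assumption: `wait` is a selenium WebDriverWait, whose
    until(method, message="") forwards `message` into the
    TimeoutException it raises when the wait expires.
    `until_described` and `what` are hypothetical names for this sketch.
    """
    return wait.until(condition, message=f"timed out waiting for {what}")


# Possible call site, reusing names that appear in this thread:
#
#   wait = WebDriverWait(driver, int(config.WaitTime))
#   until_described(
#       wait,
#       EC.number_of_windows_to_be(current_window_number + 1),
#       "the download window to open",
#   )
```

With a label in place, a timeout would read `Message: timed out waiting for the download window to open`, which narrows down which step stalled without changing how long the wait lasts.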

jklincn commented 2 weeks ago

I just tried it and it worked without any problem. What steps did you follow? You could also try a newer Python version, such as 3.12.

yzl256 commented 2 weeks ago

I'll try it with 3.10.
for i in range(len(rows)):
    # Determine if selected (check commented out, so every row is processed)
    # if not rows[i].find_element(By.CLASS_NAME, "ivu-checkbox-input").is_selected():
    #     continue
    # else:
    check_count += 1
    current_window_number = len(driver.window_handles)
    try:
        download_button = rows[i].find_element(
            By.CSS_SELECTOR,
            f"div:nth-child({i + 1}) > .normal-list .t-DIB:nth-child(2) span",
        )
    except exceptions.NoSuchElementException:
        name = rows[i].find_element(By.CLASS_NAME, "title").text
        print(f"错误:不能下载 {name}。\n")  # "Error: cannot download {name}."
        continue

    download_button.click()
    # Switch to new window
    WebDriverWait(driver, int(config.WaitTime)).until(
        EC.number_of_windows_to_be(current_window_number + 1)
    )
    driver.switch_to.window(driver.window_handles[-1])
    if driver.title == "万方登录":  # title of the Wanfang login page
        # "Error: not logged in or the response timed out; download aborted."
        print("错误:账号未登录或响应超时,下载中断。\n")
        return
    elif driver.title == "万方数据知识服务平台-无权限访问":  # Wanfang "no access" page
        print("错误:账号无权限。\n")  # "Error: the account has no permission."
        return
    else:
        time.sleep(int(config.Interval))
        driver.close()
        # Switch back to index
        driver.switch_to.window(index_window)
        download_count += 1
        print(f"已经下载了{download_count}篇文献")  # "Downloaded {download_count} papers so far"

# Try to find the next-page button and click it
try:
    next_button = WebDriverWait(driver, int(config.WaitTime)).until(
        EC.element_to_be_clickable((By.CSS_SELECTOR, 'span.next'))
    )
    driver.execute_script("arguments[0].scrollIntoView();", next_button)
    next_button.click()

    print("111111111111111111111111111111111111111")  # debug marker
    next_button = WebDriverWait(driver, int(config.WaitTime)).until(
        EC.element_to_be_clickable(
            (By.XPATH, '/html/body/div[5]/div/div[3]/div[2]/div/div[4]/div[2]/div[3]/span[9]')
        )
    )
    print("2222222222222222222222222222222222222")  # debug marker
    next_button.click()
except exceptions.TimeoutException:
    print("没有找到下一页按钮,结束爬取。")  # "Next-page button not found; stopping."

The page does turn, but the for loop doesn't download anything from the new page, and then the error is raised.
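In the snippet above, the next-page click sits after the `for` loop, so it runs once and nothing iterates over the rows of the new page; it also appears to click a next-page control twice. Row elements located before a page change generally go stale in selenium, so they would need to be looked up again on every page. A minimal sketch of that control flow follows, assuming three hypothetical callables (`get_rows`, `download_row`, `go_to_next_page`) that would wrap the selenium calls; none of them are PaperDownloader's actual API.

```python
def download_all_pages(get_rows, download_row, go_to_next_page, max_pages=100):
    """Download every row on every result page and return the count.

    All three callables are hypothetical stand-ins for selenium code:
      get_rows()        -> list of row handles on the page currently shown
      download_row(row) -> True if the download succeeded, False otherwise
      go_to_next_page() -> True if a next page was opened, False on the last page
    `max_pages` is a safety cap so a misbehaving pager cannot loop forever.
    """
    downloaded = 0
    for _ in range(max_pages):
        # Look the rows up again on each page: handles obtained on the
        # previous page no longer refer to elements in the current one.
        for row in get_rows():
            if download_row(row):
                downloaded += 1
        # Turn the page exactly once per iteration; stop on the last page.
        if not go_to_next_page():
            break
    return downloaded
```

The point of the sketch is the nesting: the per-row loop lives inside the per-page loop, and the page is advanced once per pass. In a real version, `go_to_next_page` would also have to wait until the new results have loaded before returning.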

jklincn commented 2 weeks ago

It looks like you have modified the code. Are you now trying to download from the Wanfang database? The earlier problem was with CNKI.

yzl256 commented 2 weeks ago

CNKI downloads work fine now. Thank you!

jklincn commented 2 weeks ago

Are there any remaining problems, with either CNKI or Wanfang? If not, I'll close this issue.

yzl256 commented 2 weeks ago

I'll ask about my other problems in a separate issue.

yzl256 commented 2 weeks ago

Thank you!