Flameish / Novel-Grabber

Novel-Grabber can download novels from pretty much any webnovel and lightnovel site.
MIT License

[BUG] Headerless browser broken #371

Open non-Jedi opened 1 year ago

non-Jedi commented 1 year ago

I noticed this a few months ago. I'm assuming some dependency changed its API and broke Novel-Grabber.

Describe the bug

When the "Use Headerless Browser" option is checked under the "Manual" interface, Novel-Grabber fails to download anything. I have the browser set to Firefox, and as far as I can tell, no Firefox instance is actually started.

If I run the program as java -jar bin/Novel-Grabber.jar, I see the following on stderr, which I believe is the best clue to the issue:

109336 [pool-4-thread-1] INFO io.github.bonigarcia.wdm.WebDriverManager - Using geckodriver 0.33.0 (resolved driver for Firefox 114)
109337 [pool-4-thread-1] INFO io.github.bonigarcia.wdm.WebDriverManager - Exporting webdriver.gecko.driver as /home/adam/.cache/selenium/geckodriver/linux64/0.33.0/geckodriver
/home/adam/.cache/selenium/geckodriver/linux64/0.33.0/geckodriver: 1: Syntax error: word unexpected (expecting ")")


openjdk version "1.8.0_322"
OpenJDK Runtime Environment (build 1.8.0_322-b04)
OpenJDK 64-Bit Server VM (build 25.322-b04, mixed mode)

Logs

The "Log" tab shows:

[INFO]Starting browser...
[ERROR]java.net.ConnectException: Failed to connect to localhost/0:0:0:0:0:0:0:1:7088
Build info: version: 'unknown', revision: 'unknown', time: 'unknown'
System info: host: 'Unknown', ip: 'Unknown', os.name: 'Linux', os.arch: 'amd64', os.version: '6.1.31_1', java.version: '1.8.0_322'
Driver info: driver.version: Driver

Flameish commented 1 year ago

I've updated all dependencies. Please update and try again.

non-Jedi commented 1 year ago

Hm. I'm still having the same issue on 3.10.3 with the same output on stderr and in the log.

Flameish commented 1 year ago

Is this only for a specific site or for all of them? It seems to be a generic Linux error rather than a Java one, so I'm not sure where to begin. (It works perfectly fine on my Linux machine.) From a quick round of googling, someone mentioned getting that error when running a script via sh instead of bash, though I don't think that applies in this case.

non-Jedi commented 1 year ago

This is for all sites. I'm working on getting Chrome installed to see if I have the same issue there. My bet is that it's an issue with the binary that WebDriverManager downloads to ~/.cache/selenium/geckodriver/linux64/0.33.0/geckodriver.

non-Jedi commented 1 year ago

And of course using chrome gives me a completely different error since apparently chrome 113 is far too old. Blah.

3333825 [pool-5-thread-1] INFO io.github.bonigarcia.wdm.WebDriverManager - Reading https://chromedriver.storage.googleapis.com/ to seek chromedriver
3334323 [pool-5-thread-1] INFO io.github.bonigarcia.wdm.online.Downloader - Downloading https://chromedriver.storage.googleapis.com/114.0.5735.90/chromedriver_linux64.zip
3335915 [pool-5-thread-1] INFO io.github.bonigarcia.wdm.online.Downloader - Extracting driver from compressed file chromedriver_linux64.zip
3336056 [pool-5-thread-1] INFO io.github.bonigarcia.wdm.WebDriverManager - Exporting webdriver.chrome.driver as /home/adam/.cache/selenium/chromedriver/linux64/114.0.5735.90/chromedriver
Starting ChromeDriver 114.0.5735.90 (386bc09e8f4f2e025eddae123f36f6263096ae49-refs/branch-heads/5735@{#1052}) on port 29917
Only local connections are allowed.
Please see https://chromedriver.chromium.org/security-considerations for suggestions on keeping ChromeDriver safe.
ChromeDriver was started successfully.
[1687112752.200][WARNING]: Deprecated chrome option is ignored: useAutomationExtension
[1687112752.200][WARNING]: Deprecated chrome option is ignored: useAutomationExtension
[ERROR]session not created: This version of ChromeDriver only supports Chrome version 114
Current browser version is 113.0.5672.126 with binary path /opt/google/chrome/google-chrome
Build info: version: 'unknown', revision: 'unknown', time: 'unknown'
System info: host: 'Unknown', ip: 'Unknown', os.name: 'Linux', os.arch: 'amd64', os.version: '6.1.31_1', java.version: '1.8.0_322'
Driver info: driver.version: Driver
remote stacktrace: #0 0x5558a101d4e3 <unknown>
#1 0x5558a0d4cc76 <unknown>
#2 0x5558a0d7a04a <unknown>
#3 0x5558a0d754a1 <unknown>
#4 0x5558a0d72029 <unknown>
#5 0x5558a0db0ccc <unknown>
#6 0x5558a0db047f <unknown>
#7 0x5558a0da7de3 <unknown>
#8 0x5558a0d7d2dd <unknown>
#9 0x5558a0d7e34e <unknown>
#10 0x5558a0fdd3e4 <unknown>
#11 0x5558a0fe13d7 <unknown>
#12 0x5558a0febb20 <unknown>
#13 0x5558a0fe2023 <unknown>
#14 0x5558a0fb01aa <unknown>
#15 0x5558a10066b8 <unknown>
#16 0x5558a1006847 <unknown>
#17 0x5558a1016243 <unknown>
#18 0x7fb9bd7d9afa start_thread

non-Jedi commented 1 year ago

Okay. Once I got chrome >=114 installed, headerless actually worked. So this error is specific to firefox 114.0 on linux and potentially even more specific than that. What version of firefox do you have installed @Flameish that works for you?
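For anyone else reading the Chrome log above: the failure comes down to the driver and browser disagreeing on the major version (ChromeDriver 114.0.5735.90 against Chrome 113.0.5672.126). A minimal sketch of that check follows. The function names are hypothetical, and the "majors must match" rule is an assumption taken from the error message rather than from the ChromeDriver source.

```python
def major_version(version: str) -> int:
    """Return the leading numeric component of a dotted version string."""
    return int(version.split(".", 1)[0])


def is_compatible(driver_version: str, browser_version: str) -> bool:
    """Assumed rule, based on the error text in the log: a driver/browser
    pair is usable only when the major versions are equal."""
    return major_version(driver_version) == major_version(browser_version)


if __name__ == "__main__":
    # Versions copied from the log in this thread.
    print(is_compatible("114.0.5735.90", "113.0.5672.126"))
```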

Flameish commented 1 year ago

114.0 on Fedora 38 :)

non-Jedi commented 1 year ago

Hm. And can you check that you end up using the same geckodriver as me?

md5sum ~/.cache/selenium/geckodriver/linux64/0.33.0/geckodriver 
1d3329b4f09f1e466af61860d1847346  /home/adam/.cache/selenium/geckodriver/linux64/0.33.0/geckodriver

Flameish commented 1 year ago

Nope, mine is 9ec7106c61fcc56916417baad150ef59
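If md5sum isn't available on a machine, the same comparison can be sketched in a few lines of Python using only the standard library. The helper name md5_of_file is hypothetical, and the path in the usage note is the one from this thread.

```python
import hashlib


def md5_of_file(path: str, chunk_size: int = 65536) -> str:
    """Return the hex MD5 digest of a file, reading it in chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as f:
        # iter() with a sentinel keeps calling f.read until it returns b"".
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()
```

Calling md5_of_file on the expanded path ~/.cache/selenium/geckodriver/linux64/0.33.0/geckodriver on both machines and comparing the two strings would show whether the same binary was downloaded.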

non-Jedi commented 1 year ago

So it looks like it's downloading the aarch64 version of geckodriver for me, which certainly explains the issue. I tried just deleting ~/.cache/selenium to see if it got it right on a second attempt, but no dice.

file ~/.cache/selenium/geckodriver/linux64/0.33.0/geckodriver
/home/adam/.cache/selenium/geckodriver/linux64/0.33.0/geckodriver: ELF 64-bit LSB executable, ARM aarch64, version 1 (SYSV), statically linked, with debug_info, not stripped
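For anyone who wants to run the same check without file, here is a rough sketch that reads the architecture field straight from the ELF header. The offsets and constants come from the ELF format (e_machine is a 16-bit field at byte offset 18; 62 is x86_64 and 183 is aarch64). The function names are hypothetical, and only those two architectures are mapped. The earlier "Syntax error: word unexpected" line on stderr would be consistent with a shell trying to interpret a binary the kernel refused to execute, though that part is my assumption.

```python
import struct

# e_machine values defined by the ELF format.
EM_X86_64 = 62
EM_AARCH64 = 183
MACHINE_NAMES = {EM_X86_64: "x86_64", EM_AARCH64: "aarch64"}


def elf_machine(header: bytes) -> str:
    """Return the target architecture named in an ELF header.

    `header` must contain at least the first 20 bytes of the file.
    """
    if len(header) < 20 or header[:4] != b"\x7fELF":
        raise ValueError("not an ELF file")
    # Byte 5 (EI_DATA) is 1 for little-endian and 2 for big-endian.
    byte_order = "<" if header[5] == 1 else ">"
    # e_machine is a 16-bit field at offset 18.
    (machine,) = struct.unpack_from(byte_order + "H", header, 18)
    return MACHINE_NAMES.get(machine, "unknown (e_machine=%d)" % machine)


def driver_matches_host(path: str, host_machine: str) -> bool:
    """Compare a binary's ELF architecture with the host's machine name."""
    with open(path, "rb") as f:
        return elf_machine(f.read(20)) == host_machine
```

On Linux, platform.machine() from the standard library reports x86_64 or aarch64, so driver_matches_host(path, platform.machine()) should return False for a mismatched download like the one above.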

I'm assuming this is an upstream bug in WebDriverManager, which incorrectly believes my x86_64 install of Void Linux is aarch64. Could you help me file the issue there, please? I'm not fluent in Java.

I appreciate you going back and forth with me to debug this.