leoncvlt / blinkist-scraper

📚 Python tool to download book summaries and audio from Blinkist.com, and generate some pretty output
190 stars 36 forks source link

Using chromedriver-binary to delete binaries and support linux #14

Closed jvalecillos closed 4 years ago

jvalecillos commented 4 years ago

What?

Added Linux support by using chromedriver-binary library.

Why?

For a consistent ChromeDriver usage.

leoncvlt commented 4 years ago

That's an interesting library, I didn't know about it 🙂Though I don't think having it as a dependency is a good idea, I had a look at the sourcecode and it seems like the author hardcoded the downloaded chromedriver version to 83.0.4103.39 - works fine for now, but what happens when Chrome gets updated and there are new versions for both?

Albeit this is something to fix in the script as well since right now it's technically still hardcoded, given that the chromedriver executable is included in the repo.

I think the best way to approach this would be to have the user download the appropriate chromedriver distribution for their system, maybe include a --chromdriver-path argument in case they want to point to an existing distribution on your system, and having the possibility of the script automatically downloading the latest chromedriver version from https://chromedriver.storage.googleapis.com/index.html in case no distribution is detected in the user's system, just to keep the script as easy to use as possible.

jvalecillos commented 4 years ago

Hello @leoncvlt, thanks for the quick response.

I initially thought about simply adding the instructions for downloading the ChromeDriver depending on the platform. Simple but easy.

There are other alternatives to download the driver as well like yeongbin-jo/python-chromedriver-autoinstaller) or rasjani/webdrivermanager. The difference is that the other is more used and maintained :man_shrugging:

leoncvlt commented 4 years ago

Implemented in b044e46d728b402216e5ff211ca7c4396e4bb2d2 😃

rocketinventor commented 4 years ago

@jvalecillos Having the latest chromedriver (that matches the current browser?) automatically downloaded is a cool idea- but might this lead to unnecessary bloat (outside scope) and duplicity? For example, the script might not see that it is already installed somewhere, and then the chromedriver will be installed twice. Not that this scenario would be so different from how it is right now, with the binaries being included in the repository... It is already difficult to detect chromedriver installed properly on the path variable (which this project does not do). As far as I know, there is no official location for storing the chromedriver binaries.

Do either of those scripts detect and update according to the installed web-browser? Or automatically install/update the corresponding web-browser? I think the user can take care of that.