heroku / heroku-buildpack-chrome-for-testing

Heroku Buildpack that installs Chrome for Testing

Any way to specify exact versions? #13

Closed jstoxrocky closed 4 months ago

jstoxrocky commented 4 months ago

I use some hacky custom versions of Selenium that don't play well with the latest versions of Chrome/ChromeDriver. Any way to specify exact older versions?

mars commented 4 months ago

No, it only supports selecting a release channel, but otherwise always installs the current Chrome & Chromedriver versions.

mars commented 4 months ago

We might be able to make this buildpack accept a specific version number config like CHROME_FOR_TESTING_VERSION, but Chrome for Testing only started being published last summer.

Can you confirm if the version you need is available through these endpoints?
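As a rough illustration of the idea floated above, here is a minimal sketch of how an exact-version setting could take precedence over the release channel. It is not the buildpack's code: `CHROME_FOR_TESTING_VERSION` is the name proposed in this thread, while `GOOGLE_CHROME_CHANNEL`, the `Stable` default, and the function name `resolve_install_target` are assumptions made for the example.

```python
import re
from typing import Mapping, Tuple

# Chrome versions have four dot-separated numeric parts, e.g. 116.0.5845.96.
VERSION_RE = re.compile(r"^\d+\.\d+\.\d+\.\d+$")


def resolve_install_target(env: Mapping[str, str]) -> Tuple[str, str]:
    """Return ("version", X) when an exact version is pinned, else ("channel", Y).

    CHROME_FOR_TESTING_VERSION is the config name proposed in the thread.
    GOOGLE_CHROME_CHANNEL and its "Stable" default are assumed here.
    """
    pinned = env.get("CHROME_FOR_TESTING_VERSION", "").strip()
    if pinned:
        if not VERSION_RE.match(pinned):
            raise ValueError(f"not an exact version number: {pinned!r}")
        return ("version", pinned)
    channel = env.get("GOOGLE_CHROME_CHANNEL", "").strip() or "Stable"
    return ("channel", channel)
```

In a real build this would be called with `os.environ`; passing a plain mapping keeps the sketch easy to test.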

jstoxrocky commented 4 months ago

@mars For added context I am using undetected chromedriver to scrape some sites. Unfortunately, this library hangs when using the latest Chrome/ChromeDriver versions.

The only way I have been able to get this setup to work and scrape my target sites is to use Chrome and ChromeDriver version 116.0.5845.96.

I download them both from here: chrome and chrome driver

My application runs in Heroku and I use the following build packs to support use of my custom chrome and chrome driver versions: chrome-buildpack and chrome-driver-buildpack

Hope this context helps, I would really love to get this all to work in a more sane way - but since my application depends on the scraping, and my scraping depends on the undetected-chromedriver lib, and that lib is a little hacky - I am backed into a corner with versions 116.0.5845.96

jstoxrocky commented 4 months ago

To answer your question - I don't see the version I am looking for in those endpoints. But the download URL follows the same pattern.

https://storage.googleapis.com/chrome-for-testing-public/116.0.5845.96/mac-arm64/chrome-mac-arm64.zip

https://storage.googleapis.com/chrome-for-testing-public/116.0.5845.96/mac-arm64/chromedriver-mac-arm64.zip
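The pattern in the two URLs above can be sketched as a small helper. This only reproduces the URL shape shown in this comment; whether a given version and platform combination is actually published at that location is not checked. The function name `cft_download_url` is hypothetical.

```python
BASE_URL = "https://storage.googleapis.com/chrome-for-testing-public"


def cft_download_url(version: str, platform: str, artifact: str) -> str:
    """Build a download URL following the pattern quoted in the thread.

    artifact is "chrome" or "chromedriver"; platform is a name such as
    "mac-arm64" (the one used in the thread).
    """
    if artifact not in ("chrome", "chromedriver"):
        raise ValueError(f"unknown artifact: {artifact!r}")
    return f"{BASE_URL}/{version}/{platform}/{artifact}-{platform}.zip"


print(cft_download_url("116.0.5845.96", "mac-arm64", "chrome"))
```

On Heroku the platform would presumably be a Linux one rather than `mac-arm64`; that substitution is an assumption and not something confirmed in this thread.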

edmorley commented 4 months ago

> For added context I am using undetected chromedriver to scrape some sites.

Gentle reminder that the Heroku AUP (linked from https://www.heroku.com/policy) has some stipulations around scraping that, at first glance, that tool doesn't seem to adhere to:

> Customers may not use a service to, nor allow its users or any third party to use a service to: ... Access a third-party web property for the purposes of web scraping, web crawling, web monitoring, or other similar activity through a web client that does not take commercially reasonable efforts to identify itself via a unique User Agent string describing the purpose of the web client and obey the robots exclusion standard (also known as the robots.txt standard), including the crawl-delay directive;
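For readers unsure what the quoted stipulation looks like in practice, here is a minimal sketch using Python's standard `urllib.robotparser`. It is an illustration only, not a statement of what satisfies the policy. The User-Agent string, the robots.txt content, and the `example.com` URLs are all made up for the example.

```python
from typing import Optional, Tuple
from urllib.robotparser import RobotFileParser

# Hypothetical User-Agent that names the client and describes its purpose.
USER_AGENT = "example-price-monitor/1.0 (+https://example.com/bot-info)"

# Hand-written robots.txt used instead of fetching one over the network.
ROBOTS_TXT = """\
User-agent: *
Crawl-delay: 10
Disallow: /private/
"""


def check_access(robots_txt: str, user_agent: str, url: str) -> Tuple[bool, Optional[int]]:
    """Return (allowed, crawl_delay_seconds) for the given agent and URL."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url), parser.crawl_delay(user_agent)


allowed, delay = check_access(ROBOTS_TXT, USER_AGENT, "https://example.com/public/page")
print(allowed, delay)
```

A client following this approach would skip disallowed URLs, wait at least the crawl delay between requests, and send the same User-Agent string on every request.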

mars commented 4 months ago

Indeed, @edmorley is right: that "undetected chromedriver" library is not something we can support, because its whole purpose is avoiding web crawler restrictions.

jstoxrocky commented 4 months ago

@edmorley Thank you for letting me know that - I was unaware.

jstoxrocky commented 4 months ago

@mars Well, I guess I can't advocate for my own use case anymore, but I still think being able to lock oneself to a specific version is advantageous, at least from a testing perspective. Someone else's legitimate (non web scraping) code might also break as a result of version changes.

sterlzbd commented 3 months ago

@mars Sorry to reopen this old ticket but I would like a way to pin an exact version as well. We've run into issues a number of times where a new chromedriver release for the stable channel has a bug in it. We then have to fork the repo and pin the download to the last stable release until the bug is fixed. Most recently this issue in chromedriver 123 and 124: https://bugs.chromium.org/p/chromedriver/issues/detail?id=4743&q=&can=1&sort=-id

I forked the old chromedriver buildpack at one point to support downloading it using the Chrome for Testing API when they made the switch at Chrome 115. For backwards compatibility I made it respect CHROMEDRIVER_VERSION. That relied on jq to parse the JSON, though. Could we add that to this buildpack? I guess I'm not sure what the policy is for buildpack dependencies.
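The JSON lookup described above can be sketched without jq as follows. This is not the fork's code. It assumes a listing shaped like the Chrome for Testing "versions with downloads" JSON, where each entry has a `version` and per-artifact download lists; the `SAMPLE` payload is hand-written for illustration and its URL simply follows the pattern quoted earlier in the thread.

```python
import json
from typing import Optional

# Hand-written sample in the assumed shape; not a copy of the real listing.
SAMPLE = """
{"versions": [
  {"version": "116.0.5845.96",
   "downloads": {"chromedriver": [
     {"platform": "linux64",
      "url": "https://storage.googleapis.com/chrome-for-testing-public/116.0.5845.96/linux64/chromedriver-linux64.zip"}
   ]}}
]}
"""


def find_download_url(payload: str, version: str, artifact: str, platform: str) -> Optional[str]:
    """Return the download URL for an exact version, or None if it is not listed."""
    data = json.loads(payload)
    for entry in data.get("versions", []):
        if entry.get("version") != version:
            continue
        for download in entry.get("downloads", {}).get(artifact, []):
            if download.get("platform") == platform:
                return download.get("url")
    return None


print(find_download_url(SAMPLE, "116.0.5845.96", "chromedriver", "linux64"))
```

Returning `None` for an unlisted version lets the caller decide whether to fail the build or fall back to the channel's current release.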

sfavrinlumint commented 3 months ago

We would also benefit from this functionality, no scraping here :)

sterlzbd commented 2 months ago

We've been waiting on a bug fix to land in stable for a while so in the meantime I added the ability to specify a major version (though really only since chrome 114) here: https://github.com/sterlzbd/heroku-buildpack-chrome-for-testing/tree/specify_version
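One way to resolve a major version to a full version number is to pick the highest matching entry from a list of known versions. The sketch below is not taken from the linked branch; the function names are hypothetical and the version list in the usage line is illustrative rather than a real listing.

```python
from typing import Iterable, Optional, Tuple


def parse_version(version: str) -> Tuple[int, ...]:
    """Split "116.0.5845.96" into (116, 0, 5845, 96) for numeric comparison."""
    return tuple(int(part) for part in version.split("."))


def latest_for_major(versions: Iterable[str], major: int) -> Optional[str]:
    """Return the highest version whose first component equals major."""
    candidates = [v for v in versions if parse_version(v)[0] == major]
    if not candidates:
        return None
    return max(candidates, key=parse_version)


# Illustrative list only; real values would come from the published JSON.
known = ["115.0.5790.170", "116.0.5845.9", "116.0.5845.96", "117.0.5938.62"]
print(latest_for_major(known, 116))
```

Comparing integer tuples rather than raw strings matters here, since a plain string comparison would rank `116.0.5845.9` above `116.0.5845.96`.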