pyronlaboratory / heroku-integrated-firefox-geckodriver

Buildpack enables your client code to access Firefox along with Geckodriver in a Heroku slug.
https://pyronlaboratory.github.io/heroku-integrated-firefox-geckodriver/
MIT License

heroku-integrated-firefox-geckodriver


Now supports Heroku-20, and legacy versions Heroku-16 and Heroku-18.

The buildpack is available on the Heroku Elements Marketplace; search for heroku-integrated-firefox-geckodriver there to find it.

The heroku-integrated-firefox-geckodriver buildpack installs Firefox and Geckodriver (the Selenium driver for Firefox) in a Heroku slug. Application or client code written in a language such as Python, Ruby or Node.js can then create driver objects and run the browser automation defined in its source code.

Version compatibility as follows:

Installation:

To install the buildpack and integrate it with an application running on a Heroku dyno:

$ heroku create --buildpack https://github.com/pyronlaboratory/heroku-integrated-firefox-geckodriver

# or if your app is already created:
$ heroku buildpacks:add https://github.com/pyronlaboratory/heroku-integrated-firefox-geckodriver

$ git push heroku master

Configurations:

Set the following Heroku config vars (environment variables) to these path strings.

FIREFOX_BIN: /app/vendor/firefox/firefox

Alternatively, you can use /app/vendor/firefox/firefox-bin

GECKODRIVER_PATH: /app/vendor/geckodriver/geckodriver

LD_LIBRARY_PATH: /usr/local/lib:/usr/lib:/lib:/app/vendor

PATH: /usr/local/bin:/usr/bin:/bin:/app/vendor/

These config vars can be set with the Heroku CLI:

Command form: heroku config:set <ENV_VARIABLE>=<ABSOLUTE_PATH>


$ heroku config:set FIREFOX_BIN=/app/vendor/firefox/firefox

Setting FIREFOX_BIN and restarting python-app... done, v6
FIREFOX_BIN: '/app/vendor/firefox/firefox'
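A missing or mistyped config var tends to show up only later as an opaque driver start-up error, so it can help to check the two binary paths before creating a driver. The following is a minimal sketch of such a check; the helper name find_config_problems is hypothetical and not part of the buildpack, and only the variable names FIREFOX_BIN and GECKODRIVER_PATH come from the list above.

```python
import os

# Variable names taken from the configuration list above.
# The helper itself is a hypothetical illustration, not part of the buildpack.
REQUIRED_BINARIES = ("FIREFOX_BIN", "GECKODRIVER_PATH")


def find_config_problems(env=os.environ):
    """Return human-readable problems; an empty list means the paths look usable."""
    problems = []
    for name in REQUIRED_BINARIES:
        path = env.get(name)
        if not path:
            problems.append(f"{name} is not set")
        elif not os.path.isfile(path):
            problems.append(f"{name} points to a missing file: {path}")
        elif not os.access(path, os.X_OK):
            problems.append(f"{name} is not executable: {path}")
    return problems
```

Calling find_config_problems() at the top of a worker script, and printing each returned string, gives a readable hint in the Heroku logs before Selenium is involved at all.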

Implementation:

Python

app.py

import os
from selenium import webdriver
from selenium.webdriver.firefox.firefox_binary import FirefoxBinary


def load_driver():
    options = webdriver.FirefoxOptions()

    # enable trace-level logging for debugging
    options.log.level = "trace"

    options.add_argument("-remote-debugging-port=9224")
    options.add_argument("-headless")
    options.add_argument("-disable-gpu")
    options.add_argument("-no-sandbox")

    # both paths come from the Heroku config vars described above
    binary = FirefoxBinary(os.environ.get('FIREFOX_BIN'))

    # firefox_binary and executable_path are Selenium 3-style keyword
    # arguments; they are deprecated in Selenium 4
    firefox_driver = webdriver.Firefox(
        firefox_binary=binary,
        executable_path=os.environ.get('GECKODRIVER_PATH'),
        options=options)

    return firefox_driver


def start():
    driver = load_driver()
    try:
        driver.get("https://www.google.com/")
        print(driver.title)
    finally:
        # quit() ends the whole session, including the geckodriver process
        driver.quit()


if __name__ == "__main__":
    start()

NOTE: Add a requirements.txt file that lists selenium (and any other packages your app needs), and add the following entry to your Procfile:

worker: python app.py
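The app.py above passes firefox_binary and executable_path to webdriver.Firefox, which is the Selenium 3 calling style. With Selenium 4 the same setup is usually expressed through a Service object and the binary_location option. The sketch below assumes selenium 4 or later is listed in requirements.txt; the helper resolve_paths is a hypothetical name, and its fallback values are the buildpack paths from the configuration section.

```python
import os

# Fallbacks are the paths listed in the configuration section.
DEFAULT_FIREFOX_BIN = "/app/vendor/firefox/firefox"
DEFAULT_GECKODRIVER_PATH = "/app/vendor/geckodriver/geckodriver"


def resolve_paths(env=os.environ):
    """Return (firefox_path, geckodriver_path), preferring the config vars."""
    return (
        env.get("FIREFOX_BIN", DEFAULT_FIREFOX_BIN),
        env.get("GECKODRIVER_PATH", DEFAULT_GECKODRIVER_PATH),
    )


def load_driver():
    """Start headless Firefox with the Selenium 4 API (assumes selenium>=4)."""
    # Imported here so the module can still be imported without selenium.
    from selenium import webdriver
    from selenium.webdriver.firefox.service import Service

    firefox_path, geckodriver_path = resolve_paths()

    options = webdriver.FirefoxOptions()
    options.add_argument("-headless")
    # Selenium 4 takes the browser path on the options object ...
    options.binary_location = firefox_path
    # ... and the geckodriver path on a Service object.
    service = Service(executable_path=geckodriver_path)

    return webdriver.Firefox(service=service, options=options)
```

This load_driver() can stand in for the one in app.py; the rest of the script (driver.get, driver.title, closing the driver) stays the same.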

Ruby

app.rb

require 'selenium-webdriver'

options = Selenium::WebDriver::Firefox::Options.new

options.add_argument('--remote-debugging-port=9222')
options.add_argument('--headless')
options.add_argument('--disable-gpu')
options.add_argument('--no-sandbox')

# both paths come from the Heroku config vars described above
Selenium::WebDriver::Firefox::Binary.path = ENV['FIREFOX_BIN']
Selenium::WebDriver::Firefox::Service.driver_path = ENV['GECKODRIVER_PATH']

# use `:debug` instead of `:info` for detailed logs in case of an error
Selenium::WebDriver.logger.level = :info

driver = Selenium::WebDriver.for :firefox, options: options
driver.get "https://www.google.com"
puts driver.title
driver.quit

A minimal Gemfile:

source "https://rubygems.org"
gem 'selenium-webdriver', '3.142.7'

and the corresponding Gemfile.lock file:

GEM
  remote: https://rubygems.org/
  specs:
    childprocess (3.0.0)
    rubyzip (2.3.0)
    selenium-webdriver (3.142.7)
      childprocess (>= 0.5, < 4.0)
      rubyzip (>= 1.2.2)

PLATFORMS
  x64-mingw32

DEPENDENCIES
  selenium-webdriver (= 3.142.7)

BUNDLED WITH
   2.2.19

NOTE: To test your script, run heroku run bash from your local machine, then execute ruby app.rb in that shell to see the result.

Node.js

app.js

const { Builder } = require('selenium-webdriver');
const { Options } = require('selenium-webdriver/firefox');

// run Firefox headless, using the binary installed by the buildpack
const options = new Options()
    .headless()
    .setBinary(process.env.FIREFOX_BIN);

(async function example() {
    const driver = new Builder()
        .forBrowser('firefox')
        .setFirefoxOptions(options)
        .build();

    try {
        await driver.get('https://www.google.com/');
        console.log(await driver.getTitle());
    } catch (err) {
        console.log(err.message);
    } finally {
        // always end the session, even if navigation failed
        await driver.quit();
    }
})();

NOTE: Add a package.json file that lists selenium-webdriver (and any other packages your app needs) as a dependency, and add the following entry to your Procfile:

worker: node app.js

Go ahead and scrape the universe. Drop a star if you like this small project!
