matthewmueller / x-ray

The next web scraper. See through the <html> noise.
MIT License
5.86k stars 350 forks source link

How to detect (non-200) HTTP status codes? #393

Closed kdekooter closed 1 year ago

kdekooter commented 1 year ago

Found a solution - inspired by https://github.com/jspri/request-x-ray. Using a custom driver that checks the http status code and throws an error for non-200s.

function makeDriver() {
  const request = Request.defaults({})

  return function driver(context, callback) {
    const url = context.url

    request(url, function(err, response, body) {
      // Throw error for non-200 status codes
      if (response && response.statusCode !== 200) {
        return callback(new Error('Status code ' + response.statusCode + ' for ' + url))
      }
      return callback(err, body)
    })
  }
}

const driver = makeDriver()

const xray = new Xray().driver(driver)