matthewmueller / x-ray

The next web scraper. See through the <html> noise.
MIT License

x-ray crawler throws assertion exception on HTTP 999 #161

Closed: gconnolly closed this issue 8 years ago

gconnolly commented 8 years ago

Subject of the issue

Although 999 is not a valid HTTP status code, I have been coming across many HTTP 999 responses from pages and assets. In this case an assertion exception is thrown, and x-ray does not handle it.

Your environment

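const Xray = require('x-ray')
const x = Xray()
const nock = require('nock')

// Mock the target host so that GET / answers with the non-standard status 999.
const scope = nock('http://www.example.com/')
  .get('/')
  .reply(999)

// Ask x-ray to scrape the <title> of the mocked page.
const test = x('http://www.example.com/', 'title')

test((error, result) => {
  // Log the error too, so a failure is visible if it reaches the callback.
  if (error) console.error(error)
  console.log(result)
  // Verify that the mocked request was actually made.
  scope.done()
})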

Expected behaviour

x-ray should handle this HTTP status the same way it handles any 400- or 500-level status.

Actual behaviour

AssertionError: invalid status code: 999

gconnolly commented 8 years ago

This issue can be traced to http-context, which is used by x-ray-crawler. I have submitted a pull request (https://github.com/lapwinglabs/http-context/pull/4) that removes the two assertions that are the root cause. I believe they are both problematic and unnecessary.

If the pull request is accepted, the http-context version will of course have to be updated in x-ray-crawler, and the x-ray-crawler version will then have to be updated in x-ray.

Kikobeats commented 8 years ago

Then @matthewmueller, please check the PR 😀

gconnolly commented 8 years ago

https://github.com/lapwinglabs/http-context/pull/4 Nice!