matthewmueller / x-ray

The next web scraper. See through the <html> noise.
MIT License
5.87k stars, 349 forks

Bad request 400: Request Header Or Cookie Too Large #310

bfelbo closed 5 years ago

bfelbo commented 6 years ago

Subject of the issue

I'm scraping various pages with x-ray, using a simple script to get each page's title. However, I get a 400 Bad Request error (or sometimes 413) from many different domains. These errors often come with the message "Request Header Or Cookie Too Large" or similar.

Your environment

Steps to reproduce

Run the following function on many links from the same website. I'm experiencing this with anything from small domains like www.theecologist.org to big domains like www.google.com.

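const xray = require('x-ray')
const x = xray()

// Fetch the page at linkUrl and log the text of its <title> element.
const getHeaderForLink = (linkUrl) => {
  try {
    x(linkUrl, 'title')((err, title) => {
      if (err) {
        // Errors from the asynchronous request are reported through this callback.
        console.log(err)
      } else {
        console.log(title)
      }
    })
  } catch (err) {
    // Only synchronous throws land here; asynchronous errors go to the callback.
    console.log(err)
  }
}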
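
The report does not show how the function is called across many links. As a minimal sketch for narrowing the problem down, the helper below runs the requests one at a time and records which URL produced which error. The names `scrapeTitle` and `getTitlesSequentially` are hypothetical and not part of x-ray. The `scrape` argument is assumed to behave like the `x` instance above: `scrape(url, selector)` returns a function that takes a Node-style `(err, title)` callback.

```javascript
// Hypothetical helper (not part of x-ray): wrap one callback-style scrape
// in a Promise so it can be awaited.
const scrapeTitle = (scrape, linkUrl) =>
  new Promise((resolve, reject) => {
    // Assumption: scrape(linkUrl, 'title') returns a function that accepts
    // a Node-style (err, title) callback, as x(linkUrl, 'title') does above.
    scrape(linkUrl, 'title')((err, title) => (err ? reject(err) : resolve(title)))
  })

// Hypothetical helper: process the links strictly one after another and
// keep each outcome next to the URL that produced it.
const getTitlesSequentially = async (scrape, linkUrls) => {
  const results = []
  for (const linkUrl of linkUrls) {
    try {
      const title = await scrapeTitle(scrape, linkUrl)
      results.push({ linkUrl, title, error: null })
    } catch (err) {
      // Record the failure instead of stopping, so later links still run.
      results.push({ linkUrl, title: null, error: err })
    }
  }
  return results
}
```

With the `x` instance from the script above, this could be called as `getTitlesSequentially(x, links).then(console.log)`, where `links` is an array of URLs. If the 400 and 413 responses still appear when requests run sequentially, request volume alone is unlikely to explain them. This sketch only helps isolate the failing URLs; it is not a confirmed fix.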

Expected behaviour

The script should return the title.

Actual behaviour

Instead of the title, the script returns 400 and 413 errors.

lathropd commented 5 years ago

@bfelbo I'm going to close this as stale since it sat for so long. Re-open if you still need help.