puppeteer / puppeteer

JavaScript API for Chrome and Firefox
https://pptr.dev
Apache License 2.0

[Bug]: Intercepting the request cannot find the host in the headers #12789

Open chenxicore opened 3 months ago

chenxicore commented 3 months ago

Minimal, reproducible example

const puppeteer = require('puppeteer');

(async () => {
  const browser = await puppeteer.launch({ headless: false, devtools: true });
  const page = await browser.newPage();

  // Listen for request events
  page.on('request', request => {
    const headers = request.headers();
    console.log(headers);
    if (headers['host']) {
      console.log('Request URL:', request.url());
      console.log('Request Headers:', headers);
    } else {
      console.log('Host header not found in the request to:', request.url());
    }
  });

  // Navigate to a page
  await page.goto('http://172.18.253.134/');

  await browser.close();
})();

Background

When I intercept a request, the host header is missing from the headers, although it is visible once DevTools is open.


Expectation

I expect host and similar information to be included in the headers when a request is intercepted.

Reality

When a request is intercepted, the host is missing from the headers.

Puppeteer configuration file (if used)

No response

Puppeteer version

22.8.1

Node version

v20.13.1

Package manager

npm

Package manager version

10.5.2

Operating system

Windows

OrKoN commented 3 months ago

The Host header is in the requestWillBeSentExtraInfo event, and Puppeteer does not extract it.
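
For readers who need the header today, here is a minimal sketch of reading it from the raw CDP event. It assumes `client` is a CDP session (for example one obtained from `page.createCDPSession()`); `watchHostHeaders` and `findHeader` are hypothetical helper names, not part of Puppeteer. The `:authority` fallback is an assumption about how HTTP/2 requests report the host.

```javascript
// Case-insensitive lookup: raw header names may keep their wire casing.
function findHeader(headers, name) {
  const wanted = name.toLowerCase();
  for (const [key, value] of Object.entries(headers || {})) {
    if (key.toLowerCase() === wanted) return value;
  }
  return undefined;
}

// Hypothetical helper: calls onHost(requestId, host) for each extra-info event.
// `client` only needs on() and send(), as a Puppeteer CDPSession provides.
function watchHostHeaders(client, onHost) {
  // Register the listener first so no event is missed once Network is enabled.
  client.on('Network.requestWillBeSentExtraInfo', event => {
    // Assumption: HTTP/2 requests carry ":authority" instead of "Host".
    const host =
      findHeader(event.headers, 'host') ??
      findHeader(event.headers, ':authority');
    onHost(event.requestId, host);
  });
  return client.send('Network.enable');
}
```

In the reproduction above, this could be called as `await watchHostHeaders(await page.createCDPSession(), (id, host) => console.log(id, host));` before `page.goto(...)`.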

leohanon commented 2 months ago

As far as I understand, requestWillBeSentExtraInfo is an additional event that is not emitted for every request, so we can't hold back the request details until it arrives. I suspect we could add an additional PageEvent, 'requestWithExtraInfo', that forwards the extra info and lets the user match the two up on their own?
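
To illustrate what matching the two events up could look like in user code, here is a rough sketch that pairs them by `requestId` at the CDP level. `createRequestMerger` is a hypothetical helper, not a Puppeteer API. The sketch assumes the two events can arrive in either order, and it ignores redirects (which reuse a `requestId`) and requests whose extra info never arrives (which stay in the map).

```javascript
// Pairs Network.requestWillBeSent with Network.requestWillBeSentExtraInfo.
function createRequestMerger(onMerged) {
  const pending = new Map(); // requestId -> { request?, extraInfo? }

  function update(requestId, patch) {
    const entry = { ...pending.get(requestId), ...patch };
    if (entry.request && entry.extraInfo) {
      pending.delete(requestId);
      onMerged({
        requestId,
        url: entry.request.url,
        // Extra-info headers win because they reflect what went on the wire.
        headers: { ...entry.request.headers, ...entry.extraInfo.headers },
      });
    } else {
      pending.set(requestId, entry); // wait for the other half
    }
  }

  return {
    onRequest: event => update(event.requestId, { request: event.request }),
    onExtraInfo: event => update(event.requestId, { extraInfo: event }),
  };
}

// Possible wiring with a CDP session (not executed here):
//   const client = await page.createCDPSession();
//   const merger = createRequestMerger(merged => console.log(merged));
//   client.on('Network.requestWillBeSent', merger.onRequest);
//   client.on('Network.requestWillBeSentExtraInfo', merger.onExtraInfo);
//   await client.send('Network.enable');
```

Header names are merged as-is, so the same header may appear twice if the two events spell it with different casing.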

OrKoN commented 2 months ago

requestWillBeSentExtraInfo is emitted if the request actually reaches the network service and some headers can only be known at that stage (due to security concerns and probably due to availability of the data, e.g., host is probably only resolved once the request is really going to the wire).
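
Since the event only fires for requests that reach the network service, user code may want a fallback for requests that never produce it. The sketch below prefers the header from the extra info and otherwise derives a value from the request URL. `hostForRequest` is a hypothetical helper name, and the fallback assumes the Host header matches the URL's authority, which may not hold if the header has been overridden.

```javascript
// Returns the Host value for a request, preferring extra-info headers.
function hostForRequest(url, extraInfoHeaders) {
  for (const [name, value] of Object.entries(extraInfoHeaders || {})) {
    if (name.toLowerCase() === 'host') return value;
  }
  // Fallback: the WHATWG URL parser omits default ports from `host`.
  return new URL(url).host;
}
```

In the reproduction above, `hostForRequest(request.url())` inside the 'request' handler would yield a value even though `request.headers()` has no host entry.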