Cuadrix / puppeteer-page-proxy

Additional module to use with 'puppeteer' for setting proxies per page basis.

TypeError: Cannot read property 'url' of null #36

Closed pkjy closed 1 year ago

pkjy commented 4 years ago

The error seems to come from cookies.js, around line 69. The code there looks like this:

this.url = request.isNavigationRequest() ? request.url() : request.frame().url();

When I change it to this, it no longer throws an error:

this.url = request.url() 

So what is request.isNavigationRequest() for, and is it necessary?

Cuadrix commented 4 years ago

The point is to get the correct cookies from the browser. That is done with the URL of the frame, not the request URL. The Network.getCookies method is used to get all cookies for the current page for each request. The URL passed as an argument to that method is the URL of the page, e.g.: [image: retrievecookies]

During navigation to a new page from, for example, about:blank, the frame URL is not valid, because one cannot retrieve cookies for an empty page. In that case it takes the URL from the navigation request instead and uses that to retrieve all the cookies of that page, since navigating to a page usually sets the URL of the frame.

Or at least that was my logic. If it's not necessary, then it means that CDP already parses the URL and sets cookies accordingly. I'll run a few tests to see if that's the case, and remove the conditional if it's unnecessary.
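To make the cookie lookup described above concrete, here is a minimal sketch of a Network.getCookies call filtered by a page URL. It is not the library's actual code: the helper name getCookiesForUrl and the stub client are hypothetical, and `client` is assumed to be a Puppeteer CDPSession (or anything with a compatible send(method, params) method).

```javascript
// Sketch: fetch the cookies the browser holds for a given page URL.
// `client` is assumed to be a CDPSession-like object whose
// send(method, params) returns a promise.
async function getCookiesForUrl(client, pageUrl) {
  // Network.getCookies accepts an optional `urls` list to filter by.
  const { cookies } = await client.send('Network.getCookies', { urls: [pageUrl] });
  return cookies;
}

// Hypothetical stub so the sketch runs without a browser.
const stubClient = {
  calls: [],
  send(method, params) {
    this.calls.push({ method, params });
    return Promise.resolve({
      cookies: [{ name: 'sid', value: '1', domain: 'example.com' }],
    });
  },
};

getCookiesForUrl(stubClient, 'https://example.com/').then((cookies) => {
  console.log(cookies.map((c) => c.name));
});
```

With a real page, the session could be obtained with page.target().createCDPSession(). Which URL to pass (the frame URL or the request URL) is exactly the question raised in this issue.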

pkjy commented 4 years ago

OK, using request.url() works fine in my project for now. I look forward to your test results. BTW, I'm using Puppeteer v4.0.0.

pkjy commented 4 years ago

Here's the code for the test.

// test.js

const puppeteer = require('puppeteer');
const useProxy = require('puppeteer-page-proxy');

(async () => {
  const browser = await puppeteer.launch();
  const page = await browser.newPage();

  await page.setRequestInterception(true);
  page.on('request', async request => {
    await useProxy(request, 'http://116.62.177.112:22989'); // change your proxy here
  });

  await page.goto('https://m.ctrip.com/webapp/you/');
})();

Then add some logging to the library source:

// node_modules/puppeteer-page-proxy/src/lib/cookies.js

class CookieHandler extends CDP {
  constructor(request) {
    super(request._client);
    // --- debug logging added here: report requests that have no frame ---
    if (!request.frame()) {
      console.log('wrong links', request.url());
    }
    // --- end of added logging ---
    this.url = request.isNavigationRequest() ? request.url() : request.frame().url();
    this.domain = new URL(this.url).hostname;
  }
  ......

Here are the results:

$ node test.js
wrong links https://webresource.c-ctrip.com/resaresonline/fx/lizard22ares/latest/default/web/lizard.seed.js
wrong links https://sec-m.ctrip.com/restapi/soa2/10290/createclientid?systemcode=09&createtype=3&contentType=json
wrong links https://webresource.c-ctrip.com/ResADVOnline/R2/dist/sales/lasttime.v2.0.js
wrong links https://sec-m.ctrip.com/restapi/soa2/10245/GetDynamicAd.json?_rm=0.7647419562660094
wrong links https://sec-m.ctrip.com/restapi/soa2/10245/GetGlobalADListV4.json?_rm=0.4979459775699253

tarik0 commented 3 years ago

For me the workaround was:

this.url = (request.isNavigationRequest() || request.frame() == null) ? request.url() : request.frame().url();
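The same guard can be pulled out into a small helper, which makes the fallback easier to test in isolation. This is only a sketch under assumptions: resolveCookieUrl and fakeRequest are hypothetical names, and `request` is assumed to expose isNavigationRequest(), url() and frame() the way Puppeteer's request object does, with frame() possibly returning null.

```javascript
// Sketch of the null-safe URL selection from the workaround above.
// Falls back to the request URL when the request has no frame.
function resolveCookieUrl(request) {
  const frame = request.frame();
  if (request.isNavigationRequest() || frame == null) {
    return request.url();
  }
  return frame.url();
}

// Hypothetical stand-in for a request object, for illustration only.
function fakeRequest({ url, navigation, frameUrl }) {
  return {
    url: () => url,
    isNavigationRequest: () => navigation,
    frame: () => (frameUrl === null ? null : { url: () => frameUrl }),
  };
}

// A sub-resource request whose frame is null, like the "wrong links" above.
const orphan = fakeRequest({
  url: 'https://cdn.example.com/app.js',
  navigation: false,
  frameUrl: null,
});
console.log(resolveCookieUrl(orphan)); // https://cdn.example.com/app.js
```

Whether the request URL is the right fallback for cookie lookup in every case is a separate question. The sketch only shows how the TypeError is avoided.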

Cuadrix commented 1 year ago

Fixed in v1.2.9.