chrome-php / chrome

Instrument headless chrome/chromium instances from PHP
MIT License

ADD: headless=new option and compatibility #627

Closed mrmmg closed 4 weeks ago

mrmmg commented 4 weeks ago

Recently, I used this package to scrape a website that requires users to log in.

My target website uses cookies, IndexedDB, and SessionStorage. Sometimes, even when the browser is already logged in through a user data directory, I still see the login form. The DevTools console of headless Chrome shows many errors indicating that Chrome cannot access cookie storage or IndexedDB. I then tried crawling with Node.js and the Puppeteer library, but the issue remained. While browsing the Puppeteer issues, I found several similar to mine:

puppeteer/puppeteer#12498

puppeteer/puppeteer#1316

puppeteer/puppeteer#1268

puppeteer/puppeteer#1270

puppeteer/puppeteer#5612

Chrome Developers Docs About headless=new

In the first linked issue, a user reported that setting the Puppeteer launch option headless: 'new' resolved the problem. I did the same, and switching to the new headless mode fixed my storage errors as well. However, when I tried to use the new headless mode with the chrome-php/chrome package, I found that the library does not support it. So I created this pull request with the necessary changes.
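For reference, here is a minimal sketch of how the new headless mode might be requested with options the library already documents (headless, customFlags, userDataDir), without the changes in this pull request. It assumes that disabling the headless option stops the library from adding its own --headless flag, so that the custom --headless=new flag takes effect. The helper names newHeadlessOptions and fetchTitle are hypothetical, and the package is assumed to be installed through Composer.

```php
<?php

use HeadlessChromium\BrowserFactory;

/**
 * Hypothetical helper: builds createBrowser() options that request
 * Chrome's new headless mode through a custom command-line flag.
 */
function newHeadlessOptions(string $userDataDir): array
{
    return [
        // Assumption: with 'headless' disabled, the library does not add
        // its own --headless flag, so only the custom flag below applies.
        'headless' => false,
        // Chrome's command-line switch for the new headless implementation.
        'customFlags' => ['--headless=new'],
        // Reuse an existing profile so the logged-in session is available.
        'userDataDir' => $userDataDir,
    ];
}

/**
 * Hypothetical helper: opens $url in the new headless mode and
 * returns the document title.
 */
function fetchTitle(string $url, string $userDataDir): string
{
    $browser = (new BrowserFactory())->createBrowser(newHeadlessOptions($userDataDir));

    try {
        $page = $browser->createPage();
        $page->navigate($url)->waitForNavigation();

        return (string) $page->evaluate('document.title')->getReturnValue();
    } finally {
        // Always close the browser so no Chrome process is left behind.
        $browser->close();
    }
}
```

After loading Composer's autoloader with require 'vendor/autoload.php', a call such as fetchTitle('https://example.com', '/path/to/profile') would exercise the sketch; both arguments are placeholders.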

About Tests

I wrote only two tests, as I believe that is enough to cover the new code. However, if you think anything is missing, please let me know, and I will add more tests.

About Documentation

Once this pull request is merged into the main branch, I will add proper documentation for the new headless mode feature.