ulixee / hero

The web browser built for scraping
MIT License

Choosing a suitable browser engine for a userAgent #165

Open blakebyrnes opened 3 years ago

blakebyrnes commented 3 years ago

SecretAgent now lets a user provide a UserAgent that we match on to find either "any" agent that fulfills the request (~ chrome > 85) or an exact match ('Mozilla/5.0 (Macintosh; Intel Mac OS X 10_11_6) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/88.0.4324.218 Safari/537.36').
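To make the two matching modes concrete, here is a minimal sketch. It is not SecretAgent's actual selector implementation, and the selector grammar is not specified in this thread. The `UaRequest` type and the `chromeMajorVersion` and `satisfies` helpers are hypothetical names used only for illustration, and the range form is reduced to "Chrome major version greater than N".

```typescript
// Hypothetical types and helpers; not part of SecretAgent or Hero.
type UaRequest =
  | { kind: 'exact'; userAgent: string }
  | { kind: 'range'; browser: 'chrome'; greaterThanMajor: number };

// Pull the Chrome major version out of a user agent string, if present.
function chromeMajorVersion(userAgent: string): number | null {
  const match = /Chrome\/(\d+)\./.exec(userAgent);
  return match ? Number(match[1]) : null;
}

// Check whether a candidate user agent string fulfills the request.
function satisfies(request: UaRequest, candidate: string): boolean {
  if (request.kind === 'exact') {
    // Exact mode: the full string must be identical.
    return candidate === request.userAgent;
  }
  // Range mode: any Chrome whose major version is above the bound.
  const major = chromeMajorVersion(candidate);
  return major !== null && major > request.greaterThanMajor;
}
```

With the user agent string quoted above, the range request for Chrome above 85 and the exact request would both match, since that string reports Chrome 88.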

Current Behavior

Right now, SecretAgent first translates the user agent to a suitable agent and then looks up the browser engine. However, if that engine doesn't exist, SecretAgent throws an error telling you to install that browser.
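The strict lookup described here can be sketched roughly as follows. This is an illustration of the flow only, not the real code: `Engine`, `lookupEngineStrict`, and the error text are all assumptions made for the example.

```typescript
// Hypothetical shape for an installed browser engine.
interface Engine {
  name: string;
  majorVersion: number;
}

// Strict lookup: the engine must match the requested major version exactly,
// otherwise we fail and ask the user to install it.
function lookupEngineStrict(requestedMajor: number, installed: Engine[]): Engine {
  const engine = installed.find(e => e.majorVersion === requestedMajor);
  if (!engine) {
    throw new Error(`Chrome ${requestedMajor} is not installed. Please install that browser.`);
  }
  return engine;
}
```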

Desired Behavior

Future Behavior

Eventually, we should show you warnings when you're running an engine mismatched with a user agent, but allow it to find the closest match. This will work on most sites...
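One possible shape for that fallback is sketched below, assuming "closest" means the smallest difference in major version. The thread does not define it, so treat that as an assumption. `InstalledEngine`, `EngineChoice`, and `chooseClosestEngine` are hypothetical names.

```typescript
// Hypothetical types; not part of the actual emulator plugin.
interface InstalledEngine {
  name: string;
  majorVersion: number;
}

interface EngineChoice {
  engine: InstalledEngine;
  warning?: string; // set only when the engine does not match the request
}

// Pick the installed engine whose major version is nearest to the requested one.
// On a tie, the engine listed first wins.
function chooseClosestEngine(requestedMajor: number, installed: InstalledEngine[]): EngineChoice {
  if (installed.length === 0) {
    throw new Error('No browser engines are installed.');
  }
  const engine = installed.reduce((best, candidate) =>
    Math.abs(candidate.majorVersion - requestedMajor) < Math.abs(best.majorVersion - requestedMajor)
      ? candidate
      : best,
  );
  if (engine.majorVersion === requestedMajor) {
    return { engine };
  }
  return {
    engine,
    warning: `User agent requests Chrome ${requestedMajor}, but running on Chrome ${engine.majorVersion}.`,
  };
}
```

A caller could log `warning` when it is present and continue, instead of failing as the current behavior does.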

blakebyrnes commented 1 year ago

This change will need to happen in Unblocked-web/unblocked/plugins/default-browser-emulator. There's code in there that chooses the appropriate engine for the given user agent. NOTE: we currently only "install" the user agents for the installed browsers. We might want to change that strategy as part of this.

GlenDC commented 1 year ago

Yes please. I so much want this.

Also, slightly related: it would be good if we could document the UA selector language a bit, as I do not think it is documented, is it?

blakebyrnes commented 1 year ago

We definitely need to document the selector language. I had forgotten that I also noticed we have nothing about it. @calebjclark Any thoughts on where this should be in the docs?

We did add data files that go back a large number of Chrome versions in the last release, and newer Chrome versions are being automatically tested as they're added to BrowserStack. You can now install an older Chrome and run on it via the user agent selector (i.e., install @ulixee/chrome-98-0). What we have not yet built is the ability to run Chrome 65 where we emulate headers, but it is otherwise going to look like Chrome 104. Or, similarly, to run Chrome 104 on Chrome 105 while accepting that we don't have full emulation mappings to do so.