ulixee / hero

The web browser built for scraping
MIT License
696 stars, 33 forks

How to fake a URL/window.location on page load? #162

Open bratao opened 2 years ago

bratao commented 2 years ago

Hello, First of all, thanks for this great project!

I have a local page with some JavaScript code (reCAPTCHA v3). I want to load it and pretend that I'm on a remote URL (the target site). How would I do this with secret-agent? In Puppeteer/Playwright I can intercept a request and return my own content, but I did not find a way to do it using secret-agent.
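For reference, the Puppeteer/Playwright approach mentioned above can be sketched roughly as follows. This is a minimal sketch, not secret-agent code: `page.route` and `route.fulfill` are modeled on Playwright's calls of the same names, while `PageLike`, `RouteLike`, `FulfillOptions`, `serveLocalHtmlAs`, and the URL `https://target.example/` are hypothetical names introduced here so the snippet does not depend on any library.

```typescript
// Hypothetical minimal stand-ins for the parts of Playwright's Page and
// Route that this sketch touches; they are not Playwright's real types.
interface FulfillOptions {
  status: number;
  contentType: string;
  body: string;
}

interface RouteLike {
  fulfill(options: FulfillOptions): Promise<void>;
}

interface PageLike {
  route(url: string, handler: (route: RouteLike) => Promise<void>): Promise<void>;
}

// Registers an interceptor so that a request for `targetUrl` is answered
// with `localHtml` instead of going to the network. After navigating to
// `targetUrl`, the page's scripts see that URL as window.location.
async function serveLocalHtmlAs(
  page: PageLike,
  targetUrl: string,
  localHtml: string,
): Promise<void> {
  await page.route(targetUrl, async (route) => {
    await route.fulfill({
      status: 200,
      contentType: 'text/html',
      body: localHtml,
    });
  });
}
```

With a real Playwright page, the assumed usage would be to call `serveLocalHtmlAs(page, 'https://target.example/', html)` and then `page.goto('https://target.example/')`, so the local HTML loads under the target origin.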

blakebyrnes commented 2 years ago

Thanks! We have this feature in Core (i.e., the backend), but I'm not sure we've exposed it to the client yet.

bratao commented 2 years ago

Awesome @blakebyrnes. In the meantime, can you please point me to how to do this with Core? I'm communicating directly with Core using the WebSocket protocol. I plan to release a Python client soon.

blakebyrnes commented 2 years ago

Wow. Awesome!!

Here's an example of intercepting using the internals: https://github.com/ulixee/secret-agent/blob/efdf434044109e3f25ab6283768bd47ff4bae1d9/core/test/user-profile.test.ts#L312

What you're doing is awesome, but please be aware that our team does not consider the internals "stable APIs", so we will occasionally change them without a corresponding semver bump. We're also underway on SecretAgent 2.0 (https://github.com/ulixee/hero), which will bring some changes as well. As long as you're ok chasing a moving target, no worries :)

blakebyrnes commented 2 years ago

This API isn't actually in use at the moment, but we intend to eventually offer something like it so that you can set a referer for your first "goto" without having to load the referring URL.

https://github.com/ulixee/secret-agent/blob/efdf434044109e3f25ab6283768bd47ff4bae1d9/core/lib/Tab.ts#L269

I like your use case of providing start html!

blakebyrnes commented 2 years ago

I would accept a PR that exposes setOrigin and adds an "html" parameter.

bratao commented 2 years ago

Great @blakebyrnes, I understand that it is a moving target. Good to know that Hero will be the next version. I will try to use it.

I have zero experience with Node/TypeScript/Yarn. I will try to write this PR to expose setOrigin. It looks easy to do, but I need to understand how the dev environment works first.

blakebyrnes commented 1 year ago

NOTE: this feature exists in Agent, but we might need to change it to use a temporary network interceptor instead of the MITM so that it works when the MITM is disabled.