ulixee / secret-agent

The web scraper that's nearly impossible to block - now called @ulixee/hero
https://secretagent.dev
MIT License

How do I execute JS in context of page? #118

Closed (abhanan93 closed this issue 3 years ago)

abhanan93 commented 3 years ago

In my particular case, I want to execute a jQuery AJAX request inside the browser on my target website. I'm sorry if this is covered in the documentation, but I couldn't find anything about it via a Google search.

calebjclark commented 3 years ago

At the moment we don't allow executing JS in the remote website. Our reasoning is this...

We designed SecretAgent to emulate human users and thus avoid most bot blockers. Directly triggering JavaScript code in a webpage instead of "completing a user action" (i.e. clicking, typing, etc.) is a surefire way to tell the website's bot blockers that you are a bot.

Can the jQuery AJAX request you want to make be triggered through a user's mouse/keyboard event?

abhanan93 commented 3 years ago

> Can the jQuery AJAX request you want to trigger be triggered through a users' mouse/keyboard event?

As far as I know, they can't be triggered through mouse/keyboard events, because the AJAX requests I want to make manually are normally made in hCaptcha's response callback. I want to skip the actual captcha-solving step in the browser and make the follow-up AJAX requests myself, using the solved captcha token I get from a captcha-solving service.

calebjclark commented 3 years ago

Ah, yes! Ok, I'm on the same page!

We need more support for handling captchas, which unfortunately we don't have right now. In the meantime...

Does your jQuery AJAX request require jQuery or can it be a standard Ajax/fetch request within the browser window?

abhanan93 commented 3 years ago

It's a standard AJAX POST request. I guess it can be made without jQuery too. It's just easier with jQuery, and the website already has jQuery loaded.

calebjclark commented 3 years ago

Can you use the standard fetch? https://secretagent.dev/docs/basic-interfaces/tab#fetch
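As a rough sketch of what replacing the jQuery call with a plain fetch-style request might look like, the helper below builds the options for a form-encoded POST carrying the solved captcha token. Everything site-specific here is an assumption: the `h-captcha-response` field name is the one hCaptcha normally uses for form submissions, but the target endpoint may expect something else, and `buildCaptchaPost`, `PostInit`, and the `action` field are hypothetical names made up for illustration.

```typescript
// Hypothetical shape of the options handed to a fetch-style call.
interface PostInit {
  method: "POST";
  headers: Record<string, string>;
  body: string;
}

// Encode key/value pairs as application/x-www-form-urlencoded,
// the content type jQuery.ajax uses for POST data by default.
function encodeForm(fields: Record<string, string>): string {
  return Object.keys(fields)
    .map((key) => `${encodeURIComponent(key)}=${encodeURIComponent(fields[key])}`)
    .join("&");
}

// Build the request options for a POST that carries the solved token.
// ASSUMPTION: the endpoint reads the token from "h-captcha-response".
function buildCaptchaPost(
  token: string,
  extraFields: Record<string, string> = {}
): PostInit {
  const fields: Record<string, string> = { "h-captcha-response": token };
  for (const key of Object.keys(extraFields)) {
    fields[key] = extraFields[key];
  }
  return {
    method: "POST",
    headers: {
      "Content-Type": "application/x-www-form-urlencoded; charset=UTF-8",
      // jQuery adds this header on same-origin AJAX requests.
      "X-Requested-With": "XMLHttpRequest",
    },
    body: encodeForm(fields),
  };
}

// Example: inspect what would be sent.
const exampleInit = buildCaptchaPost("solved-token", { action: "submit" });
console.log(exampleInit.body); // h-captcha-response=solved-token&action=submit
```

The resulting object could then be passed as the second argument to the fetch method linked above, assuming it accepts the usual `method`, `headers`, and `body` options; check the linked docs for the exact signature before relying on this.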

janisblaus commented 3 years ago

This is my first post here so I just wanted to say that your project is surprisingly advanced and I'm falling in love with it.

+1 for JS execution, same case here: I need to set the captcha solution as the value of an input field.

I would love to switch from Puppeteer to secret-agent, but not being able to control some things by evaluating code complicates matters a lot.

Could what I need be done by hacking into the injected scripts from /core/injected-scripts?

blakebyrnes commented 3 years ago

Thanks @janisblaus! Yes, we are aware that this very much complicates converting from Puppeteer/Playwright.

We currently have a "getJsValue" method per frame that you can use to suit your purposes. It will take any JavaScript string. We're planning on changing this name to be more obvious and possibly allowing it to run in an isolated execution context if you want it to. Longer term, we're going to add "setters" to awaited-dom, but it's a medium/large effort.

NOTE: it currently runs in the context of the webpage, so the website can technically intercept it (e.g., by hijacking the querySelector function).
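Since "getJsValue" is described as taking any JavaScript string, one way to approach the captcha-field case is to build that string safely before handing it over. The sketch below only constructs the script text; `buildSetValueScript` is a hypothetical helper, the selector is an example based on the textarea name hCaptcha typically uses, and the exact signature and return behavior of "getJsValue" are not shown in this thread, so the call itself is left out.

```typescript
// Build a self-contained JavaScript expression that sets the value of the
// first element matching `selector`. JSON.stringify turns both inputs into
// quoted, escaped string literals, so quotes in the token cannot break out
// of the generated script.
function buildSetValueScript(selector: string, value: string): string {
  const selectorLiteral = JSON.stringify(selector);
  const valueLiteral = JSON.stringify(value);
  return (
    "(() => {" +
    ` const el = document.querySelector(${selectorLiteral});` +
    " if (!el) return false;" +
    ` el.value = ${valueLiteral};` +
    " return true;" +
    " })()"
  );
}

// Example selector: an assumption, adjust it to the target page's markup.
const captchaScript = buildSetValueScript(
  'textarea[name="h-captcha-response"]',
  "solved-token"
);
console.log(captchaScript);
```

The generated expression evaluates to `true` when the element was found and updated, and `false` otherwise. Because the string runs in the page's own context, the caveat in the note above applies: a page that has replaced `document.querySelector` could observe or interfere with it.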

janisblaus commented 3 years ago

> We currently have a "getJsValue"

Blake, you are awesome, I will jump into testing this right away.

blakebyrnes commented 3 years ago

Supported in new plugins