ad-m / python-anticaptcha

Client library for solving captchas with Anticaptcha.com support.
http://python-anticaptcha.readthedocs.io/en/latest/
MIT License

How to deal with task with no form submission and use callbacks #21

Closed kadnan closed 5 years ago

kadnan commented 6 years ago

Referring to this, how can I achieve it? The page I want to solve does not show a typical captcha.

Thanks

ad-m commented 6 years ago

I am sorry, I missed your inquiry. I will look at it on the weekend and look for a solution!

Do you have an example website that uses it this way?

kadnan commented 6 years ago

It happens on Zillow.com, no page as such atm but it seems it uses this: https://developers.google.com/recaptcha/docs/display#explicit_render

ad-m commented 6 years ago

I don't see any important differences when explicit rendering is used. If you explicitly render the reCAPTCHA widget, you have to:

  1. find the sitekey of the website in some way (look at the source code or analyse the network traffic) - see https://anticaptcha.atlassian.net/wiki/spaces/API/pages/9666575/Reproducing+Recaptcha+validation+without+digging+the+HTML+source ,
  2. send the sitekey to Anti-captcha.com using our library,
  3. send the solution received from Anti-captcha.com to the target website.
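
As a rough illustration of step 1, here is a minimal sketch that pulls the sitekey (and the optional callback name) out of a page's markup with the standard library only. The class and function names are made up for this example and are not part of python-anticaptcha. It only covers widgets declared in HTML with a data-sitekey attribute; with explicit rendering the sitekey may instead be passed to grecaptcha.render in JavaScript, in which case you would need to read the script or the network traffic.

```python
from html.parser import HTMLParser


class RecaptchaWidgetFinder(HTMLParser):
    """Hypothetical helper: collect reCAPTCHA widget attributes from HTML."""

    def __init__(self):
        super().__init__()
        self.widgets = []

    def handle_starttag(self, tag, attrs):
        attributes = dict(attrs)
        classes = (attributes.get("class") or "").split()
        # Only elements marked as reCAPTCHA widgets that carry a sitekey.
        if "g-recaptcha" in classes and attributes.get("data-sitekey"):
            self.widgets.append({
                "sitekey": attributes["data-sitekey"],
                # data-callback is optional, so this may be None.
                "callback": attributes.get("data-callback"),
            })


def find_recaptcha_widgets(html):
    """Return a list of {"sitekey": ..., "callback": ...} dicts."""
    finder = RecaptchaWidgetFinder()
    finder.feed(html)
    return finder.widgets
```

The sitekey found this way, together with the page URL, is what step 2 sends to Anti-captcha.com.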

I am unable to reproduce the captcha on Zillow.com, so I am unable to provide any snippet to help you. If you have any real website, I can write a snippet that integrates our library for you.

kadnan commented 6 years ago

The issue is that there is no hidden or visible form submission. It shows a captcha message, and the message fades away once the site decides you are not a bot. They are using the PerimeterX service to block bots.

ad-m commented 6 years ago

@kadnan , I believe you can still simulate that using Selenium. See https://github.com/ad-m/python-anticaptcha/issues/19 for an example integration.

kadnan commented 6 years ago

@ad-m

Are you referring to this?

# Inject response in webpage
driver.execute_script('document.getElementById("g-recaptcha-response").innerHTML = "%s"' % response)

# Wait a moment to execute the script (just in case).
time.sleep(1)

# Press submit button
driver.find_element_by_xpath('//button[@type="submit" and @class="btn-std"]').click()

ad-m commented 6 years ago

@kadnan , yes, let's try to use it if you can.

kadnan commented 6 years ago

There is no Submit button, hence driver.find_element_by_xpath('//button[@type="submit" and @class="btn-std"]').click() is not valid in my case. The issue is how to send the solved response.

ad-m commented 6 years ago

@kadnan the solved response should be sent in the standard manner in which the application expects it. If you see such a message in your browser, you can use the Network Monitor to capture the correct request. Then you have to simulate that request in the script.

kadnan commented 6 years ago

@ad-m

I finally found the html of the zillow page with Captcha. Here is the Dropbox link

https://www.dropbox.com/s/mo5cpitd3hl72qz/zillow_captcha.zip?dl=0

kadnan commented 6 years ago

@ad-m you there?

ad-m commented 6 years ago

@kadnan, in your source code you should implement this part of the JavaScript code as Python code:

function handleCaptcha(response) {
    var vid = getQueryString("vid"); // returns the "vid" parameter from the query string of the current page
    var uuid = getQueryString("uuid"); // returns the "uuid" parameter from the query string of the current page
    var name = '_pxCaptcha';
    var cookieValue =  btoa(JSON.stringify({r:response,v:vid,u:uuid})); // build an object from the values above, serialize it to JSON, then base64-encode it
    var cookieParts = [name, '=', cookieValue, '; path=/'];  // store that value in a cookie named "_pxCaptcha"
    cookieParts.push('; domain=' + window.location.hostname);
    cookieParts.push('; max-age=10'); // expire after 10 seconds
    document.cookie = cookieParts.join('');
    var originalURL = getOriginalUrl("url");  // returns the decoded "url" parameter from the URL (it's a relative path)
    var originalHost = window.location.host;
    var newHref = window.location.protocol + "//" + originalHost;
    originalURL = originalURL || '/';
    newHref = newHref + originalURL; // convert relative path to absolute 
    window.location.href = newHref; // Redirect user
}
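
A rough Python equivalent of the JavaScript above might look like the sketch below. It is an illustration only: the function name build_px_captcha_cookie is invented here, and it assumes that getQueryString and getOriginalUrl simply read (and URL-decode) the "vid", "uuid" and "url" query parameters of the captcha page, as the comments in the JavaScript suggest. The compact JSON separators are used so the output matches what JSON.stringify produces.

```python
import base64
import json
from urllib.parse import parse_qs, urlsplit


def build_px_captcha_cookie(token, captcha_page_url):
    """Return (cookie_name, cookie_value, redirect_url) for a solved captcha.

    token            -- the g-recaptcha-response value from the solving service
    captcha_page_url -- full URL of the page that showed the captcha
    """
    parts = urlsplit(captcha_page_url)
    query = parse_qs(parts.query)  # values are URL-decoded lists

    # Assumption: a missing parameter becomes an empty string.
    vid = query.get("vid", [""])[0]
    uuid = query.get("uuid", [""])[0]

    # Same key order and compact form as JSON.stringify({r:..., v:..., u:...}).
    payload = json.dumps({"r": token, "v": vid, "u": uuid}, separators=(",", ":"))
    cookie_value = base64.b64encode(payload.encode("utf-8")).decode("ascii")

    # The "url" parameter holds a relative path; fall back to "/" like the JS.
    original_path = query.get("url", ["/"])[0] or "/"
    redirect_url = f"{parts.scheme}://{parts.netloc}{original_path}"

    return "_pxCaptcha", cookie_value, redirect_url
```

With an HTTP client such as requests, the returned cookie could be set on the session for the site's hostname with path "/", followed by a GET of redirect_url. The original cookie lives for only 10 seconds, so that request should be sent right away.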

More or less, it means that you should update your cookie and send a request to the new URL. The handleCaptcha function is referenced in the following part of the page:

<div class="g-recaptcha" data-theme="white" data-callback="handleCaptcha" data-sitekey="6Lcj-R8TAAAAABs3FrRPuQhLMbp5QrHsHufzLf7b">
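
If the page is driven through Selenium, another option is to call the function named in data-callback directly with the solved token instead of looking for a submit button. The helper below is a hypothetical sketch, not part of python-anticaptcha; it assumes the callback is a global function reachable as a property of window, and it relies only on the driver's execute_script method.

```python
def invoke_recaptcha_callback(driver, token, callback_name="handleCaptcha"):
    """Call the page's reCAPTCHA callback with the solved token.

    driver        -- a Selenium WebDriver (anything with execute_script)
    token         -- the g-recaptcha-response value from the solving service
    callback_name -- the value of the widget's data-callback attribute
    """
    # The name and token are passed as script arguments instead of being
    # formatted into the JavaScript source, which avoids quoting problems.
    script = "return window[arguments[0]](arguments[1]);"
    return driver.execute_script(script, callback_name, token)
```

On this page the callback sets the cookie and redirects, so the browser should navigate to the original URL after the call.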

According to the Google documentation, data-callback means:

Optional. The name of your callback function, executed when the user submits a successful response. The g-recaptcha-response token is passed to your callback.

kadnan commented 6 years ago

OK, I will check this out and get back to you.

BTW, did you check this?

https://webmasters.googleblog.com/2018/10/introducing-recaptcha-v3-new-way-to.html

Will it have an impact on the existing library?

ad-m commented 6 years ago

@kadnan , at the moment it's out of scope for this library, because the upstream provider, Anti-captcha.com, doesn't support it yet. See the relevant question at https://anti-captcha.com/clients/help/noproblem/topic/268

ad-m commented 5 years ago

Closed due to no response; a solution was provided.

ad-m commented 5 years ago

The latest release added support for Recaptcha V3: https://python-anticaptcha.readthedocs.io/en/latest/changes.html#id1