QIN2DIM / hcaptcha-challenger

🥂 Gracefully face hCaptcha challenge with MoE(ONNX) embedded solution.
https://docs.captchax.top/
GNU General Public License v3.0
1.45k stars 258 forks

url 'https://api.hcaptcha.com/getcaptcha/' returning base64 instead json #976

Open allerallegro opened 4 months ago

allerallegro commented 4 months ago

This URL now returns what looks like base64 instead of JSON, which raises an error in control.py, line 144.

    async def handler(self, response: Response):
        if response.url.startswith("https://api.hcaptcha.com/getcaptcha/"):
            try:
                data = await response.json()

Exception has occurred: UnicodeDecodeError
'utf-8' codec can't decode byte 0xf7 in position 3: invalid start byte

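To keep a handler like the one above from crashing on a non-JSON body, one option is to decode the raw bytes defensively. This is only a minimal sketch: `try_parse_json` is a hypothetical helper, not part of hcaptcha-challenger, and it does not decode the binary payload; it only reports that the body is not UTF-8 JSON.

```python
import json
from typing import Any, Optional


def try_parse_json(body: bytes) -> Optional[Any]:
    """Return the decoded JSON payload, or None if the body is not UTF-8 JSON.

    Hypothetical helper for illustration; not part of the project.
    """
    try:
        return json.loads(body.decode("utf-8"))
    except (UnicodeDecodeError, json.JSONDecodeError):
        # Binary or otherwise non-JSON payload, such as the bytes that
        # trigger the "invalid start byte" error reported above.
        return None


print(try_parse_json(b'{"key": "value"}'))
print(try_parse_json(b"\x00\x01\x02\xf7"))
```

In a Playwright response handler, the bytes could presumably come from `await response.body()` rather than `response.json()`, so a `None` result can be logged or skipped instead of raising.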
12189108 commented 4 months ago

It doesn't seem to be base64, but another encryption method. Or can you provide me with a demo to decrypt the data?

allerallegro commented 4 months ago

https://accounts.hcaptcha.com/demo?sitekey=c86d730b-300a-444c-a8c5-5312e7a93628 On this page, clicking the checkbox triggers a request (https://api.hcaptcha.com/getcaptcha/c86d730b-300a-444c-a8c5-5312e7a93628) whose response comes back as an octet-stream. Sample attached: data.txt

william9x commented 4 months ago

I'm having the same problem. It also happens when you refresh the challenge, but the response comes back decoded when you click the checkbox again.

QIN2DIM commented 4 months ago

@william9x interesting. I've been working on LLM Agent application stuff for the last couple months.

I haven't put much effort into open source projects lately, but I'll take a look in a couple of days.

I plan to introduce YOLOv9 and an LLM to handle multi-mode challenges. Strive to kill the game.

Mouad-scriptz commented 4 months ago

Simple

// Placeholder: `response` stands for the raw bytes of the getcaptcha response body
const response = new ArrayBuffer()
const responseText = new TextDecoder().decode(response);
const data = JSON.parse(responseText);

12189108 commented 3 months ago

Hello, it seems that the code you provided cannot solve the problem, as it still produces garbled output. Could you please provide a complete demo?

Mouad-scriptz commented 3 months ago

I will, when I have time (probably this week or the next). Update: this seems to be affecting the collector too, https://github.com/QIN2DIM/hcaptcha-challenger/actions/runs/8318149616/job/22759735996

BrynGibson commented 1 month ago

I managed to work around the octet-stream issue by removing application/octet-stream from the Accept header of requests to https://api.hcaptcha.com/getcaptcha/.

I did this with Playwright:

page = await context.new_page()

async def handle_route(route, request):
    headers = request.headers.copy()
    accept_header = headers.get('accept', '')
    if 'application/octet-stream' in accept_header:
        # Remove application/octet-stream from headers
        new_accept_header = ','.join(
            [part for part in accept_header.split(',') if part.strip() != 'application/octet-stream']
        )
        headers['accept'] = new_accept_header
        await route.continue_(headers=headers)
    else:
        await route.continue_()

await page.route('https://api.hcaptcha.com/getcaptcha/*', handle_route)

#await stealth_sync(page)
agent = prelude(page)

The agent has still failed on every task attempted so far, but that is a different issue.
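The header filtering inside `handle_route` can be pulled out into a small pure function, which makes it easy to check without a browser. The sketch below is an illustration only: `strip_media_type` is a hypothetical name, and unlike the snippet above it also drops entries that carry parameters such as `;q=0.9`.

```python
def strip_media_type(accept: str, media_type: str = "application/octet-stream") -> str:
    """Return the Accept header value with every entry for media_type removed.

    Hypothetical helper for illustration; not part of the project.
    """
    kept = []
    for part in accept.split(","):
        # Compare only the media type itself, ignoring parameters like ";q=0.9".
        name = part.split(";", 1)[0].strip().lower()
        if name and name != media_type:
            kept.append(part.strip())
    return ", ".join(kept)


print(strip_media_type("application/json, application/octet-stream"))
```

Inside the route handler, the result would replace `headers['accept']` before calling `route.continue_(headers=headers)`. If the original header lists only application/octet-stream, the function returns an empty string, so the caller may want to substitute a fallback such as `application/json` in that case.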