NoahCardoza / CaptchaHarvester

Solve captchas yourself without having to pay for services like 2captcha for use in automated projects.
https://pypi.org/project/captcha-harvester/
MIT License

H-captcha box not displaying #1

Closed shrishayap closed 4 years ago

shrishayap commented 4 years ago

When I try running your script, the h-captcha box itself doesn't display. (I put your h-captcha HTML script where it says self.htmlcode.)

NoahCardoza commented 4 years ago

Implemented it how? In your own script? Which domain were you on? Could you provide an example or more detail?

shrishayap commented 4 years ago

@NoahCardoza I took your script and edited it a little. I removed the sign-in-to-Google feature in the harvester.py file. I also changed the main.py script so that I can execute it all via PyCharm. The code is below:

from harvester import Harvest, load_html_template

def getTokens():
    html_template = load_html_template(
        'hcaptcha', '33f96e6a-38cd-421b-bb68-7806e1764460', 'localhost:5000')

    s = Harvest('http://www.sneakersnstuff.com', html_template)

    while True:
        s.solve()
def startServer():
    server.start('5000')

if __name__ == '__main__':
    getTokens()
    startServer()

I am only interested in getting into SNS right now. When I run main.py, I can't get the h-captcha box to display. Can you please help?

ANOTHER IMPORTANT THING. WHEN I RUN THE SCRIPT ON A VPN, I AM ABLE TO PULL UP THE H-CAPTCHA BOX, BUT AGAIN, I NEED TO HAVE A VPN ON. WHAT IS THE REASON FOR THIS?

NoahCardoza commented 4 years ago

Sorry, I just started school today so I have been a little busy. Interestingly enough I just finished an SNS bot, that's why I was using cloudscraper to begin with.

I wasn't able to get the sign in to Google to work anyways, I'm pretty sure they were blocking it because they detected Selenium, but it was present in the old repo so I let it be. I'm thinking of possibly writing a Chrome extension to provide the form injection on a plain version of Chrome.

As to your problem though:

I now understand why the original author wrapped app.run in server.start in a Thread. That should probably be removed and done outside by whoever is implementing the module.

For one, the server will never start, because getTokens will run infinitely with that while loop. As to why the form isn't showing up, I'm not sure. When I wrote it, I tested it on SNS. Could you show me some screenshots?
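To illustrate the ordering problem described above, here is a minimal sketch, not the repo's actual code. `start_server` and `get_tokens` are hypothetical stand-ins for the blocking server call and the solve loop; the point is only that the blocking server has to be started in a background thread before the loop begins.

```python
import threading
import time


def start_server(ready: threading.Event) -> None:
    """Hypothetical stand-in for a blocking call such as Flask's app.run()."""
    ready.set()          # signal that the "server" is up
    while True:          # blocks forever, like a real server loop
        time.sleep(0.1)


def get_tokens(rounds: int) -> int:
    """Hypothetical stand-in for the solve loop, bounded so the sketch ends."""
    solved = 0
    for _ in range(rounds):
        solved += 1      # a real harvester would call s.solve() here
    return solved


def main(rounds: int = 3) -> int:
    ready = threading.Event()
    # Start the blocking server first, in a daemon thread, so the solve
    # loop cannot prevent it from starting and it does not keep the
    # process alive on exit.
    threading.Thread(target=start_server, args=(ready,), daemon=True).start()
    ready.wait(timeout=5)
    return get_tokens(rounds)


if __name__ == "__main__":
    print(main())
```

In the script posted earlier the order is reversed: getTokens() is called first and never returns, so startServer() is never reached.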

Now about that last bit, are you saying that the hcaptcha only shows up when you are running a VPN?

shrishayap commented 4 years ago

@NoahCardoza, here is what it looks like: [image: Screen Shot 2020-04-14 at 10 26 30 AM]

This is the HTML code by the way

<html>
<meta name='viewport' content='width=device-width, initial-scale=1, maximum-scale=1, user-scalable=no'>

<head>
    <script charset="utf8" src="https://hcaptcha.com/1/api.js"></script>
    <script src='http://ajax.googleapis.com/ajax/libs/jquery/1.7.1/jquery.min.js' type='text/javascript'></script>
    <title>Captcha Harvester</title>
    <style type='text/css'>
        body {
            margin: 1em 5em 0 5em;
            font-family: sans-serif;
        }

        fieldset {
            display: inline;
            padding: 1em;
        }

        #submit {
            color: #ffffff;
            background-color: #3c3c3c;
            border-color: #3c3c3c;
            display: inline-block;
            margin-bottom: 0;
            font-weight: normal;
            text-align: center;
            vertical-align: middle;
            -ms-touch-action: manipulation;
            touch-action: manipulation;
            cursor: pointer;
            background-image: none;
            border: 1px solid transparent;
            white-space: nowrap;
            padding: 8px 12px;
            font-size: 15px;
            line-height: 1.4;
            border-radius: 0;
            -webkit-user-select: none;
            -moz-user-select: none;
            -ms-user-select: none;
            user-select: none;
        }
    </style>
</head>

<body>
    <center>
        <h3>Captcha Token Harvester</h3>
        <form action='http://serveriphere:5000/solve' method='post'>
            <fieldset>
                <div class='g-recaptcha' data-sitekey='4sitekey4' data-callback='sub'></div>
                <p>
                    <input type='submit' value='Submit' id='submit'>
                </p>
            </fieldset>
        </form>
        <fieldset>
            <h5 style='width: 10vh;'> <a style='text-decoration: none;' href='http://serveriphere/json'
                    target='_blank'>Usable Tokens</a> </h5>
        </fieldset>
    </center>
    <script>function sub() { document.getElementById('submit').click(); }</script>
</body>

</html>

I made the sitekey 4sitekey4 and changed the .replace() function, so that is not the issue. I also did the same w/ server ip (made it serveriphere and updated the .replace())

NoahCardoza commented 4 years ago

Interesting. I think they must have changed something since I originally updated the repo.

The problem seems to be with loading the JS when using document.write. If I load the HTML file straight from a browser, it loads the hCaptcha. Of course, it won't work because the domain is incorrect.

It's probably not too complicated of a fix, but with school having just started I need to make sure I'm on top of that before looking into this. I'll keep you posted on what I find.

NoahCardoza commented 4 years ago

After further inspection, I think I found a simple workaround. I'll try to get it implemented into the repo ASAP.

shrishayap commented 4 years ago

@NoahCardoza Thanks man, I'm waiting for the update!

NoahCardoza commented 4 years ago

I looked into it a bit further and found it actually works better and also works for Google's ReCaptchas. At first, it involved replacing the instances of window.location.host in hCaptcha's JS, but I wanted to try it with Google's ReCaptcha as well, which was a little harder, and somehow they wouldn't let me load the captcha, something about an error with the internet connection.

Since window.location can't be spoofed in the browser, the next best option is to trick the browser into thinking it is somewhere it isn't.

I ran python3 -m http.server 80 in /harvester/html and edited my hosts file:

127.0.0.1 sneakersnstuff.com

Then when you navigate to http://sneakersnstuff.com/hcaptcha.html you see: [image] (I edited the HTML a little bit)
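For anyone who would rather start that static file server from a script than from the command line, here is a rough sketch using only the standard library. The `serve_directory` helper and its defaults are assumptions for illustration, not part of the repo. It binds to an OS-chosen free port by default, since binding to port 80 typically needs elevated privileges.

```python
import functools
import http.server
import threading


def serve_directory(directory: str, port: int = 0, host: str = "127.0.0.1"):
    """Serve `directory` over HTTP from a background thread.

    port=0 asks the OS for any free port; pass port=80 to mirror
    `python3 -m http.server 80` (usually requires sudo).
    """
    handler = functools.partial(
        http.server.SimpleHTTPRequestHandler, directory=directory
    )
    server = http.server.ThreadingHTTPServer((host, port), handler)
    threading.Thread(target=server.serve_forever, daemon=True).start()
    return server


if __name__ == "__main__":
    srv = serve_directory(".")
    print("serving on port", srv.server_address[1])
    srv.shutdown()       # stop the serve_forever loop
    srv.server_close()   # release the socket
```

With the hosts entry above in place and the server on port 80, the browser's request for sneakersnstuff.com lands on this local server.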

The only problem now is that I assume you want to connect to sneakersnstuff.com with your bot as you solve the captchas... So, I'm looking into setting up a custom DNS server and pointing Chrome to it, so that when using Chrome you'll get the harvester, but everything else will point to the real sneakersnstuff.com site. I'm assuming Cloudflare will get a little sus if you try to access it by its IP.

I'll try to work on this later tonight/tomorrow.

NoahCardoza commented 4 years ago

Ok, so I just pushed some updates to a new branch.

Normally I would have cleaned things up a bit, but what I have got so far might be helpful to you.

Basically, move into that branch, install the new dependencies, then run server.py:

pipenv install
cd harvester
sudo python server.py

I've hardcoded quite a bit but it should work for you.

Once you start the server, go to http://localhost. Also, make sure you set 127.0.0.1 as a DNS server in your machine's network configuration.

Then fill in the domain of the website and the sitekey: [image]

This will send you to http://www.sneakersnstuff.com/harvest?type=hcaptcha&sitekey=33f96e6a-38cd-421b-bb68-7806e1764460 where you will be able to solve your captchas.

NOTE: The only problem right now is that everything, even your scripts, will be pointed at your own computer when trying to access www.sneakersnstuff.com. You can't simply connect to the real site by IP instead: not all sites, but any site protected by Cloudflare, won't let you access the host by IP.

One workaround could be running this in a VM.

Another, more specific to Python:

import socket

# Map of hostname -> IP to use instead of a real DNS lookup
dns_cache = {}

# Register an IP to override the DNS result for a hostname
def override_dns(domain, ip):
    dns_cache[domain] = ip

# Keep a reference to the original resolver so we can delegate to it
prv_getaddrinfo = socket.getaddrinfo

# Override default socket.getaddrinfo() and pass ip instead of host
# if override is detected. Keyword arguments are forwarded too, since
# some callers pass family/type/proto by name.
def new_getaddrinfo(host, *args, **kwargs):
    if host in dns_cache:
        print("Forcing FQDN: {} to IP: {}".format(host, dns_cache[host]))
        return prv_getaddrinfo(dns_cache[host], *args, **kwargs)
    return prv_getaddrinfo(host, *args, **kwargs)

socket.getaddrinfo = new_getaddrinfo

Before running the harvest server, you can get the IP by running:

> nslookup www.sneakersnstuff.com
Server:     1.1.1.1
Address:    1.1.1.1#53

Non-authoritative answer:
Name:   www.sneakersnstuff.com
Address: 104.18.127.12
Name:   www.sneakersnstuff.com
Address: 104.18.128.12

Then, throw the snippet above into the top of your script and add the following, using the domain and one of the IPs returned:

override_dns('www.sneakersnstuff.com', '104.18.127.12')

This hack works by monkey patching the low-level socket.getaddrinfo function so that, instead of resolving the hostname through the DNS server, it uses the IP we stored in dns_cache with override_dns.
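A variation on the same monkey patch, sketched here as a context manager so the real resolver is restored afterwards. This is an illustrative alternative rather than what the repo does; the `dns_override` name is made up for this example.

```python
import contextlib
import socket


@contextlib.contextmanager
def dns_override(mapping):
    """Temporarily resolve the hostnames in `mapping` to fixed IPs."""
    original = socket.getaddrinfo

    def patched(host, *args, **kwargs):
        # Swap in the pinned IP when one is registered for this host;
        # any other host is passed through untouched.
        return original(mapping.get(host, host), *args, **kwargs)

    socket.getaddrinfo = patched
    try:
        yield
    finally:
        socket.getaddrinfo = original  # always restore the real resolver


if __name__ == "__main__":
    with dns_override({"www.sneakersnstuff.com": "104.18.127.12"}):
        for entry in socket.getaddrinfo("www.sneakersnstuff.com", 443):
            print(entry[4])  # sockaddr tuple carrying the pinned IP
```

Because the lookup of a literal IP never leaves the machine, anything in the same process that resolves names through socket.getaddrinfo reaches the pinned address regardless of the system DNS settings.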

I hope this helps. If you have any questions feel free to ask.

You can also ping me on Discord: MacHacker#7322

Good luck!

shrishayap commented 4 years ago

Hey Noah,

When I run your new code, there is an error that comes up in these lines:

dns_resolver = InterceptResolver('1.1.1.1', 53, '60s', [], [], [], 5)
logger = DNSLogger('-request,-reply,-truncated,-error,-recv,-send,-data')
DNSServer(dns_resolver, port=53, address='127.0.0.1', logger=logger, tcp=False).start_thread()
DNSServer(dns_resolver, port=53, address='127.0.0.1', logger=logger, tcp=True).start_thread()

Specifically, on the third line, I get a "PermissionError: [Errno 13] Permission denied" error. Can you please help with this? I will try to go through the documentation myself.

NoahCardoza commented 4 years ago

Since it needs to access ports 53 and 80, make sure to run it as I showed in the example, with sudo: sudo python server.py
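A quick way to check up front whether the current user may bind those ports is sketched below. On many Unix-like systems ports below 1024 are privileged, which is what produces the Errno 13 above. The `check_port` helper is hypothetical and only probes TCP, while the DNS server in server.py also listens on UDP.

```python
import socket


def check_port(port: int, host: str = "127.0.0.1") -> str:
    """Try to bind a TCP socket to host:port and report what happened."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as sock:
        try:
            sock.bind((host, port))
        except PermissionError:
            return "permission denied"   # privileged port, no sudo
        except OSError:
            return "unavailable"         # e.g. port already in use
        return "ok"


if __name__ == "__main__":
    for port in (53, 80):
        print(port, check_port(port))
```

If either port reports "permission denied", rerun with sudo as described above.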

NoahCardoza commented 4 years ago

Fixed with https://github.com/NoahCardoza/CaptchaHarvester/pull/3