subzeroid / instagrapi

🔥 The fastest and powerful Python library for Instagram Private API 2024
https://hikerapi.com/p/bkXQlaVe
MIT License

Solving Recaptcha challenge #1639

Open Altimis opened 10 months ago

Altimis commented 10 months ago

Is your feature request related to a problem? Please describe. Hi team, thank you for this brilliant library. I'm having trouble with a reCAPTCHA challenge that appears after I log in and fetch recent hashtag posts 5 or 6 times. Once I reach a certain iteration, I get the following error:

got exception for while loop : Traceback (most recent call last):
  File "/home/yassine/anaconda3/envs/smdashboard/lib/python3.8/site-packages/instagrapi/mixins/private.py", line 360, in _send_private_request
    response.raise_for_status()
  File "/home/yassine/anaconda3/envs/smdashboard/lib/python3.8/site-packages/requests/models.py", line 1021, in raise_for_status
    raise HTTPError(http_error_msg, response=self)
requests.exceptions.HTTPError: 400 Client Error: Bad Request for url: https://i.instagram.com/api/v1/tags/spotify/sections/

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/home/yassine/anaconda3/envs/smdashboard/lib/python3.8/site-packages/instagrapi/mixins/private.py", line 527, in private_request
    self._send_private_request(endpoint, **kwargs)
  File "/home/yassine/anaconda3/envs/smdashboard/lib/python3.8/site-packages/instagrapi/mixins/private.py", line 393, in _send_private_request
    raise ChallengeRequired(**last_json)
instagrapi.exceptions.ChallengeRequired: challenge_required

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/home/yassine/anaconda3/envs/smdashboard/lib/python3.8/site-packages/instagrapi/mixins/private.py", line 360, in _send_private_request
    response.raise_for_status()
  File "/home/yassine/anaconda3/envs/smdashboard/lib/python3.8/site-packages/requests/models.py", line 1021, in raise_for_status
    raise HTTPError(http_error_msg, response=self)
requests.exceptions.HTTPError: 400 Client Error: Bad Request for url: https://i.instagram.com/api/v1/tags/spotify/sections/

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "instagrapi_test.py", line 220, in get_all_user_ids
    while True:
  File "instagrapi_test.py", line 209, in get_user_ids
    return user_ids.union(get_user_ids(medias)), cursor
  File "/home/yassine/anaconda3/envs/smdashboard/lib/python3.8/site-packages/instagrapi/mixins/hashtag.py", line 273, in hashtag_medias_v1_chunk
    result = self.private_request(
  File "/home/yassine/anaconda3/envs/smdashboard/lib/python3.8/site-packages/instagrapi/mixins/private.py", line 546, in private_request
    return self._send_private_request(endpoint, **kwargs)
  File "/home/yassine/anaconda3/envs/smdashboard/lib/python3.8/site-packages/instagrapi/mixins/private.py", line 393, in _send_private_request
    raise ChallengeRequired(**last_json)
instagrapi.exceptions.ChallengeRequired: challenge_required

I logged in to the account manually through the same proxy and noticed that I had to confirm that I'm human ("Help us confirm it's you"): [image]

I tried keeping the same device settings (user agent, device, etc.) and the same proxy for all connections, but it didn't help.

Describe the solution you'd like. I noticed this in challenge.py:

elif challenge_type == "RecaptchaChallengeForm":
            """
            Example:
            {'message': '',
            'challenge': {
            'challengeType': 'RecaptchaChallengeForm',
            'errors': ['Неправильная Captcha. Попробуйте еще раз.'],  # "Incorrect Captcha. Please try again."
            'experiments': {},
            'extraData': None,
            'fields': {'g-recaptcha-response': 'None',
            'disable_num_days_remaining': -60,
            'sitekey': '6LebnxwUAAAAAGm3yH06pfqQtcMH0AYDwlsXnh-u'},
            'navigation': {'forward': '/challenge/32708972491/CE6QdsYZyB/',
            'replay': '/challenge/replay/32708972491/CE6QdsYZyB/',
            'dismiss': 'instagram://checkpoint/dismiss'},
            'privacyPolicyUrl': '/about/legal/privacy/',
            'type': 'CHALLENGE'},
            'status': 'fail'}
            """
            raise RecaptchaChallengeForm(". ".join(challenge.get("errors", [])))

I assume that this challenge is not handled yet. Is there a way to solve this captcha automatically? Otherwise, how can I avoid triggering it; is there a trick? Thank you.
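
The example payload in the docstring quoted above already carries the pieces a solver would need. As a minimal sketch, assuming a real response has the same shape as that example, the sitekey and the challenge paths could be pulled out like this (`extract_recaptcha_fields` is a hypothetical helper, not part of instagrapi):

```python
def extract_recaptcha_fields(payload):
    """Return sitekey, forward path and errors from a challenge payload.

    Hypothetical helper: it only assumes the dict shape shown in the
    docstring example from challenge.py. Returns None for other types.
    """
    challenge = payload.get("challenge") or {}
    if challenge.get("challengeType") != "RecaptchaChallengeForm":
        return None
    fields = challenge.get("fields") or {}
    navigation = challenge.get("navigation") or {}
    return {
        "sitekey": fields.get("sitekey"),
        "forward": navigation.get("forward"),
        "errors": challenge.get("errors", []),
    }
```

The returned `sitekey` is the value a service such as 2captcha asks for; whether `forward` is the right place to submit the solved token is not confirmed anywhere in this thread.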

vltclz commented 8 months ago

Have a look at the 2captcha service and analyse the POST request made when you submit the reCAPTCHA to IG; from that you should be able to implement something that handles this automatically 😉 The downside is that it isn't free, but solving a reCAPTCHA costs around $0.001, so it's really fine as long as you throttle your account so that it triggers the challenge less often.
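
One way to make an account trigger the challenge less often is to space out the requests. The sketch below is only an illustration of that idea: `throttled` is a hypothetical helper, the delay bounds are arbitrary assumptions, and nothing in this thread confirms which pacing (if any) keeps Instagram from raising the challenge.

```python
import random
import time


def throttled(items, min_delay=2.0, max_delay=6.0, sleep=time.sleep):
    """Yield items one by one, pausing a random interval between them.

    The delay bounds are placeholders, not values known to be safe.
    `sleep` is injectable so the pacing can be tested without waiting.
    """
    for index, item in enumerate(items):
        if index:  # no pause before the first item
            sleep(random.uniform(min_delay, max_delay))
        yield item
```

For example, a loop over hashtags could be written as `for tag in throttled(tags): ...`, with the API call made inside the loop body.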

oleksandrtur commented 6 months ago

I convert the token to a cookie, use the chrome driver to add the cookie and solve the captcha via 2captcha.

Altimis commented 6 months ago

Thank you @SashaTur @vltclz for your responses. I would like to use 2captcha, but I'm struggling to implement it within instagrapi. @SashaTur I see that your solution is independent of instagrapi. What is the "token" that you are referring to, and how do I convert it to a cookie? Thanks.

Altimis commented 6 months ago

Here is an update on what I've accomplished so far:

challenge_url = https://www.instagram.com/api/v1/challenge/web/?next=https%3A%2F%2Fwww.instagram.com%2F%3F__coig_challenged%3D1
sitekey = 6Lc9qjcUAAAAADTfJq5kJMjN9aD1lxpRLMnCS2TG

Then I use the 2captcha API to solve this challenge:

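    from twocaptcha import TwoCaptcha  # pip install 2captcha-python

    # The function name and signature are illustrative; the original snippet was a bare function body.
    def solve_recaptcha(api_key, sitekey, challenge_url):
        solver = TwoCaptcha(api_key)
        try:
            result = solver.recaptcha(sitekey=sitekey, url=challenge_url)
            return result['code']  # the g-recaptcha-response token
        except Exception as e:
            print(f"Error solving reCAPTCHA: {e}")
            return None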

But I'm stuck at this point: once I have the token, I don't know how to send it back to the IG API (or maybe the Google API). Can anyone help, please? This step is very important for scaling this project. Thanks in advance.

vltclz commented 6 months ago

I think you should simply do a POST request like this:

requests.post(
    "https://www.instagram.com/api/v1/challenge/web/action/",
    data={
        "g-recaptcha-response": response["code"],
        "next": "https://www.instagram.com/?__coig_challenged=1",
    },
)

Altimis commented 6 months ago

@vltclz I tried it, but it doesn't work. When I send the POST request with the provided URL and data (g-recaptcha-response and next), nothing happens. If the reCAPTCHA had been solved and submitted to the IG API successfully, I shouldn't be shown a reCAPTCHA challenge again the next time I log in to the IG account.

oleksandrtur commented 6 months ago

In reply to @Altimis ("What is the 'Token' that you are referring to and how to convert it to cookie please?"):

The auth data is available in the settings dump:

cookies = [
 {
 "domain": ".instagram.com",
 "expiry": time_s,
 "httpOnly": False,
 "name": "ds_user_id",
 "path": "/",
 "sameSite": "Lax",
 "secure": True,
 "value": account_json['authorization_data']['ds_user_id']
 },
 {
 "domain": ".instagram.com",
 "expiry": time_s,
 "httpOnly": True,
 "name": "sessionid",
 "path": "/",
 "sameSite": "Lax",
 "secure": True,
 "value": account_json['authorization_data']['sessionid']
 },
 {
 "domain": ".instagram.com",
 "expiry": time_s,
 "httpOnly": False,
 "name": "mid",
 "path": "/",
 "sameSite": "Lax",
 "secure": True,
 "value": account_json['mid']
 }
]
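
The list above leaves `time_s` and `account_json` undefined. A minimal sketch of how it might be built, assuming `account_json` is the saved settings dump with the keys used above and that `time_s` is a Unix timestamp in seconds some time in the future (the thread does not say which value was actually used; `build_selenium_cookies` and `ttl_seconds` are hypothetical names):

```python
import time


def build_selenium_cookies(account_json, ttl_seconds=30 * 24 * 3600):
    """Build cookie dicts in the shape shown above from a settings dump."""
    # Assumption: "expiry" is seconds since the epoch; the 30-day default is arbitrary.
    time_s = int(time.time()) + ttl_seconds
    auth = account_json["authorization_data"]
    # (cookie name, value, httpOnly flag), matching the three entries above.
    specs = [
        ("ds_user_id", auth["ds_user_id"], False),
        ("sessionid", auth["sessionid"], True),
        ("mid", account_json["mid"], False),
    ]
    return [
        {
            "domain": ".instagram.com",
            "expiry": time_s,
            "httpOnly": http_only,
            "name": name,
            "path": "/",
            "sameSite": "Lax",
            "secure": True,
            "value": value,
        }
        for name, value, http_only in specs
    ]
```

Each dict could then be passed to Selenium's `driver.add_cookie(...)` once the browser has already opened a page on instagram.com, since Selenium only accepts cookies for the domain that is currently loaded.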

Solving with chrome driver

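import time
from selenium.webdriver.common.by import By

# Locate the reCAPTCHA iframe and keep its URL for the solver.
iframe = driver.find_element(By.CSS_SELECTOR, "iframe[id='recaptcha-iframe']")
url = iframe.get_attribute("src")
driver.switch_to.frame(iframe)

# Inside the iframe, read the site key from the reCAPTCHA widget.
site_key = driver.find_element(By.CLASS_NAME, "g-recaptcha").get_attribute("data-sitekey")

token = get_captcha_solve(url, site_key)  # 2captcha integration (helper not shown)
# Pass the token as an argument so quotes in it cannot break the script.
driver.execute_script("successCallback(arguments[0]);", token)

time.sleep(1)

driver.switch_to.default_content()

# Click the first button on the page to continue.
driver.find_elements(By.CSS_SELECTOR, "*[role=button]")[0].click()

brunobely commented 3 months ago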

@oleksandrtur thanks for sharing. How do you get the URL to point the chrome driver to? And where do you get time_s (which I'm assuming is a timestamp) from?