rbignon / doctoshotgun

Script to automatically book a vaccine slot on Doctolib in the next seven days.
GNU General Public License v3.0

woob.browser.exceptions.ClientError: 403 Client Error: Forbidden #296

Open GregAce opened 2 years ago

GregAce commented 2 years ago

I get this new error every time I start the script. Maybe an issue with 2FA?

rbignon commented 2 years ago

Hi,

What is the content of ~/.local/share/doctoshotgun/state.json? (you can obfuscate cookies)

Try to remove the file and run doctoshotgun again.
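A minimal sketch of that check, assuming the state file sits at the path given above and holds a flat JSON object. The `summarize_state` helper is hypothetical and not part of doctoshotgun; it prints each key with its value truncated, so cookies are not pasted in full.

```python
import json
from pathlib import Path

# Path taken from the comment above; adjust if your platform stores it elsewhere.
STATE_FILE = Path.home() / ".local" / "share" / "doctoshotgun" / "state.json"


def summarize_state(path, keep=8):
    """Return {key: truncated value} for a JSON state file, or None if absent."""
    path = Path(path)
    if not path.exists():
        return None
    data = json.loads(path.read_text(encoding="utf-8"))
    summary = {}
    for key, value in data.items():
        text = value if isinstance(value, str) else json.dumps(value)
        # Keep only the first few characters so secrets are not disclosed.
        summary[key] = text[:keep] + "..." if len(text) > keep else text
    return summary


if __name__ == "__main__":
    print(summarize_state(STATE_FILE))
    # Uncomment to remove the cached state before the next run (Python 3.8+):
    # STATE_FILE.unlink(missing_ok=True)
```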

GregAce commented 2 years ago

Hi, I removed the file, but it's no better.

state.json seems to be populated with the cookies:

{"cookies": "eJwlj1tT...bQvGTjTqWPoLk+9k6N/0dp+IJ2BsH+w5s/urajLmiomtcgbZ03KG0qS8Y6JpY15IfxEQIfTnokXnPX4wvjcEoa7i/gv1RYK+Pn8BcrBToI="}

rbignon commented 2 years ago

Can you run doctoshotgun with -d and paste all the output?

GregAce commented 2 years ago

Here is the output:

2021-11-30 11:16:05,776:DEBUG:browser::browsers.py:1056:_load_cookies Reloaded cookies from storage
2021-11-30 11:16:05,786:DEBUG:urllib3.connectionpool::connectionpool.py:971:_new_conn Starting new HTTPS connection (1): www.doctolib.fr:443
2021-11-30 11:16:05,973:DEBUG:urllib3.connectionpool::connectionpool.py:452:_make_request https://www.doctolib.fr:443 "GET /sessions/new HTTP/1.1" 403 None
2021-11-30 11:16:06,021:INFO:browser::browsers.py:369:save_response Response saved to dc0ba7839d3146fea77171cef3631eaf
2021-11-30 11:16:06,023:DEBUG:browser::browsers.py:1095:dump_state Stored cookies into storage
Traceback (most recent call last):
  File "G:\Users\Greg\Desktop\Script\doctoshotgun.py", line 908, in <module>
    sys.exit(Application().main())
  File "G:\Users\Greg\Desktop\Script\doctoshotgun.py", line 726, in main
    if not docto.do_login(args.code):
  File "G:\Users\Greg\Desktop\Script\doctoshotgun.py", line 278, in do_login
    self.open(self.BASEURL + '/sessions/new')
  File "G:\Users\Greg\AppData\Local\Programs\Python\Python39\lib\site-packages\woob\browser\browsers.py", line 898, in open
    return super(PagesBrowser, self).open(callback=internal_callback, *args, **kwargs)
  File "G:\Users\Greg\AppData\Local\Programs\Python\Python39\lib\site-packages\woob\browser\browsers.py", line 790, in open
    return super(DomainBrowser, self).open(req, *args, **kwargs)
  File "G:\Users\Greg\AppData\Local\Programs\Python\Python39\lib\site-packages\woob\browser\browsers.py", line 531, in open
    response = self.session.send(preq,
  File "G:\Users\Greg\Desktop\Script\doctoshotgun.py", line 78, in send
    return callback(self, resp)
  File "G:\Users\Greg\AppData\Local\Programs\Python\Python39\lib\site-packages\woob\browser\browsers.py", line 527, in inner_callback
    self.raise_for_status(response)
  File "G:\Users\Greg\AppData\Local\Programs\Python\Python39\lib\site-packages\woob\browser\browsers.py", line 560, in raise_for_status
    raise ClientError(http_error_msg, response=response)
woob.browser.exceptions.ClientError: 403 Client Error: Forbidden

GregAce commented 2 years ago

When connecting to the doctolib URL with a new browser, there is a puzzle captcha to solve before accessing the site.

rbignon commented 2 years ago

wtf

samuelguesnier commented 2 years ago

I get the same error

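├╴ Spécifiez votre situation: 1 [Specify your situation: 1]
│ 1 dose de rappel [1 booster dose]
│ 2 complétion du schéma vaccinal (personnes immunodéprimées uniquement) [2 completion of the vaccination schedule (immunocompromised persons only)]
├╴ Spécifiez votre situation: 1 [Specify your situation: 1]
An unexpected exception of type ClientError occurred. Arguments: ('403 Client Error: Forbidden',)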

samuelguesnier commented 2 years ago

It just worked, maybe the cookie was not valid anymore

GregAce commented 2 years ago

After changing my IP address with a VPN, the captcha is no longer requested and the script works again. So I guess a captcha protection is activated when too many requests are made to doctolib.

seranpion commented 2 years ago

Hello. Commenting even after issue closure.

I am running into the same issue, except that I don't have a VPN at hand. Removing the state.json cache file does not help.

Strangely, I can successfully log into the same account and use the doctolib web portal with a browser. I only ran into a captcha verification once (while using both the script and the browser).

I hope this is just a temporary ban, and that the cooldown is not too long.

rbignon commented 2 years ago

Unfortunately I can't reproduce it. I guess once you have solved the captcha, a cookie is set to prove you are not a bot.

What kind of captcha is it? If possible, the best thing would be to redirect you to the browser when it occurs, let you solve the captcha by hand, and then have you enter a callback URI or something like that in doctoshotgun.
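A rough sketch of that flow, assuming the block shows up as an HTTP 403 on the login page, as in the debug log above. `needs_manual_captcha` and `solve_in_browser` are hypothetical helpers, not existing doctoshotgun functions, and what the user would need to paste back is left open.

```python
import webbrowser

LOGIN_URL = "https://www.doctolib.fr/sessions/new"  # URL seen in the debug log


def needs_manual_captcha(status_code):
    """Heuristic (assumption): treat an HTTP 403 as a captcha wall."""
    return status_code == 403


def solve_in_browser(url, open_browser=webbrowser.open, ask=input):
    """Open `url` in the default browser and wait for the user to confirm.

    Returns whatever the user pastes back (for example a cookie value or a
    callback URI), stripped of surrounding whitespace.
    """
    open_browser(url)
    answer = ask("Solve the captcha in your browser, then paste the value here: ")
    return answer.strip()
```

The `open_browser` and `ask` parameters exist only so the helper can be exercised without a real browser or terminal.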

GregAce commented 2 years ago

I still get the captcha when opening doctolib in a new browser (even in the phone app). The protection seems to be tied to my IP address. I agree with the workaround you propose. Here is the captcha: [image: humain]

GregAce commented 2 years ago

> Strangely, I can successfully log into the same account and use the doctolib web portal with a browser. I only ran into a captcha verification once (while using both the script and the browser).

If you open doctolib in a new private browsing window, the captcha is required every time.

seranpion commented 2 years ago

I had the same captcha challenge.

> If you open doctolib in a new private browsing window, the captcha is required every time.

I suppose once an IP is suspicious, any connection without a "human" cookie is challenged. That would make sense.

> If possible, the best thing would be to redirect you to the browser when it occurs, let you solve the captcha by hand, and then have you enter a callback URI or something like that in doctoshotgun.

Another workaround would be to transfer whatever cookies attest the captcha-challenge success to the script store. But that's much more complicated.

seranpion commented 2 years ago

I just looked at the captcha wall, no apparent redirection. It's a frame on the portal, with captcha-clearing cookies at the end.

This feature seems specifically designed to prevent what this project is doing.

I analysed the challenge and found, from other people's experience described here, that this is going to be a tricky issue (emphasis mine):

> The new method is POST to ?__cf_chl_captcha_tk__=GENERATED_TOKEN. It hands a cf_clearance cookie, allowing the user to bypass captcha, to the accepted device and, as usual, a __cfduid cookie stating the CloudFlare visitor id. The cf_clearance expires 1 day after the cookie was given and is valid for over 1k requests or until CloudFlare forces you to captcha again.

There is indeed a request at the end that yields a cf_clearance token ("cf" as in CloudFlare, of course), which a tech-savvy user could retrieve. It is returned by the post-challenge POST /sessions/new?__cf_chl_captcha_tk__=<some_68_char_code> request in the browser.

I don't know whether the 1k request limit is going to be an issue. The script seems to make only one persistent connection per run, though.

@rbignon as you suggested, the script could do the following:

  1. Request the user to log into and out of the doctolib web portal in his browser (same IP) if not already done.
  2. Prompt the user for the cf_clearance cookie from his browser storage.
  3. Set that cookie on the session before attempting logins.
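The third step could look roughly like this. It is a sketch under assumptions: `apply_clearance` is a hypothetical helper, the `.doctolib.fr` cookie domain is a guess, and the session is assumed to behave like a `requests.Session` (a `cookies.set(name, value, domain=..., path=...)` method and a mutable `headers` mapping). Note that a later comment in this thread reports that recycling `cf_clearance` alone was not sufficient.

```python
def apply_clearance(session, cf_clearance, user_agent, domain=".doctolib.fr"):
    """Copy a browser's Cloudflare clearance cookie onto a requests-style session."""
    if not cf_clearance:
        raise ValueError("empty cf_clearance value")
    # Set the cookie copied from the browser's storage before any login attempt.
    session.cookies.set("cf_clearance", cf_clearance, domain=domain, path="/")
    # Assumption: the clearance may be tied to the browser that earned it,
    # so send the same User-Agent as that browser.
    session.headers["User-Agent"] = user_agent
    return session
```

With the `requests` library this would be called as `apply_clearance(requests.Session(), value, ua)`, where `value` and `ua` are what the user copied from the browser.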

Maybe that issue should be re-opened until this is implemented?

GregAce commented 2 years ago

I have reopened the issue.

rbignon commented 2 years ago

In woob we support anticaptcha. I'll try to use it in doctoshotgun, but it requires subscribing to the service.

However, what you suggest seems fine, but I don't get the captcha here. Can you try to open a PR?

seranpion commented 2 years ago

> However, what you suggest seems fine, but I don't get the captcha here. Can you try to open a PR?

I got a captcha challenge by running the script while already logged in and active in my browser. Maybe you can trigger it too.

But otherwise, I'll try to implement this soon and send a PR if nobody beats me to it.

Side note: The conversation about CloudFlare challenges I linked to refers to a "Privacy Pass" feature.

> Privacy Pass is a Chrome and Firefox browser extension that provides a better visitor experience for Cloudflare-protected websites. For instance, a visitor IP address with poor reputation may receive a Cloudflare captcha page before gaining access to a Cloudflare-protected website. After a single captcha page is solved, Privacy Pass generates tokens for use with Cloudflare websites to prevent frequent captcha. Privacy Pass generates 30 tokens for each solved captcha.

> Privacy Pass allows a user to bypass CAPTCHAs.

I don't know if that can be of any use in that matter.

seranpion commented 2 years ago

OMG, I just took a look at that anti-captcha website… This screams of black-hat business. I'd recommend never relying on that kind of service.

> […] by using our service you are helping thousands of people to feed themselves and their families.

-__-

seranpion commented 2 years ago

It turns out the CF challenge bypass is much trickier than I hoped (more research).

I have committed some work on my fork branch, but as it stands it doesn't work.

I can reproduce the issue by using the Tor network (quite a few bad-reputation IPs there). But sometimes the challenge only appears in the browser, and sometimes only with the script. And recycling the cf_clearance token is not enough when it does (more cookies required?).

So I don't see any solution for the moment. The workaround is to use a proxy or wait to be un-blacklisted by CloudFlare.

seranpion commented 2 years ago

FYI @rbignon I have noticed one weird thing:

When using the https_proxy env variable, the login & 2FA codes seem not to go through the proxy. I noticed this when I left the variable set but the proxy was down. The auth worked but right after that I had a NewConnectionError.

I suppose it's an issue of its own, but I'm unsure of my analysis.
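One way to rule out environment handling is to read the proxy settings once and pass them explicitly. This is a minimal sketch and not a diagnosis of the behaviour described above; `explicit_proxies` is a hypothetical helper, and it relies only on the standard library's `urllib.request.getproxies()`, which reads variables such as `https_proxy`.

```python
import urllib.request


def explicit_proxies():
    """Build a {scheme: proxy_url} mapping from the environment.

    The result can be set explicitly on a requests-style session, for example
    with `session.proxies.update(explicit_proxies())`, instead of relying on
    each code path picking up the environment variables on its own.
    """
    env = urllib.request.getproxies()
    return {scheme: url for scheme, url in env.items() if scheme in ("http", "https")}
```

If the mapping is empty while `https_proxy` is believed to be set, the variable is not visible to the Python process at all.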

rbignon commented 2 years ago

Hm, we use the cloudscraper library, perhaps there is a link.

sly-net commented 2 years ago

Maybe Cloudflare performs TLS fingerprinting? There's a very interesting article about this: https://httptoolkit.tech/blog/tls-fingerprinting-node-js/
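For context, one widely used TLS fingerprint is JA3, which hashes the fields of the ClientHello message in the order the client sends them. The sketch below only illustrates why two clients with identical cookies and headers can still look different; whether Cloudflare uses JA3 or anything like it here is speculation, and the sample values a caller passes in are their own.

```python
import hashlib


def ja3_hash(tls_version, ciphers, extensions, curves, point_formats):
    """Compute a JA3-style fingerprint from ClientHello fields (all integers).

    The five fields are joined with commas, and the values inside each field
    with dashes, before taking the MD5 of the resulting string.
    """
    fields = [
        str(tls_version),
        "-".join(map(str, ciphers)),
        "-".join(map(str, extensions)),
        "-".join(map(str, curves)),
        "-".join(map(str, point_formats)),
    ]
    ja3_string = ",".join(fields)
    return ja3_string, hashlib.md5(ja3_string.encode("ascii")).hexdigest()
```

Because the order of cipher suites and extensions is part of the input, a Python HTTP client and a browser produce different hashes even when they offer similar capabilities.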