Cyberes / vitalsource2pdf

Ultra-high quality PDFs from VitalSource.
GNU Affero General Public License v3.0

Recaptcha block #1

Open mlcmc opened 1 year ago

mlcmc commented 1 year ago

reCaptcha verification box shows up after viewing some pages.

Maybe a check for whether a reCaptcha div has appeared could halt the page-scraping method to allow the user to interact with the reCaptcha.
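A rough sketch of that kind of check, using only the Python standard library. It assumes the challenge uses Google's usual widget markup (an element with the `g-recaptcha` class, or an iframe whose `src` contains `recaptcha`); the markup VitalSource actually serves has not been confirmed in this thread. The names `page_has_recaptcha` and `pause_if_recaptcha` are made up for illustration and are not part of the project.

```python
from html.parser import HTMLParser


class _RecaptchaFinder(HTMLParser):
    """Sets `found` when the markup looks like a reCAPTCHA widget (assumed markers)."""

    def __init__(self):
        super().__init__()
        self.found = False

    def handle_starttag(self, tag, attrs):
        # Attribute values can be None (e.g. bare attributes), so normalise them.
        attr = {name: (value or "") for name, value in attrs}
        # Assumption: the widget container carries the "g-recaptcha" class.
        if "g-recaptcha" in attr.get("class", "").split():
            self.found = True
        # Assumption: the challenge is rendered in an iframe served from a recaptcha URL.
        elif tag == "iframe" and "recaptcha" in attr.get("src", "").lower():
            self.found = True


def page_has_recaptcha(html):
    """Return True if the given page source appears to contain a reCAPTCHA."""
    finder = _RecaptchaFinder()
    finder.feed(html)
    return finder.found


def pause_if_recaptcha(html, prompt=input):
    """Block until the user confirms the challenge is solved; return True if we paused."""
    if page_has_recaptcha(html):
        prompt("reCAPTCHA detected. Solve it in the browser, then press Enter to resume... ")
        return True
    return False
```

If the scraper drives a browser through Selenium, the current markup could be passed in as `driver.page_source` before each page is captured. An invisible reCAPTCHA badge may be present on every page, so the markers might need tightening to avoid pausing needlessly.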

Another option would be to implement a reCaptcha bypass, perhaps along the lines of xHossein/PyPasser or teal33t/captcha_bypass.

Cyberes commented 1 year ago

Wow, I never encountered that when I was developing this. Which is weird because I was spamming their book viewer for almost 3 days straight. Did one of their devs see this project and implement the captcha in response? That would be hilarious because the actions of one fucking idiot (me) caused someone at VitalSource to waste their time making their company even more anti-consumer. My classmates are gonna be like "I hate this new captcha" and I'll be giggling stupidly in the background.

I HAVE A MESSAGE FOR ALL THE SHITHEAD DEVELOPERS AT VITALSOURCE: GET A REAL JOB, LOSERS. (yes, I'm talking to you too, John)

Unfortunately I don't have the time to fix this right now. If I am forced to buy a book on VitalSource next semester I'll definitely fix it.

How frequent is the captcha?

mlcmc commented 1 year ago

@Cyberes sure... I'm also not sure I can spend time on it right now, but it'd be interesting to implement this reCaptcha bypass. The verification box shows up about 300 pages into scraping. I'll try to make it happen again to examine it.

mlcmc commented 1 year ago

@Cyberes [two screenshots attached]

mlcmc commented 1 year ago

@Cyberes To quickly check whether a reCAPTCHA is required, look at the response status: the server sends back HTTP code 428 (= reCAPTCHA required).
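A minimal sketch of how a scraper could act on that status code. The 428 value comes from the observation above; everything else is an assumption for illustration: `fetch_page` is a hypothetical helper, `fetch` stands in for whatever call downloads a page image and returns `(status, body)`, and `on_captcha` is whatever pauses for the user (or a bypass) before retrying.

```python
from typing import Callable, Tuple

# Status code reported in this thread when a reCAPTCHA must be solved first.
RECAPTCHA_REQUIRED = 428


def fetch_page(
    fetch: Callable[[], Tuple[int, bytes]],
    on_captcha: Callable[[], None],
    max_attempts: int = 3,
) -> bytes:
    """Fetch one page, pausing and retrying whenever the server answers 428."""
    for _ in range(max_attempts):
        status, body = fetch()
        if status == RECAPTCHA_REQUIRED:
            # Hand control to the user (or a solver), then try the same page again.
            on_captcha()
            continue
        if status != 200:
            raise RuntimeError(f"unexpected HTTP status {status}")
        return body
    raise RuntimeError(f"reCAPTCHA still required after {max_attempts} attempts")
```

Checking the status before treating the body as an image should also avoid the failure that shows up when the response is not a page image at all.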

mlcmc commented 1 year ago

@Cyberes Error when getting a page image while a reCAPTCHA is required: [screenshot attached]

Cyberes commented 9 months ago

okay, I'll see if I have the time to finally crack this

ConAgent94 commented 9 months ago

Hey @Cyberes - had any time or luck yet?