ikeboy / pluralsight-scraper

Pluralsight video downloader
https://www.knyz.org/blog/post/pluralsight-scraper-released/
GNU General Public License v2.0
136 stars, 49 forks

Pluralsight blocked my account #22

Open mhelmi opened 4 years ago

mhelmi commented 4 years ago

image

mhelmi commented 4 years ago

I think you could add a delay of 1 or 2 minutes between video downloads. No matter how long this delay is, I think we can live with it.

azhrzafar commented 4 years ago

My account was also blocked due to high activity. I just increased the delay to 180 seconds (3 minutes) and it works for me.

mhelmi commented 4 years ago

How do I increase the delay? @azhrzafar

Oghenebrume50 commented 4 years ago

@mhelmi in the index.js file, look for the line await wait(30000 * index); and increase the 30000 to any duration you like. The value is in milliseconds, so the default there is 30 seconds, as indicated by the comment.
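For anyone who wants to see the idea in isolation, here is a minimal sketch of a staggered delay. Only the call await wait(30000 * index) comes from the thread; the wait helper body, the DELAY_MS constant, startDelay, downloadAll, and the download callback are all hypothetical names, and the sketch assumes the multiplication by index is meant to space out the start of each download.

```javascript
// Hypothetical helper: resolves after `ms` milliseconds.
// The thread only shows the call site, not how `wait` is defined.
function wait(ms) {
  return new Promise((resolve) => setTimeout(resolve, ms));
}

// Assumed name. The thread says the default is 30000 ms (30 s);
// 180000 ms is the 3-minute value people tried above.
const DELAY_MS = 180000;

// Delay before download number `index` starts: 0, delayMs, 2 * delayMs, ...
function startDelay(index, delayMs = DELAY_MS) {
  return delayMs * index;
}

// `download` is a placeholder for whatever fetches one video.
// Every task is created at once, but each one sleeps for its own
// staggered delay before it starts downloading.
async function downloadAll(videos, download, delayMs = DELAY_MS) {
  await Promise.all(
    videos.map(async (video, index) => {
      await wait(startDelay(index, delayMs));
      await download(video);
    })
  );
}
```

With this shape, changing DELAY_MS is the only edit needed to slow things down. Note that later comments in this thread report being blocked even with longer delays.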

Oghenebrume50 commented 4 years ago

Funny thing: I increased mine to 3 minutes and I got blocked anyway, although it took longer than before. So now I am blocked for the second time and I do not know what to write to Pluralsight. I doubt they will unblock it this time.

azhrzafar commented 4 years ago

How to increase the delay?

@azhrzafar

Anyway, they still block you even if you change the delay to 5 minutes.

vezaynk commented 4 years ago

The good news is that there's still time to create new accounts. The bad news is that the current naive timeout strategy is definitely not working. I'm curious about how they detect it.

I guess we could try both.

Storager commented 4 years ago

Or a random download sequence.
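A random download sequence could be sketched like this, assuming the videos are available as a plain array before downloading starts. The function name shuffled and the optional random parameter are illustrative only and are not taken from the scraper. The technique is a standard Fisher-Yates shuffle.

```javascript
// Returns a new array with the same items in random order
// (Fisher-Yates shuffle). The input array is left untouched.
// `random` defaults to Math.random and can be swapped out for testing.
function shuffled(items, random = Math.random) {
  const result = items.slice();
  for (let i = result.length - 1; i > 0; i--) {
    // Pick a random position in the not-yet-fixed part [0, i].
    const j = Math.floor(random() * (i + 1));
    [result[i], result[j]] = [result[j], result[i]];
  }
  return result;
}
```

Whether the order of requests matters to the detection at all is unknown; this only removes the strictly sequential pattern.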

freerider7777 commented 4 years ago

Also blocked...

freerider7777 commented 4 years ago

It seems they are watching for the presence of additional tracking requests.

Oghenebrume50 commented 4 years ago

Yeah, I think so; they could be watching for a lot of things. Guys, there is now a way to download courses without stress: just use the Pluralsight app's downloader, then decrypt using this https://github.com/mrvogiacu/Decrypt-PluralSight-Videos-GUI

siriokun commented 4 years ago

Try installing puppeteer-extra

vezaynk commented 4 years ago

@siriokun Have you tested how much of an impact the stealth mode has?

siriokun commented 4 years ago

I have tested it and it seems to be working (with stealth mode and the delay increased to 2 minutes)

vezaynk commented 4 years ago

Lovely. Would you like to send a PR with your changes?

siriokun commented 4 years ago

Sure #25

mhelmi commented 4 years ago

Can you take a look at this Python script? It works perfectly. https://github.com/rojter-tech/pluradl.py

Suisse00 commented 4 years ago

As for https://github.com/mrvogiacu/Decrypt-PluralSight-Videos-GUI, be careful: since mid-April Pluralsight has broken their own client (on Windows at least). They can't play back using their own client (uh uh...). They seem to use a new encryption class, and I guess they forgot to switch the playback decryption?

Otherwise, just some feedback from someone who has been experimenting on his own and still has a valid account, even though I downloaded a lot of courses. (Overall I downloaded the equivalent of 300 courses' metadata/videos; I had to download them multiple times.)

I even forgot to add throttling a couple of times (though I was processing one HTTP page at a time, so that acts like throttling).

BTW puppeteer-extra seems nice, thanks!

superdale007 commented 4 years ago

I think Pluralsight might be on to you.

freerider7777 commented 4 years ago

Decrypt is 404...

vezaynk commented 4 years ago

Pluralsight's legal team is upset with me for allowing links to the decrypter to be posted. They also wanted me to take down this project. So let's not link to it anymore, as it gets me unneeded attention.

I'm sure you'll be able to find it online if you look for it enough.

Bigemul commented 4 years ago

I may have a solution to avoid detection by Pluralsight: allow the use of already created cookies to connect. If your program works the same way others do, they detect the fact that it doesn't solve captchas. When they see you trying to connect for the 4th time in a single day, they add a captcha and, if your program can't solve it (it could be done, but it is really time-consuming), they assume that you are using a bot. They also have a list of disposable email address providers to prevent you from using some of them.

To avoid the captcha thing, a VPN service provider would be your first solution but the connection via cookies could definitely help you lay low.

By the way, if you want to run this in the background (e.g. while sleeping), you need to increase the delay between the end of download n and the start of download n+1. A randomized delay between 20 and 40 minutes might do the trick.
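As a rough sketch of that suggestion: the helper below picks a random pause between 20 and 40 minutes and applies it between the end of one download and the start of the next. All names here (MINUTE_MS, randomDelayMs, sleep, downloadSequentially, and the download callback) are hypothetical and do not come from the scraper.

```javascript
const MINUTE_MS = 60 * 1000;

// Random delay in milliseconds, at least minMinutes and below maxMinutes.
// `random` defaults to Math.random and can be swapped out for testing.
function randomDelayMs(minMinutes = 20, maxMinutes = 40, random = Math.random) {
  const minMs = minMinutes * MINUTE_MS;
  const maxMs = maxMinutes * MINUTE_MS;
  return minMs + Math.floor(random() * (maxMs - minMs));
}

// Resolves after `ms` milliseconds.
function sleep(ms) {
  return new Promise((resolve) => setTimeout(resolve, ms));
}

// Downloads one video at a time; the pause is measured from the END of
// download n to the START of download n+1. `download` is a placeholder
// for whatever fetches one video.
async function downloadSequentially(videos, download) {
  for (let n = 0; n < videos.length; n++) {
    await download(videos[n]);
    if (n < videos.length - 1) {
      await sleep(randomDelayMs());
    }
  }
}
```

At an average of 30 minutes per pause, a 50-video course would take roughly a day, so this only makes sense for unattended runs.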

I don't really know how your program works, since I haven't used these programs since last month: I got everything I wanted (including courses that I paid for on HB). But some instructors (hey Kate Gregory) are willing to update their courses, so it might come in handy.

I didn't open a new issue to help you lay low.