drawrowfly / tiktok-scraper

TikTok Scraper. Download video posts, collect user/trend/hashtag/music feed metadata, sign URL and etc.
Question and Request: Setting cursor location for music request #345

Closed: adrennhoff closed this issue 4 years ago

adrennhoff commented 4 years ago

Hi, first of all, thanks for all the work on this. This is sort of a request and a question in one.

I'm interested in grabbing the posts for some popular songs/music. Some of these songs have tens of millions of videos. Suppose I run either of the following commands:

tiktok-scraper music MUSICID -n 25000 -t csv

or

tiktok-scraper music MUSICID -n 10000 -t csv

In both cases, I get somewhere around 4000-5000 results, so clearly I am getting rate limited. What I do not know is how to solve this problem. For example, if I could get results 1-4000 with one IP address, then 4001-8000 with a second IP address, and so on, the whole thing could be done as a batch request.

But I don't see a way to set the starting value. I've played around with the website's code myself and I see that the maxCursor variable seems to control where you start.
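To make the batching idea concrete, here is a minimal Python sketch of how the ranges could be planned. It is only an illustration: `plan_batches`, `Batch`, and the proxy names are hypothetical, and it assumes the starting value behaves like a simple numeric offset, which may not be how maxCursor actually works on TikTok's side. It does not call tiktok-scraper at all.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Batch:
    start: int  # hypothetical starting offset (assumed role of maxCursor)
    count: int  # number of posts to request in this batch
    proxy: str  # proxy assigned to this batch


def plan_batches(total: int, batch_size: int, proxies: List[str]) -> List[Batch]:
    """Split `total` results into batches and assign proxies round-robin."""
    if total < 0 or batch_size <= 0 or not proxies:
        raise ValueError("need total >= 0, batch_size > 0, and at least one proxy")
    batches = []
    for i, start in enumerate(range(0, total, batch_size)):
        # The last batch may be smaller than batch_size.
        count = min(batch_size, total - start)
        batches.append(Batch(start, count, proxies[i % len(proxies)]))
    return batches


if __name__ == "__main__":
    # 25,000 results in chunks of 4,000, alternating between two placeholder proxies.
    for batch in plan_batches(25000, 4000, ["proxy-a", "proxy-b"]):
        print(batch)
```

Each planned batch would still need some way to pass its starting value to the scraper, which is exactly the piece I can't find.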

Can anyone suggest a way that I can get the 25,000 results I want? I'm happy to use proxies but I'm not sure how to make that work within the confines of the command line.

Any suggestions would be helpful. Thanks!

adrennhoff commented 4 years ago

Just as a follow-up, setting --timeout to a large number makes the request run very slowly (as expected) but does not actually return more results. So --timeout clearly has no real effect on the rate limiting.

drawrowfly commented 4 years ago

Regarding proxy usage

The --proxy-file option points the scraper at a file containing proxies. You can find an explanation of how to use it here: https://github.com/drawrowfly/tiktok-scraper/blob/master/examples/CLI/BatchDownload.md
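As a rough sketch, the command from the question could be combined with --proxy-file like this. The Python helpers `write_proxy_file` and `build_music_command` are hypothetical and only assemble the files and arguments. The one-proxy-per-line file layout and the placeholder addresses are assumptions, so check the linked document for the exact format.

```python
import shlex
import tempfile
from pathlib import Path
from typing import List


def write_proxy_file(path: Path, proxies: List[str]) -> Path:
    """Write proxies to a file. Assumption: one proxy per line."""
    path.write_text("\n".join(proxies) + "\n", encoding="utf-8")
    return path


def build_music_command(music_id: str, count: int, proxy_file: Path) -> List[str]:
    """Build the argument list for the music command from the question, plus --proxy-file."""
    return [
        "tiktok-scraper", "music", music_id,
        "-n", str(count),
        "-t", "csv",
        "--proxy-file", str(proxy_file),
    ]


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        # Placeholder proxy addresses, not real endpoints.
        proxy_file = write_proxy_file(
            Path(tmp) / "proxies.txt", ["127.0.0.1:8080", "127.0.0.1:8081"]
        )
        # Print the command instead of running it; the scraper is not invoked here.
        print(shlex.join(build_music_command("MUSICID", 25000, proxy_file)))
```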

Regarding the 4k results: as far as I know, this was a limit imposed by TikTok, but you can try proxies.