drawrowfly / tiktok-scraper

TikTok Scraper. Download video posts, collect user/trend/hashtag/music feed metadata, sign URLs, etc.

Missing records when scraping input of 1K+ records #75

Closed: itpothitech closed this issue 4 years ago

itpothitech commented 4 years ago

Describe the bug: Records are missing from the scrape when the input requests more than 1K records. Roughly 20% to 30% of the records are not scraped, and which ones are missing appears to be random.

I want to scrape feeds that have 1M+ posts, e.g. this song: https://www.tiktok.com/music/Dance-Monkey-6717552289336314629

Screenshots: I have not captured any.

drawrowfly commented 4 years ago

It's mostly because of errors on TikTok's side.

Anyway, I will take a closer look at it.

itpothitech commented 4 years ago

Thanks for your response.

itpothitech commented 4 years ago

Sorry, I am adding one more requirement here. Please let me know if I need to create another issue.

I also need to know whether I can get a link to the post. Right now I can get the link to the video file, but a link to the post itself would be useful. For example, I got the details of the post below, but not the link to it. This is what I am looking for: https://www.tiktok.com/@davidnelmes/video/6741778973954542853

Even if I only get the video ID (6741778973954542853), I believe I can construct the link to the post myself.
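
As a rough illustration of building the link by hand, here is a minimal Python sketch. It assumes the URL pattern seen in the single example above (`https://www.tiktok.com/@<username>/video/<video id>`), which means the account name is needed in addition to the video ID. The function name `build_post_url` is hypothetical and is not part of tiktok-scraper.

```python
def build_post_url(username: str, video_id) -> str:
    """Build a post URL from an account name and a video ID.

    Assumption: the pattern is inferred from the one example link in this
    thread; it is not taken from any official TikTok documentation.
    """
    # Accept the account name with or without a leading "@".
    handle = username.lstrip("@")
    if not handle:
        raise ValueError("username must not be empty")

    # Video IDs are long numeric strings; treat them as text throughout.
    vid = str(video_id).strip()
    if not vid.isdigit():
        raise ValueError(f"video_id must be numeric, got {video_id!r}")

    return f"https://www.tiktok.com/@{handle}/video/{vid}"


print(build_post_url("davidnelmes", "6741778973954542853"))
# prints https://www.tiktok.com/@davidnelmes/video/6741778973954542853
```

Keeping the ID as a string avoids precision loss if the value later passes through a language or format that stores numbers as 64-bit floats.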

drawrowfly commented 4 years ago

In the latest version, webVideoUrl will contain the original video URL.

itpothitech commented 4 years ago

Thanks a lot for the update. I will use it.

itpothitech commented 4 years ago

Sorry, I couldn't find the field "webVideoUrl" in version 1.1.5, e.g. for music ID 6717552289336314629.

drawrowfly commented 4 years ago

Oh, right. Fixed it in 1.1.6.

itpothitech commented 4 years ago

Thanks a lot. I can see it now! :) Not sure if you have had time to look into the initial issue, i.e. records going missing when scraping more than 1K. I even tried using "asyncScraping", but no luck. Do you have any suggestions?

drawrowfly commented 4 years ago

You can now set a timeout to slow down requests and pass proxies as an array:

[
"user:password@127.0.0.1:8080",
"user:password@127.0.0.1:8081",
"127.0.0.1:8082",
"socks5://127.0.0.1:8083",
"socks4://127.0.0.1:8084"
]

This will allow you to download more videos.
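
To make the idea concrete, the following Python sketch shows the general technique of pacing requests and rotating through a proxy list. It is not tiktok-scraper's implementation. The names `normalize_proxy` and `paced_requests` are hypothetical, the round-robin order is a choice made for this sketch, and treating entries without a scheme as plain HTTP proxies is an assumption, not documented library behavior.

```python
import itertools
import time

# Proxy entries in the same string format as the array above.
PROXIES = [
    "user:password@127.0.0.1:8080",
    "user:password@127.0.0.1:8081",
    "127.0.0.1:8082",
    "socks5://127.0.0.1:8083",
    "socks4://127.0.0.1:8084",
]


def normalize_proxy(entry):
    """Return the entry as a URL; assumes HTTP when no scheme is given."""
    return entry if "://" in entry else f"http://{entry}"


def paced_requests(items, proxies, delay_seconds, sleep=time.sleep):
    """Yield (item, proxy) pairs, rotating proxies round-robin.

    Pauses for delay_seconds between consecutive items so that requests
    are spread out instead of being sent in a burst.
    """
    if not proxies:
        raise ValueError("at least one proxy is required")
    rotation = itertools.cycle([normalize_proxy(p) for p in proxies])
    for index, item in enumerate(items):
        if index > 0:
            sleep(delay_seconds)  # no pause before the very first item
        yield item, next(rotation)


if __name__ == "__main__":
    # Stand-in for real work: print which proxy each page would use.
    for page, proxy in paced_requests(range(3), PROXIES, delay_seconds=0.1):
        print(page, proxy)
```

Spreading requests across several proxies and spacing them out lowers the request rate seen from any single address, which is the usual reason this kind of setup reduces dropped responses.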

itpothitech commented 4 years ago

Thanks!! I will try that.