elvisyjlin / media-scraper

Scrapes all photos and videos in a web page / Instagram / Twitter / Tumblr / Reddit / pixiv / TikTok
MIT License
371 stars, 49 forks

TwitterScraper does not download all images and videos. #11

Open elvisyjlin opened 5 years ago

elvisyjlin commented 5 years ago

I have always tested with the realDonaldTrump account, and mediascraper.twitter works fine on it. However, for other accounts, for instance sigrid_ig, videos are not captured by mediascraper.twitter. The HTML structure seems to differ between accounts, so each case needs to be handled separately.

archmord commented 5 years ago

Why don't you first capture all the pic.twitter.com links, and then download the videos and images from there?
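A minimal sketch of the link-collecting step, assuming the tweet text or page source is already available as a string. The function name `find_pic_links` and the assumed link shape (`pic.twitter.com/` followed by an alphanumeric ID) are illustrative, not part of media-scraper.

```python
import re

# Assumption: links look like "pic.twitter.com/<alphanumeric id>",
# with or without a leading http(s) scheme.
PIC_LINK = re.compile(r"(?:https?://)?pic\.twitter\.com/[A-Za-z0-9]+")


def find_pic_links(text):
    """Return unique pic.twitter.com links in order of first appearance."""
    links = []
    for match in PIC_LINK.findall(text):
        # Normalize bare links so duplicates with and without a scheme collapse.
        link = match if match.startswith("http") else "https://" + match
        if link not in links:
            links.append(link)
    return links
```

Each collected link would still have to be resolved to the actual image or video, which this sketch does not attempt.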

elvisyjlin commented 5 years ago

That sounds like a good idea. I would appreciate it if you gave it a try.

But I want to mention that a Twitter video is not saved as a single link. First, you need to retrieve an m3u8 streaming playlist. Then, combining the .ts video segments it lists gives you the whole Twitter video.
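The two steps could look roughly like the sketch below. It assumes the m3u8 URL is already known and points to a media playlist that lists the .ts segments directly; a master playlist that lists variant playlists would need one more step to pick a variant. The names `segment_urls`, `fetch_bytes`, and `download_video` are hypothetical, and appending the segments byte by byte is assumed to be enough for MPEG-TS.

```python
from urllib.parse import urljoin
from urllib.request import urlopen


def segment_urls(playlist_text, playlist_url):
    """Return absolute URLs of the segments listed in an m3u8 media playlist."""
    urls = []
    for line in playlist_text.splitlines():
        line = line.strip()
        # Lines starting with '#' are tags or comments; the others are URIs.
        if line and not line.startswith("#"):
            urls.append(urljoin(playlist_url, line))
    return urls


def fetch_bytes(url):
    """Download a URL and return its body as bytes."""
    with urlopen(url) as response:
        return response.read()


def download_video(playlist_url, out_path, fetch=fetch_bytes):
    """Fetch the playlist, then append every segment to out_path in order."""
    playlist_text = fetch(playlist_url).decode("utf-8")
    with open(out_path, "wb") as out:
        for url in segment_urls(playlist_text, playlist_url):
            out.write(fetch(url))
```

The `fetch` parameter is there so the download step can be replaced, for example by a session that sends the required headers.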

archmord commented 5 years ago

You can use https://github.com/twintproject/twint to get all the pic.twitter links, then get the https://pbs.twimg.com/media/ URLs from them and save the files.
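As a rough illustration of the last step, the sketch below pulls https://pbs.twimg.com/media/ URLs out of a page's HTML and derives a file name for each. It does not use twint; the HTML is assumed to have been fetched already. The helper names and the assumed character set of the media ID are illustrative guesses.

```python
import posixpath
import re
from urllib.parse import urlparse

# Assumption: a media ID consists of letters, digits, '_' and '-',
# optionally followed by a file extension such as ".jpg".
MEDIA_URL = re.compile(
    r"https://pbs\.twimg\.com/media/[A-Za-z0-9_-]+(?:\.[A-Za-z0-9]+)?"
)


def find_media_urls(html):
    """Return unique pbs.twimg.com/media URLs in order of first appearance."""
    # dict.fromkeys drops duplicates while keeping insertion order.
    return list(dict.fromkeys(MEDIA_URL.findall(html)))


def filename_for(url):
    """Use the last path component of the URL as the local file name."""
    return posixpath.basename(urlparse(url).path)
```

Size suffixes such as `:large` are cut off by the pattern, so the same image referenced at two sizes is only listed once.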