elvisyjlin / media-scraper

Scrapes all photos and videos in a web page / Instagram / Twitter / Tumblr / Reddit / pixiv / TikTok
MIT License
371 stars 49 forks

Twitter media download in subfolders #8

Closed archmord closed 5 years ago

archmord commented 5 years ago

Hi, I tried downloading Twitter images, but they are saved in nested subfolders of the download folder, like this: (media-scraper\download\twitter\username\username\username\username). Is there any way to fix this and download all images into one folder? Thanks.

elvisyjlin commented 5 years ago

Hi, (media-scraper\download\twitter\username\username\username\username) is a bug. I've revised mediascrapers to save images in the correct path (media-scraper\download\twitter\username). Please git pull and try again. If you'd like to save all images to the same folder, such as ./my_twitter_images, please make the following modifications:

  1. Change this line to scraper.download(tasks=tasks, path='CUSTOM_PATH'), where CUSTOM_PATH is your desired path. E.g. my_twitter_images.
  2. Replace username with None in these lines (1, 2, 3, 4).
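
To illustrate why the two changes above put everything in one folder, here is a minimal sketch of how a download root and an optional subfolder might be combined into a target path. The helper `resolve_target` is hypothetical and is not part of mediascrapers; it only assumes that a subfolder of None means "no extra directory level".

```python
import os


def resolve_target(path, subfolder, filename):
    """Hypothetical helper: join the download root, an optional subfolder, and the filename."""
    parts = [path]
    if subfolder is not None:
        # A subfolder (e.g. the username) adds one directory level.
        parts.append(subfolder)
    parts.append(filename)
    return os.path.join(*parts)


# Per-user layout: the username is used as the subfolder.
per_user = resolve_target("download/twitter", "username", "photo.jpg")

# Flat layout: custom path plus None as the subfolder.
flat = resolve_target("my_twitter_images", None, "photo.jpg")
```

With a custom path and None in place of username, every file would land directly in my_twitter_images.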

archmord commented 5 years ago

I did git pull again, and now it saves everything in one folder. But it does not download videos. Any idea why?

elvisyjlin commented 5 years ago

Hi, I tried an account and I got both images and videos. Can you provide me a Twitter account to reproduce the problem?

By the way, most videos are saved with a .ts extension.

archmord commented 5 years ago

Does it download videos into the same folder as the images, or into a different folder? The Twitter account I tried, which only gave me images, is @sigrid_ig. By the way, is it possible to use the pic.twitter link as the name of the downloaded image, so that it is easy for me to find the post?
I also tried another account, which has around 1390 photos and videos, but it only downloaded 43 pictures.

elvisyjlin commented 5 years ago

Please see scrape() in TwitterScraper here. We first scroll to load the whole page, then retrieve the images and videos and save them as tasks. The format of tasks is [(url, subfolder, filename), ...], so you can set the subfolder depending on the media type. You can also name each downloaded file by parsing the link.
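
As a rough sketch of that idea, the snippet below rewrites a list of tasks in the (url, subfolder, filename) format so that the subfolder depends on the media type and the filename comes from the link. The function `retarget`, the `VIDEO_EXTENSIONS` set, the folder names "images" and "videos", and the example.com URLs are all assumptions made for illustration; none of them come from mediascrapers.

```python
import posixpath
from urllib.parse import urlparse

# Assumed set of extensions to treat as video; adjust to what you actually see.
VIDEO_EXTENSIONS = {".ts", ".mp4"}


def retarget(tasks):
    """Hypothetical helper: choose subfolder by media type, filename from the URL."""
    new_tasks = []
    for url, _subfolder, filename in tasks:
        # Take the last path segment of the link, ignoring any query string.
        name = posixpath.basename(urlparse(url).path) or filename
        ext = posixpath.splitext(name)[1].lower()
        subfolder = "videos" if ext in VIDEO_EXTENSIONS else "images"
        new_tasks.append((url, subfolder, name))
    return new_tasks


# Made-up tasks in the (url, subfolder, filename) format.
tasks = [
    ("https://example.com/media/abc123.jpg", "username", "1.jpg"),
    ("https://example.com/video/clip42.ts?tag=1", "username", "2.ts"),
]
tasks = retarget(tasks)
```

The rewritten list could then be passed to scraper.download(tasks=tasks, path='CUSTOM_PATH') as in the earlier comment, assuming the downloader creates the subfolders it is given.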

Thank you for providing the account. I have observed that the Twitter web page behaves differently between accounts. For example, my testing account is realDonaldTrump, and I get both images and videos from it. However, something is different in other accounts. I need time to figure it out.

archmord commented 5 years ago

I don't know why it's behaving like this. As you said, it downloads correctly for realDonaldTrump but not for sigrid_ig.

elvisyjlin commented 5 years ago

I'll open a new issue for this problem.