do-me / fast-instagram-scraper

A fast Instagram Scraper based on Torpy.

Downloading images #1

Closed kozeen closed 3 years ago

kozeen commented 3 years ago

Hi! Tried using this scraper after having issues with arc298's scraper, and I was wondering if it's possible to download not just the data, but also the images with it?

Edit: Perhaps it is possible to save only the URLs in the .csv file? Then it should be pretty easy to download the images.

do-me commented 3 years ago

Hi @kozeen, the media URLs are already there in both the .csv and the .json output. If you save as .csv, the column is called "display_url". If you save as .json, by default you get the original metadata, which also includes media links in different resolutions.
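A minimal sketch of pulling those URLs out of a saved .csv, assuming only that the file has a header row containing the "display_url" column mentioned above. The function name `read_media_urls` and the file name in the usage note are hypothetical, not part of the scraper.

```python
import csv


def read_media_urls(csv_path, column="display_url"):
    """Return the non-empty values of `column` from a scraper CSV export.

    `read_media_urls` is an illustrative helper, not part of the scraper.
    """
    with open(csv_path, newline="", encoding="utf-8") as f:
        reader = csv.DictReader(f)
        # Fail early if the export does not have the expected column.
        if reader.fieldnames is None or column not in reader.fieldnames:
            raise KeyError(f"column {column!r} not found in {csv_path}")
        # Skip rows where the URL cell is empty or missing.
        return [row[column] for row in reader if row.get(column)]
```

Something like `urls = read_media_urls("posts.csv")` would then give a plain list to iterate over.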

As you say, you could just iterate over the metadata and download all the media one by one, but most likely it won't be "pretty easy", as you will get blocked if you scrape too fast or too much. If I were you, I would apply the same logic here as Fast Instagram Scraper does: use one Tor end node to download as many pictures/videos/stories as possible until you get blocked, then renew the Tor end node. It should be quite easy to implement in a few lines. If you manage to come up with a working snippet, feel free to share it here! Otherwise I could give it a try in January.
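A rough sketch of that download-until-blocked loop, not the scraper's actual implementation. It assumes a hypothetical zero-argument factory `new_session` that returns an object with a requests-style `get(url)` method (for example, a session routed through a fresh Tor circuit). As a simplification, any exception or non-200 status is treated as "blocked", so a genuinely dead URL would also use up renewals.

```python
import os
from urllib.parse import urlparse


def download_with_renewal(urls, new_session, out_dir, max_renewals=5):
    """Download each URL, swapping in a fresh session whenever a request fails.

    `new_session` is a hypothetical factory; it is called once at the start
    and again after every failed request, up to `max_renewals` times in total.
    """
    os.makedirs(out_dir, exist_ok=True)
    session = new_session()
    renewals = 0
    saved = []
    for i, url in enumerate(urls):
        while True:
            try:
                resp = session.get(url)
                ok = resp.status_code == 200
            except Exception:
                ok = False
            if ok:
                break
            # Assumption: a failure means this end node is blocked.
            if renewals >= max_renewals:
                raise RuntimeError(f"gave up on {url} after {renewals} renewals")
            renewals += 1
            session = new_session()
        # Prefix with the index so files with the same basename do not collide.
        name = os.path.basename(urlparse(url).path) or "media"
        path = os.path.join(out_dir, f"{i:05d}_{name}")
        with open(path, "wb") as f:
            f.write(resp.content)
        saved.append(path)
    return saved
```

The factory is left abstract on purpose: with Torpy it could wrap a session obtained from `TorRequests`, but how the session's lifetime is managed is up to the caller.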

do-me commented 3 years ago

Implemented image downloads. See latest changes!