Below is what you can do with this program:
crawler.py
liker.py
This crawler could fail due to updates on Instagram’s website. If you encounter any problems, please contact me.
./inscrawler/bin/chromedriver
pip3 install -r requirements.txt
cp inscrawler/secret.py.dist inscrawler/secret.py
Open the inscrawler/secret.py file and change the values of the username and password variables to the ones corresponding to your Instagram account.
username = 'my_ig_username'
password = '***********'
positional arguments:
mode
options: [posts, posts_full, profile, hashtag]
optional arguments:
-n NUMBER, --number NUMBER
number of returned posts
-u USERNAME, --username USERNAME
instagram's username
-t TAG, --tag TAG instagram's tag name
-o OUTPUT, --output OUTPUT
output file name(json format)
--debug see how the program automates the browser
--fetch_comments fetch comments
# Turning on the flag might take forever to fetch data if there are too many comments.
--fetch_likes_plays fetch like/play number
--fetch_likers fetch all likers
# Instagram might have rate limit for fetching likers. Turning on the flag might take forever to fetch data if there are too many likes.
--fetch_mentions fetch users who are mentioned in the caption/comments (startwith @)
--fetch_hashtags fetch hashtags in the caption/comments (startwith #)
--fetch_details fetch username and photo caption
# only available for "hashtag" search
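The options listed above could be declared with argparse roughly as follows. This is a minimal sketch reconstructed from the help text only, not the project's actual source; the function name build_parser and the description string are assumptions made for illustration.

```python
import argparse


def build_parser():
    # Hypothetical reconstruction from the help text above; not the crawler's real code.
    parser = argparse.ArgumentParser(description="Sketch of the crawler command line")
    parser.add_argument("mode", choices=["posts", "posts_full", "profile", "hashtag"])
    parser.add_argument("-n", "--number", type=int, help="number of returned posts")
    parser.add_argument("-u", "--username", help="instagram's username")
    parser.add_argument("-t", "--tag", help="instagram's tag name")
    parser.add_argument("-o", "--output", help="output file name (json format)")
    parser.add_argument("--debug", action="store_true",
                        help="see how the program automates the browser")
    # Every --fetch_* option is a plain on/off flag.
    for name in ("comments", "likes_plays", "likers", "mentions", "hashtags", "details"):
        parser.add_argument("--fetch_" + name, action="store_true")
    return parser


# Parse an argument list equivalent to one of the documented invocations.
args = build_parser().parse_args(
    ["posts_full", "-u", "cal_foodie", "-n", "10", "--fetch_comments"]
)
print(args.mode, args.username, args.number, args.fetch_comments)
```

Because each flag uses store_true, any flag that is not passed defaults to False, which matches the opt-in behaviour the help text describes.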
python crawler.py posts_full -u cal_foodie -n 100 -o ./output
python crawler.py posts_full -u cal_foodie -n 10 --fetch_likers --fetch_likes_plays
python crawler.py posts_full -u cal_foodie -n 10 --fetch_comments
python crawler.py profile -u cal_foodie -o ./output
python crawler.py hashtag -t taiwan -o ./output
python crawler.py hashtag -t taiwan -o ./output --fetch_details
python crawler.py posts -u cal_foodie -n 100 -o ./output # deprecated
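The --fetch_mentions and --fetch_hashtags flags are described as collecting tokens that start with @ and # in captions and comments. A minimal sketch of that kind of extraction is shown here; the regular expressions and the function names extract_mentions and extract_hashtags are assumptions for illustration and may differ from what the crawler actually does.

```python
import re

# Assumed patterns: usernames made of letters, digits, "_" and "."; hashtags made of word characters.
MENTION_RE = re.compile(r"@([A-Za-z0-9_.]+)")
HASHTAG_RE = re.compile(r"#(\w+)")


def extract_mentions(text):
    # Strip trailing dots so "@name." at the end of a sentence yields "name".
    return [m.rstrip(".") for m in MENTION_RE.findall(text)]


def extract_hashtags(text):
    return HASHTAG_RE.findall(text)


caption = "Dinner with @cal_foodie. Loved it! #taiwan #foodie"
print(extract_mentions(caption))
print(extract_hashtags(caption))
```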
If you choose mode posts, you will get the url, caption, and first photo for each post; if you choose mode posts_full, you will get the url, caption, all photos, time, comments, and number of likes and views for each post. Mode posts_full will take much longer than mode posts. [posts is deprecated. For recent posts, there is no quick way to get the post caption.]
-n, --number: number of returned posts.
-o, --output: output file name (JSON format).
The data format of posts:
The data format of posts_full:
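As a rough illustration of the fields listed for each mode, the sketch below builds one hypothetical record per mode, round-trips the records through JSON as an output file would, and summarizes them. All key names (url, caption, img_url, img_urls, time, comments, likes, views), the sample values, and the helper summarize are assumptions, not the crawler's actual output schema.

```python
import json

# Hypothetical record for mode "posts": url, caption, first photo.
post = {
    "url": "https://www.instagram.com/p/EXAMPLE/",
    "caption": "example caption",
    "img_url": "https://example.com/first.jpg",
}

# Hypothetical record for mode "posts_full": adds all photos, time, comments, likes, views.
post_full = {
    "url": "https://www.instagram.com/p/EXAMPLE/",
    "caption": "example caption",
    "img_urls": ["https://example.com/first.jpg", "https://example.com/second.jpg"],
    "time": "2019-01-01T00:00:00",
    "comments": [{"author": "someone", "comment": "nice"}],
    "likes": 12,
    "views": 0,
}


def summarize(records):
    # Count records and total the likes of those that carry a "likes" field.
    return {
        "count": len(records),
        "total_likes": sum(r.get("likes", 0) for r in records),
    }


# Serialize and parse again, as if reading back a file written with -o.
loaded = json.loads(json.dumps([post, post_full]))
print(summarize(loaded))
```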
Set up your username/password in secret.py or set them as environment variables.
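Reading the credentials from the environment could look something like the sketch below. The variable names IG_USERNAME and IG_PASSWORD and the function load_credentials are hypothetical; check the project source for the names it actually reads.

```python
import os


def load_credentials(environ=os.environ):
    # IG_USERNAME and IG_PASSWORD are assumed names, chosen only for this example.
    username = environ.get("IG_USERNAME")
    password = environ.get("IG_PASSWORD")
    if not username or not password:
        raise RuntimeError("Set both IG_USERNAME and IG_PASSWORD before running")
    return username, password


# Usage with an explicit mapping, so the example does not depend on the real environment.
fake_env = {"IG_USERNAME": "my_ig_username", "IG_PASSWORD": "not-a-real-password"}
print(load_credentials(fake_env)[0])
```

Passing the mapping as a parameter keeps the function easy to test and avoids hard-coding credentials in the repository.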
positional arguments:
tag
optional arguments:
-n NUMBER, --number NUMBER (default 1000)
number of posts to like
python liker.py foodie