instagrambot / instapro

professional instagram tool for developers
https://instagrambot.github.io/
Apache License 2.0

Export only usernames into csv #4

Closed tayfunyasar closed 7 years ago

tayfunyasar commented 7 years ago

I'm trying to get a user's followers (about 80k of them). It has been running for 4 days and still hasn't finished.

How can we optimise this? Maybe save every 100 requests and skip usernames that are already saved?

As a workaround, how can I export only the usernames to CSV?

ohld commented 7 years ago

I have code that downloads only usernames. The current task is heavy because it makes a full info request for each of the 80k users.

These are my code samples; you can integrate them into your notebook.

I have written special methods for your task:

import pandas as pd
from tqdm import tqdm

def dump_data(users, output_filename):
    try:
        data = pd.DataFrame(users).drop_duplicates()
    except Exception:
        # means closed account -> save empty file
        data = pd.DataFrame(columns=["username", "user_id"])
    mode = "w"
    # tab-separated output; the header is written because the file is overwritten
    data.to_csv(output_filename, mode=mode, sep="\t", header=(mode == "w"), index=False)

def save_followers_from_user(iterator, output_filename):
    # progress-bar label: drops the directory, a 10-character prefix and a 4-character extension
    get_username = lambda x: x[x.rfind("/") + 11:-4]
    users = []
    for _user in tqdm(iterator, desc=get_username(output_filename), leave=False):
        # keep only the two fields we need instead of the full user info
        usr = {}
        usr["username"] = _user["username"]
        usr["user_id"] = _user["pk"]
        users.append(usr)
    dump_data(users, output_filename)

And the usage is roughly this:

iterator = get.user_followers(user_id, total=None)
save_followers_from_user(iterator, output_filename)

user_id is the ID of the user whose followers you want to scrape.
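
The "save every 100 requests and skip saved usernames" idea from the question could look roughly like the sketch below. It is only an illustration under some assumptions: the helper names (load_saved_ids, append_rows, save_followers_in_batches) are hypothetical and not part of instapro, the iterator is assumed to yield dicts with "username" and "pk" keys as in the code above, and the output is the same tab-separated layout. It uses only the standard library, so progress survives a crash, but skipping saved users does not by itself avoid re-requesting them from Instagram.

```python
import csv
import os

FIELDS = ["username", "user_id"]

def load_saved_ids(output_filename):
    """Return the set of user_id strings already present in the output file."""
    if not os.path.exists(output_filename):
        return set()
    with open(output_filename, newline="", encoding="utf-8") as f:
        return {row["user_id"] for row in csv.DictReader(f, delimiter="\t")}

def append_rows(rows, output_filename):
    """Append rows, writing the header only if the file is new or empty."""
    write_header = (not os.path.exists(output_filename)
                    or os.path.getsize(output_filename) == 0)
    with open(output_filename, "a", newline="", encoding="utf-8") as f:
        writer = csv.DictWriter(f, fieldnames=FIELDS, delimiter="\t")
        if write_header:
            writer.writeheader()
        writer.writerows(rows)

def save_followers_in_batches(iterator, output_filename, batch_size=100):
    """Flush to disk every batch_size new users; return how many were added."""
    seen = load_saved_ids(output_filename)
    batch, added = [], 0
    for _user in iterator:
        user_id = str(_user["pk"])
        if user_id in seen:
            continue  # saved by an earlier run, or a duplicate
        seen.add(user_id)
        batch.append({"username": _user["username"], "user_id": user_id})
        if len(batch) >= batch_size:
            append_rows(batch, output_filename)
            added += len(batch)
            batch = []
    if batch:  # write whatever is left after the loop
        append_rows(batch, output_filename)
        added += len(batch)
    return added
```

Calling save_followers_in_batches(iterator, output_filename) a second time with the same file only appends users that are not in the file yet, so an interrupted run can be restarted without losing what was already written.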