tzuhsial / InstagramCrawler

A non API python program to crawl public photos, posts or followers
https://github.com/iammrhelo/InstagramCrawler
MIT License
376 stars, 106 forks

Crawler for "followers" stops after opening follower list #25

Open xtothez2k5 opened 6 years ago

xtothez2k5 commented 6 years ago

After calling this:

python instagramcrawler.py -q 'instagram' -t 'followers' -n 30 -a auth.json

The process works well until it opens the follower list. After opening it, it gets stuck and nothing happens anymore. A few weeks ago it worked well, auto-scrolling down the follower list and collecting all followers.

Did Instagram change something in their code? Is there a solution for it?

kyung-wook commented 6 years ago

I had a similar issue. How about checking whether you are getting the updated 'page_source'? I fixed it partially after checking that.

daniel-ed commented 6 years ago

What exactly has to be done to solve this issue? I am not able to fix it. Will there be an update? Otherwise this app has to be flagged as not working.

kyung-wook commented 6 years ago

I fixed it partially, but it's not perfect. I think Instagram changed something.

daniel-ed commented 6 years ago

And could you tell us WHAT you changed?

kyung-wook commented 6 years ago

I changed it to re-parse the HTML (probably '_driver.page_source' in the code) every time after scrolling down. Did you find where the code gets stuck? My fix may not be the answer for you, because I used it to scrape images.
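
The fix kyung-wook describes, re-reading the page source after every scroll instead of parsing it once, could look roughly like the sketch below. It is not the project's code: `collect_links`, `LinkCollector`, the `max_scrolls` limit and the choice to collect `<a href>` values are all assumptions made for illustration. Only `driver.page_source` and `driver.execute_script` are Selenium WebDriver calls.

```python
from html.parser import HTMLParser


class LinkCollector(HTMLParser):
    """Collects href values of <a> tags from an HTML string (hypothetical helper)."""

    def __init__(self):
        super().__init__()
        self.hrefs = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            href = dict(attrs).get("href")
            if href:
                self.hrefs.append(href)


def collect_links(driver, target, max_scrolls=50):
    """Scroll until `target` distinct links are seen or a pass finds nothing new."""
    seen = []
    for _ in range(max_scrolls):
        # Read page_source again on every pass; a copy taken before
        # scrolling does not contain the rows loaded afterwards.
        parser = LinkCollector()
        parser.feed(driver.page_source)

        found_new = False
        for href in parser.hrefs:
            if href not in seen:
                seen.append(href)
                found_new = True

        if len(seen) >= target or not found_new:
            break

        # Ask the browser to scroll so the page can load more content.
        driver.execute_script("window.scrollTo(0, document.body.scrollHeight);")
    return seen[:target]
```

With a real browser the page needs a moment to load new rows after each scroll, so a short wait (for example `time.sleep` or a `WebDriverWait`) would normally go right after the `execute_script` call.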

lionhive commented 6 years ago

The problem is inside scrape_followers_or_following. I added some debug text and it does find the list of followers:

    List = title_ele.find_element_by_xpath(
        '..').find_element_by_tag_name('ul')
    print('found list', List.text)

==> generates:

('found list', u'peko11jun123\njun\nFollow\n_tural_255\nSherif\nFollow\njlidoctor\nJoseph Li\nFollow\ncamalmemmedov690\ncamal memmedov\nFollow\nsevgi.unvani\nSeVgi.Unvani \U0001f1e6\U0001f1ffAZE\nFollow\ntada0724\n\u30bf\u30c3\u30c9\nFollow\ntfvwbsevha\nBjSz2ID35HK2Y\nFollow\niwaishida\n\u5d0e\u5ca9\nFollow\nspchie137\nMorita.Chieko\nFollow\nismayil.62.840\nIsmayil Hesenov\nFollow')

However, the list is of length 1. It's being interpreted as a single item. I've never used this framework, so I'm not familiar with how it's supposed to interpret lists.

    print('list len', len(List.find_elements_by_xpath('*')))

==>

('list len', 1)
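
One possible explanation for a length of 1, offered as an assumption rather than something confirmed in this thread: the XPath `*` matches only direct children, so if Instagram wrapped the rows in an extra container, the `ul` has exactly one child even though it holds many followers. The sketch below reproduces that with hypothetical markup and the standard library's ElementTree instead of Selenium.

```python
import xml.etree.ElementTree as ET

# Hypothetical markup (an assumption): the <ul> has a single wrapper <div>
# that holds the actual <li> rows.
markup = """
<ul>
  <div>
    <li>peko11jun123</li>
    <li>_tural_255</li>
    <li>jlidoctor</li>
  </div>
</ul>
"""

ul = ET.fromstring(markup)

# "*" selects direct child elements only, so it sees just the wrapper.
direct_children = ul.findall("*")

# ".//li" selects every <li> anywhere below the <ul>.
all_rows = ul.findall(".//li")

print("list len (direct children):", len(direct_children))
print("list len (descendant li):", len(all_rows))
```

If that is what happened, counting with `List.find_elements_by_xpath('.//li')` instead of `'*'` would be the analogous change in the crawler.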

xtothez2k5 commented 6 years ago

Are there any solutions for it right now? I can't find a solution for this...

xtothez2k5 commented 6 years ago

    # Loop through list till target number is reached
    num_of_shown_follow = len(List.find_elements_by_xpath('*'))
    print("Follower", num_of_shown_follow)

The problem really is that "num_of_shown_follow" is 1, even though the full follower list is shown. What is going wrong there?

xtothez2k5 commented 6 years ago

When I print out the exception in this snippet

    try:
        element.send_keys(Keys.PAGE_DOWN)
    except Exception as e:
        print(e)

I get this exception message:

Message: Element <li class="_6e4x5"> is not reachable by keyboard
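
The message indicates that the `<li>` cannot receive keyboard input, so `send_keys` cannot drive the scrolling. A common workaround, sketched here under assumptions, is to scroll with JavaScript through `execute_script` instead. The helper names and the `.//li` row XPath are hypothetical, and `find_elements_by_xpath` is the Selenium 3 style call used elsewhere in this thread (newer Selenium releases use `find_elements(By.XPATH, ...)`).

```python
def scroll_to_element(driver, element):
    """Scroll `element` into view without sending keyboard events to it."""
    # execute_script hands `element` to the page as arguments[0].
    driver.execute_script("arguments[0].scrollIntoView(true);", element)


def scroll_list_to_bottom(driver, list_element, row_xpath=".//li"):
    """Scroll to the last loaded row and return how many rows were found."""
    rows = list_element.find_elements_by_xpath(row_xpath)
    if rows:
        # Bringing the last row into view is what lets the page
        # request the next batch of followers.
        scroll_to_element(driver, rows[-1])
    return len(rows)
```

Calling `scroll_list_to_bottom(driver, List)` in the loop, in place of `element.send_keys(Keys.PAGE_DOWN)`, is one way this might be wired in. Whether Instagram then loads further rows is not something this thread confirms.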