huaying / instagram-crawler

Get Instagram posts/profile/hashtag data without using Instagram API
MIT License

"Traceback" & "Failed to fetch the post" Bug happened when fetch_comments #66

Open RaviChan opened 4 years ago

RaviChan commented 4 years ago

result.json.zip (https://github.com/huaying/instagram-crawler/files/3681900/result.json.zip)

Thanks very much for your work; this is definitely the best Instagram crawler I could find on the internet. While trying the example, though, I ran into an interesting bug.

I have installed the toolkit with "chromedriver_mac64.zip" (chromedriver 77.0.3865.40)

Step 1: % python3 crawler.py posts_full -u elenalinnn --fetch_comments

Step 2: The first 19 of 79 posts are fetched fine, but the 20th one prints the following in red:

Failed to fetch the post: https://www.instagram.com/p/BxxzM7ACyaX/
Traceback (most recent call last):
  File "/Users/xingchen/Documents/Python/spider-Ins/instagram-crawler-master/inscrawler/crawler.py", line 217, in _get_posts_full
    fetch_comments(browser, dict_post)
  File "/Users/xingchen/Documents/Python/spider-Ins/instagram-crawler-master/inscrawler/fetch.py", line 145, in fetch_comments
    show_comment_btn.click()
  File "/Library/Frameworks/Python.framework/Versions/3.7/lib/python3.7/site-packages/selenium/webdriver/remote/webelement.py", line 80, in click
    self._execute(Command.CLICK_ELEMENT)
  File "/Library/Frameworks/Python.framework/Versions/3.7/lib/python3.7/site-packages/selenium/webdriver/remote/webelement.py", line 633, in _execute
    return self._parent.execute(command, params)
  File "/Library/Frameworks/Python.framework/Versions/3.7/lib/python3.7/site-packages/selenium/webdriver/remote/webdriver.py", line 321, in execute
    self.error_handler.check_response(response)
  File "/Library/Frameworks/Python.framework/Versions/3.7/lib/python3.7/site-packages/selenium/webdriver/remote/errorhandler.py", line 242, in check_response
    raise exception_class(message, screen, stacktrace)
selenium.common.exceptions.ElementNotInteractableException: Message: element not interactable
  (Session info: headless chrome=77.0.3865.90)

After the run finishes, the 20th item in the result is {"key": "https://www.instagram.com/p/BrjMfkVFaSE/", "datetime": "2018-12-19T01:08:47.000Z"}, which contains only the URL and timestamp and none of the other fields that the successfully fetched posts have.

The same failure occurs for items 20, 31, 48, 50, 55, 59, and 60.
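To list the affected items without reading the output by eye, a small check along these lines might help. It is only a sketch: it assumes the crawler output is a JSON list of post dictionaries, and that a failed post carries nothing beyond "key" and "datetime", as in the item quoted above. The function name find_incomplete_posts and the "caption" field in the sample data are hypothetical.

```python
import json


def find_incomplete_posts(posts, minimal_keys=("key", "datetime")):
    """Return (1-based position, URL) for posts with no fields beyond minimal_keys."""
    minimal = set(minimal_keys)
    return [
        (position, post.get("key"))
        for position, post in enumerate(posts, start=1)
        if set(post) <= minimal
    ]


# Sample data: the second entry is the item quoted in this issue; the first is
# a made-up "complete" post whose extra field ("caption") is an assumption.
sample = json.loads("""
[
  {"key": "https://www.instagram.com/p/EXAMPLE/",
   "datetime": "2019-01-01T00:00:00.000Z",
   "caption": "hypothetical"},
  {"key": "https://www.instagram.com/p/BrjMfkVFaSE/",
   "datetime": "2018-12-19T01:08:47.000Z"}
]
""")

for position, url in find_incomplete_posts(sample):
    print(position, url)
```

For a real run, the list loaded from the saved result file (for example with json.load) would replace the sample data.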

I cannot figure out what causes this bug. Could you look into it?

Thank you very much. Have a nice day.

Best Regards.

RaviChan commented 4 years ago

Has anyone else run into this issue?

eren125 commented 4 years ago

DevTools listening on ws://127.0.0.1:59142/devtools/browser/5a32997f-1efb-4243-a270-6d1261dc5ec5
Traceback (most recent call last):
  File "crawler.py", line 83, in <module>
    args.username, args.number, args.mode == "posts_full", args.debug
  File "crawler.py", line 28, in get_posts_by_user
    return ins_crawler.get_user_posts(username, number, detail)
  File "/mnt/c/Users/Emmanuel REN/Documents/instagram-crawler/inscrawler/crawler.py", line 142, in get_user_posts
    return self._get_posts_full(number)
  File "/mnt/c/Users/Emmanuel REN/Documents/instagram-crawler/inscrawler/crawler.py", line 193, in _get_posts_full
    ele_post.click()
  File "/home/eren/.local/lib/python3.6/site-packages/selenium/webdriver/remote/webelement.py", line 80, in click
    self._execute(Command.CLICK_ELEMENT)
  File "/home/eren/.local/lib/python3.6/site-packages/selenium/webdriver/remote/webelement.py", line 628, in _execute
    return self._parent.execute(command, params)
  File "/home/eren/.local/lib/python3.6/site-packages/selenium/webdriver/remote/webdriver.py", line 312, in execute
    self.error_handler.check_response(response)
  File "/home/eren/.local/lib/python3.6/site-packages/selenium/webdriver/remote/errorhandler.py", line 242, in check_response
    raise exception_class(message, screen, stacktrace)
selenium.common.exceptions.ElementClickInterceptedException: Message: element click intercepted: Element ... is not clickable at point (130, 110). Other element would receive the click: Instagram
  (Session info: headless chrome=77.0.3865.90)

eren125 commented 4 years ago

Hi guys, I have a similar problem: the crawler cannot click an element properly, so it can't fetch the data with the posts_full option. Is it because of a change on the Instagram website, or is it my computer?

yuping-wu commented 4 years ago

Hi. Same error for me when I tried to get the number of likes for a post with a video.

I modified the function fetch_likes_plays in fetch.py by changing line 77 from

    browser.find_one(".QhbhU").click()

to

    element = browser.find_one(".QhbhU")
    browser.execute_script(element)

I also added the following method in browser.py:

    def execute_script(self, t):
        self.driver.execute_script("arguments[0].click();", t)

And it works for me. Maybe you can modify the code in a similar way. Hope it helps.
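The same idea could be written as a reusable helper that only falls back to a JavaScript click when the native click is rejected. This is a minimal sketch, not part of instagram-crawler: the names click_with_js_fallback and CLICK_ERRORS are hypothetical, and driver is assumed to be a Selenium WebDriver (or any object with an execute_script method). The two exception classes are the ones shown in the tracebacks in this thread.

```python
try:
    from selenium.common.exceptions import (
        ElementClickInterceptedException,
        ElementNotInteractableException,
    )

    CLICK_ERRORS = (
        ElementClickInterceptedException,
        ElementNotInteractableException,
    )
except ImportError:
    # Assumption for illustration only: lets the sketch run without Selenium
    # installed by catching any exception raised by click().
    CLICK_ERRORS = (Exception,)


def click_with_js_fallback(driver, element):
    """Click natively; if Selenium refuses, click through JavaScript instead.

    Returns "native" or "js" so the caller can log which path was taken.
    """
    try:
        element.click()
        return "native"
    except CLICK_ERRORS:
        # Same script as in the workaround above: the element is passed to
        # the page as arguments[0] and clicked from JavaScript.
        driver.execute_script("arguments[0].click();", element)
        return "js"
```

A call such as click_with_js_fallback(self.driver, show_comment_btn) would then stand in for show_comment_btn.click(). Note that a JavaScript click skips Selenium's visibility and overlap checks, so it may click an element that a real user could not reach.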