shaikhsajid1111 / facebook_page_scraper

Scrapes facebook's pages front end with no limitations & provides a feature to turn data into structured JSON or CSV
https://pypi.org/project/facebook-page-scraper/
MIT License
210 stars 62 forks

List index out of range #42

Open CodedNil opened 1 year ago

CodedNil commented 1 year ago

I have followed the GitHub documentation and nothing more to get posts, and I am encountering this error:

File "\facebook_page_scraper-4.0.1-py3.11.egg\facebook_page_scraper\element_finder.py", line 374, in __accept_cookies
    button[-1].click()
IndexError: list index out of range

With Firefox as the browser.

mikecarey134 commented 1 year ago

I have the same issue with Chrome and Firefox on both Windows and Linux systems.

kareemrasheed89 commented 1 year ago

Has this been resolved?

AguilarTech commented 1 year ago

Getting the same error. Any ideas?

kareemrasheed89 commented 1 year ago

Not yet, I still haven't resolved it.

shaikhsajid1111 commented 1 year ago

I think that popup doesn't appear in every country, which is why the scraper can't find the button and throws that error. I think it was resolved in #41. I merged that change but haven't published it to PyPI yet. If you're relying on the PyPI release, you might consider using the source from the master branch. @kareemrasheed89 please let me know if you're still facing the same issue with the master branch.
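To illustrate the failure described above: Selenium's `find_elements` returns an empty list when nothing matches, so indexing it with `[-1]` raises `IndexError` whenever the cookie popup is not shown. The snippet below is a minimal sketch of a guard against that. It is not the change merged in #41, and `click_last_if_present` and `FakeButton` are hypothetical names used only for this illustration.

```python
def click_last_if_present(buttons):
    """Click the last element in `buttons` if there is one.

    Returns True if a click happened, False if the list was empty.
    """
    if not buttons:
        # No cookie popup was rendered (for example, in countries
        # where it is not shown), so there is nothing to click.
        return False
    buttons[-1].click()
    return True


class FakeButton:
    """Stand-in for a Selenium WebElement, for illustration only."""

    def __init__(self):
        self.clicked = False

    def click(self):
        self.clicked = True


# An empty result no longer raises IndexError.
print(click_last_if_present([]))

# With matches present, only the last button is clicked.
first, last = FakeButton(), FakeButton()
print(click_last_if_present([first, last]), first.clicked, last.clicked)
```

In the real scraper the list would come from a `find_elements` call on the driver rather than from `FakeButton` objects.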

kareemrasheed89 commented 1 year ago

Still facing the problem. I created a new virtual env and installed from source, but it still did not get through.

kareemrasheed89 commented 1 year ago

(base) C:\Users\USER\Documents\insight>python facebook.py
2022-12-13 17:41:45,591 - facebook_page_scraper.driver_initialization - INFO - Using: username:password@us.smartproxy.com:10001
[WDM] - Driver [C:\Users\USER.wdm\drivers\geckodriver\win64\v0.32.0\geckodriver.exe] found in cache
2022-12-13 17:42:42,897 - facebook_page_scraper.driver_utilities - CRITICAL - No posts were found!

I am getting this error now, after installing from source.

shaikhsajid1111 commented 1 year ago

This isn't the list index error; here the posts themselves weren't found. Have you tried setting headless=True? You might get some idea about the issue when you run in headful mode.

mikecarey134 commented 1 year ago

I was able to get this working by downloading the repo zip and dropping the facebook_page_scraper directory into my project, so it uses the latest code in the library.
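If it is unclear whether Python is picking up the vendored copy or an older installed release, one way to check is to ask the import system where it would load the package from. This is a small sketch using only the standard library; `module_origin` is a hypothetical helper name.

```python
import importlib.util


def module_origin(name):
    """Return the file a top-level module would be imported from, or None if not found."""
    spec = importlib.util.find_spec(name)
    return None if spec is None else spec.origin


# A path inside site-packages or an .egg (as in the traceback above)
# points at an installed release; a path inside your project directory
# points at the copy dropped in from the repo zip.
print(module_origin("facebook_page_scraper"))
```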

kareemrasheed89 commented 1 year ago

The headless is actually set to True

kareemrasheed89 commented 1 year ago

Mehn it worked.

Thanks so much

ZamudioSosaAlejandro commented 1 year ago

kareemrasheed89, what worked for you to fix the problem?

kareemrasheed89 commented 1 year ago

I set headless to True. Also, for the first problem, I installed the library from source, not from PyPI. I used git clone to install from this GitHub repo.

kareemrasheed89 commented 1 year ago

@shaikhsajid1111 I just realized the script couldn't bypass the login again. I scraped one Facebook group, but it couldn't iterate over the other Facebook groups:

(base) C:\Users\USER\Documents\insight>python facebook.py
2022-12-23 06:22:41,576 - facebook_page_scraper.driver_initialization - INFO - Using: username:password@us.smartproxy.com:10001
[WDM] - Driver [C:\Users\USER.wdm\drivers\geckodriver\win64\v0.32.0\geckodriver.exe] found in cache
2022-12-23 06:23:40,277 - facebook_page_scraper.driver_utilities - CRITICAL - No posts were found!

It started showing this error again.

What could be the problem?

shaikhsajid1111 commented 1 year ago

If it can't bypass the login page, you might consider using a logged-in account, but be aware that the account might get blocked. To run logged in, you can pass browser_profile to Facebook_scraper.

kareemrasheed89 commented 1 year ago

Can I see an example of a browser profile, please?


shaikhsajid1111 commented 1 year ago

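This Mozilla support article (https://support.mozilla.org/en-US/kb/profile-manager-create-remove-switch-firefox-profiles) will help you find your browser profile for Firefox. You can then pass it to Facebook_scraper like Facebook_scraper(browser_profile="YOUR BROWSER PROFILE PATH"), along with the other arguments.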
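For anyone looking for a fuller example, here is a rough sketch of how the pieces might fit together. The `browser_profile` and `headless` keywords come from this thread. The positional order (`page_name`, `posts_count`, `browser`) and the `scrap_to_json()` method are assumptions based on the project README and have not been checked against the master branch. `scraper_options`, `scrape_page`, and the profile path are hypothetical.

```python
from pathlib import Path


def scraper_options(profile_path, headless=True):
    """Build the keyword arguments discussed in this thread."""
    return {
        "headless": headless,
        # Path to an existing Firefox profile that is already logged in.
        "browser_profile": str(Path(profile_path).expanduser()),
    }


def scrape_page(page_name, profile_path, posts_count=10):
    """Scrape a page with a logged-in profile (assumed call shape, see note above)."""
    # Imported lazily so this file still loads without the library installed.
    from facebook_page_scraper import Facebook_scraper

    scraper = Facebook_scraper(
        page_name, posts_count, "firefox", **scraper_options(profile_path)
    )
    return scraper.scrap_to_json()


# Example call (hypothetical profile path, not run here):
# scrape_page("metaai", "~/.mozilla/firefox/abcd1234.default-release")
```

As noted earlier in the thread, scraping with a logged-in account carries the risk of that account getting blocked.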

kareemrasheed89 commented 1 year ago

Thank you
