moda20 / facebook-scraper

Scrape Facebook public pages without an API key
MIT License
77 stars 28 forks

No raw posts (<article> elements) were found in this page. #21

Closed talatoncu closed 9 months ago

talatoncu commented 9 months ago

I had been using the scraper from kevinzg without any problems until today.

Today, I got the message "No raw posts (<article> elements) were found in this page."

I installed the moda20 version as explained on the repository's main page: https://github.com/moda20/facebook-scraper

Unfortunately, the same error still occurs.

I will be very happy if you can help me.

Thanks and regards.

My command line is as follows:

facebook-scraper talatoncu --cookies from_browser --pages 5 --format json --encoding utf-8 --no-extra-requests --keys post_id,post_text,time,post_url --timeout 200 -v

I can use get_posts as in the example perfectly.

moda20 commented 9 months ago

@talatoncu I don't think the command line is supported by this repo. I exclusively use a script to debug, test, and use the scraper itself. However, I figured out that Facebook has changed some of their HTML to not use <article> anymore, so I updated the repo to handle that too. Try the example from the README to see if it works for you.

talatoncu commented 9 months ago

@moda20 Thank you very much for your response.

  1. I force-reinstalled the package as described in the README:

pip install --force-reinstall --no-deps git+https://github.com/moda20/facebook-scraper.git@master

  2. I ran the example from the README:

facebook-scraper --filename nintendo_page_posts.csv --pages 10 nintendo

I got the message

Couldn't get any posts.

talatoncu commented 9 months ago

@moda20 Please forgive me if I am wrong.

The script should probably check the "tn" keys in the "data-ft" attribute, since Facebook no longer puts any "article" tags in these pages. Because no "article" tag surrounds the posts, the script can't find any posts in the pages.

"tn": "*s" is for the post text, "tn": "-R" is for the footer, etc.

I think individual post pages still have the "article" tag.
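
The suggestion above can be sketched roughly as follows. This is only an illustration using the Python standard library, not how facebook-scraper itself works: the `TN_ROLES` mapping contains just the two "tn" values mentioned in the comment, and the sample HTML, the class name `DataFtCollector`, and the helper `extract_tn_blocks` are made up for the example.

```python
import json
from html.parser import HTMLParser

# Only the two "tn" values mentioned in the comment; anything else is ignored.
TN_ROLES = {"*s": "post_text", "-R": "footer"}


class DataFtCollector(HTMLParser):
    """Collect text of elements whose data-ft JSON carries a known "tn" value."""

    def __init__(self):
        super().__init__()
        self.found = []   # [role, text] pairs, in document order
        self._stack = []  # (tag, index into self.found or None) per open tag

    def handle_starttag(self, tag, attrs):
        index = None
        raw = dict(attrs).get("data-ft")
        if raw:
            try:
                role = TN_ROLES.get(json.loads(raw).get("tn"))
            except (ValueError, AttributeError, TypeError):
                role = None  # data-ft was not a JSON object with a usable "tn"
            if role:
                self.found.append([role, ""])
                index = len(self.found) - 1
        self._stack.append((tag, index))

    def handle_endtag(self, tag):
        # Pop back to the matching open tag; unclosed tags such as <br> go with it.
        for i in range(len(self._stack) - 1, -1, -1):
            if self._stack[i][0] == tag:
                del self._stack[i:]
                break

    def handle_data(self, data):
        # Attribute text to the innermost open element that has a known role.
        for _tag, index in reversed(self._stack):
            if index is not None:
                self.found[index][1] += data
                break


def extract_tn_blocks(html):
    collector = DataFtCollector()
    collector.feed(html)
    collector.close()
    return [(role, text.strip()) for role, text in collector.found]


# Hypothetical markup, shaped like the description in the comment above.
SAMPLE = (
    '<div><div data-ft=\'{"tn":"*s"}\'><p>Hello <b>world</b></p></div>'
    '<div data-ft=\'{"tn":"-R"}\'><a href="#">Like</a></div></div>'
)

print(extract_tn_blocks(SAMPLE))
```

Grouping the collected blocks back into individual posts would still need a reliable container element, which this sketch does not attempt.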

moda20 commented 9 months ago

@talatoncu Can you try testing your example with a Python script instead of the command line? The command-line option has not been updated to match Facebook's latest changes.

talatoncu commented 9 months ago

@moda20 I used the source from GitHub and tried the following script:

from facebook_scraper import FacebookScraper
import json

_scraper = FacebookScraper()

for post in _scraper.get_posts('NintendoAmerica',
                               base_url="https://mbasic.facebook.com",
                               start_url="https://mbasic.facebook.com/NintendoAmerica?v=timeline",
                               pages=1):
    print(post['text'][:50])

Still "No raw posts"

willlee88 commented 9 months ago

@talatoncu Maybe you can try this first. I added a Facebook cookie file to the example code and can get posts without problems.

from facebook_scraper import get_posts

for post in get_posts('NintendoAmerica',
                      base_url="https://mbasic.facebook.com",
                      start_url="https://mbasic.facebook.com/NintendoAmerica?v=timeline",
                      cookies=self.fb_cookie_path,
                      pages=3):
    print(post)

talatoncu commented 9 months ago

@willlee88 thank you very much for your help.

Did you use the installed package or the .py files from GitHub?

When I use the .py files, I get the error

ImportError: cannot import name 'get_posts' from 'facebook_scraper' (G:\facebook-scraper-python-master\facebook-scraper-master\facebook_scraper\facebook_scraper.py)
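
The path in that error ends in facebook_scraper\facebook_scraper.py, which suggests (an assumption, not something confirmed in this thread) that Python resolved the name `facebook_scraper` to that single file rather than to the package directory. One quick way to see which file an import name resolves to is the standard library's `importlib.util.find_spec`; the helper name `resolve_module_path` is made up for this sketch.

```python
import importlib.util


def resolve_module_path(name):
    """Return the file a top-level import name resolves to, or None if not found."""
    spec = importlib.util.find_spec(name)
    return None if spec is None else spec.origin


# A package resolves to its __init__.py, a plain module to its own .py file.
print(resolve_module_path("json"))
print(resolve_module_path("facebook_scraper"))
```

If the second line prints a path ending in facebook_scraper.py instead of __init__.py, the script is probably being run from inside the package folder, and running it from the parent directory or using the installed package may avoid the error.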

talatoncu commented 9 months ago

@willlee88

I tried with the packaged version. Adding the "start_url" parameter solved the problem.

Thank you and @moda20 very much.

I think updating the GitHub README would help everybody and would also save your valuable time.

Regards.
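
For anyone landing here later, the fix reported above can be sketched as a small helper that builds the same keyword arguments used in this thread. The helper names `mbasic_kwargs` and `print_posts` are made up for the example; the `base_url`, `start_url`, `pages`, and `cookies` arguments are the ones shown in the comments above, and the `?v=timeline` URL pattern is copied from them rather than from any documentation.

```python
def mbasic_kwargs(account, pages=1, cookies=None):
    """Build the get_posts keyword arguments used in this thread for one account."""
    base_url = "https://mbasic.facebook.com"
    kwargs = {
        "base_url": base_url,
        # Explicit start_url, which is what resolved the "No raw posts" error here.
        "start_url": f"{base_url}/{account}?v=timeline",
        "pages": pages,
    }
    if cookies is not None:
        kwargs["cookies"] = cookies  # e.g. path to an exported cookie file
    return kwargs


def print_posts(account, **options):
    # Imported lazily so the helper above works without the package installed.
    from facebook_scraper import get_posts

    for post in get_posts(account, **mbasic_kwargs(account, **options)):
        print(post)


print(mbasic_kwargs("NintendoAmerica", pages=3))
```

Calling `print_posts("NintendoAmerica", pages=3)` needs the installed package and network access, and, as noted above, a cookie file may also be required.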

moda20 commented 9 months ago

@talatoncu Great that you solved your problem. The README does in fact have the latest update, so use this repo as your source. The same README also explains how to install this specific version of the repo and how to update it.