kevinzg / facebook-scraper

Scrape Facebook public pages without an API key
MIT License
2.43k stars 628 forks

Any plans to support Pages that use the New Pages Experience layout? #412

Open james-darko opened 3 years ago

james-darko commented 3 years ago

Hi,

The title says it all. The new page layout does not have a /pg/{account}/posts endpoint, and I've noticed that many different scrapers rely on it. Is there a way to use the regular endpoint, such as https://www.facebook.com/Nintendo/?

Thanks.

neon-ninja commented 3 years ago

This scraper attempts to use /posts, and if it fails, falls back to /
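
For readers following along, the fallback described here can be sketched roughly as below. This is only an illustration, not the project's actual code: `fetch_first_page`, `PageNotFound`, and the injected `fetch` callable are hypothetical names, and the value of `MOBILE_BASE_URL` is an assumption.

```python
from typing import Callable
from urllib.parse import urljoin

# Assumed value for the mobile site; not taken from this thread.
MOBILE_BASE_URL = "https://m.facebook.com/"


class PageNotFound(Exception):
    """Hypothetical error raised by `fetch` when a URL cannot be loaded."""


def fetch_first_page(account: str, fetch: Callable[[str], str]) -> str:
    """Try /{account}/posts/ first, then fall back to /{account}/."""
    posts_url = urljoin(MOBILE_BASE_URL, f"{account}/posts/")
    try:
        return fetch(posts_url)
    except PageNotFound:
        # The /posts endpoint is unavailable, so retry on the page root.
        return fetch(urljoin(MOBILE_BASE_URL, f"{account}/"))
```

Passing `fetch` in as a parameter keeps the sketch independent of any particular HTTP client.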

james-darko commented 3 years ago

Thanks for the quick reply.

None of the accounts that use the new layout have worked for me. Here is a pastebin of two: https://pastebin.com/ZVxcEgkt (I don't want their URLs indexed).

It looks like the / you're using is the mobile layout. When not logged in, the mobile pages don't show any posts, while the regular desktop page does.

neon-ninja commented 3 years ago

The mobile layout is much easier to work with, programming-wise. If Facebook requires you to log in, then you need to log in. Pass cookies as per the readme.
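
As a general illustration of what passing cookies involves, the sketch below loads a browser-exported cookies file and attaches it to an HTTP client using only the Python standard library. It assumes the file is in Netscape `cookies.txt` format. The function names are hypothetical, and this is not how the scraper itself handles cookies; the readme remains the reference for that.

```python
import urllib.request
from http.cookiejar import MozillaCookieJar


def load_cookie_jar(cookie_file: str) -> MozillaCookieJar:
    """Load cookies from a Netscape-format cookies.txt file (assumed format)."""
    jar = MozillaCookieJar(cookie_file)
    # Keep session cookies and expired entries so the file is loaded as-is.
    jar.load(ignore_discard=True, ignore_expires=True)
    return jar


def build_logged_in_opener(jar: MozillaCookieJar) -> urllib.request.OpenerDirector:
    """Return an opener that sends the loaded cookies with matching requests."""
    return urllib.request.build_opener(urllib.request.HTTPCookieProcessor(jar))
```

Requests made through the returned opener carry the stored cookies, so the site sees the same logged-in session as the browser they were exported from.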

james-darko commented 3 years ago

If I updated this to support scraping the regular desktop endpoint, like https://www.facebook.com/Nintendo/, would you accept that in the project?

It makes more sense for us to develop a scraper than to maintain a large number of rotating scraping profiles. From the code, it looks like the mobile endpoint is hardcoded. I would need to swap that out for a global object that holds either FB_MOBILE_BASE_URL or FB_W3_BASE_URL to control which one is used. Essentially, a mobile mode and a desktop mode.
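
One possible shape for such a global object is sketched below. It is only a sketch of the idea in this comment, not code from the project: `Mode`, `ScraperConfig`, and `config` are hypothetical names, and the two URL values are assumptions (the desktop one matches the URL quoted earlier in the thread).

```python
from dataclasses import dataclass
from enum import Enum
from urllib.parse import urljoin

# Constant names come from the comment above; the values are assumptions.
FB_MOBILE_BASE_URL = "https://m.facebook.com/"
FB_W3_BASE_URL = "https://www.facebook.com/"


class Mode(Enum):
    MOBILE = "mobile"
    DESKTOP = "desktop"


@dataclass
class ScraperConfig:
    """Hypothetical holder for the active mode; defaults to mobile."""

    mode: Mode = Mode.MOBILE

    @property
    def base_url(self) -> str:
        # Pick the base URL that matches the current mode.
        if self.mode is Mode.DESKTOP:
            return FB_W3_BASE_URL
        return FB_MOBILE_BASE_URL

    def page_url(self, account: str) -> str:
        """Build the page URL for an account under the active base URL."""
        return urljoin(self.base_url, f"{account}/")


# Single shared instance that the rest of the code would consult.
config = ScraperConfig()
```

Switching modes is then a one-line change, `config.mode = Mode.DESKTOP`, and every URL built through `config.page_url` follows it.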

neon-ninja commented 3 years ago

Sure. Good luck.