kevinzg / facebook-scraper

Scrape Facebook public pages without an API key
MIT License

Resume scraping functionality #558

Open prakhar-s opened 2 years ago

prakhar-s commented 2 years ago

Hi, I used get_posts(group=group_id, pages=3, options={"comments": True}) and received ~60 posts from the group. Now let's say I have to scrape the data again, but I want the scraping to resume from where it last stopped (say, from page 4 onwards). How can I do that? Any help would be greatly appreciated.

Thanks.

neon-ninja commented 2 years ago

This is possible, but you would need to keep track of the pagination URL. See https://github.com/kevinzg/facebook-scraper/issues/310#issuecomment-852652846

prakhar-s commented 2 years ago

Hi, I went through the comment you pointed to, but I am still confused about how to get the pagination URLs of the subsequent pages automatically in Python. Any help would be greatly appreciated. Thanks.

neon-ninja commented 2 years ago

The function handle_pagination_url(url) is a callback function - it is called for each pagination URL. As you define the function yourself, you can handle it however you see fit. For example, you could write each pagination URL to a file, or store it in a database.
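
As a rough sketch of the "write each pagination URL to a file" idea, something like the following could work. The file name last_pagination_url.txt and the helpers load_start_url and scrape_next_pages are hypothetical names chosen for illustration. The start_url and request_url_callback keyword arguments are assumed to behave as in the comment linked above, so check them against the version of the library you have installed.

```python
from pathlib import Path

# Hypothetical file name for storing the most recent pagination URL.
STATE_FILE = Path("last_pagination_url.txt")


def handle_pagination_url(url):
    """Callback: overwrite the state file with the latest pagination URL."""
    STATE_FILE.write_text(url, encoding="utf-8")


def load_start_url():
    """Return the saved pagination URL, or None if nothing was saved yet."""
    if STATE_FILE.exists():
        saved = STATE_FILE.read_text(encoding="utf-8").strip()
        return saved or None
    return None


def scrape_next_pages(group_id, pages=3):
    """Scrape `pages` pages, continuing from the saved URL if there is one."""
    # Imported here so the helpers above work without the library installed.
    from facebook_scraper import get_posts

    # Assumption: get_posts accepts start_url and request_url_callback,
    # as used in the comment linked earlier in this thread.
    return list(
        get_posts(
            group=group_id,
            pages=pages,
            options={"comments": True},
            start_url=load_start_url(),
            request_url_callback=handle_pagination_url,
        )
    )
```

Calling scrape_next_pages(group_id) once per run would then pick up from the saved URL each time. Because the saved URL is the last one that was requested, the first page of the next run may repeat posts already collected, so de-duplicating by post ID is probably worthwhile.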