kevinzg / facebook-scraper

Scrape Facebook public pages without an API key
MIT License
2.4k stars, 627 forks

Post Limit #3

Closed getshaun24 closed 3 years ago

getshaun24 commented 5 years ago

I very much appreciate your work on this. Thank you.

There seems to be a limit of about 300 posts that can be collected. How can I get around this and collect all of a person's posts?

Thank You

kevinzg commented 5 years ago

I guess it is stopping in this try/except block: https://github.com/kevinzg/facebook-scraper/blob/ac5228a47ebf9d1f57c6db75619bdcb24fc15921/facebook_scraper.py#L54-L59

Can you share the exception, along with the response status and content?

Also, do you know whether you can access more than 300 posts when browsing the page in a browser? You could also try logging in.

joshfelm commented 4 years ago

The exception is AttributeError: 'NoneType' object has no attribute 'html'

For me, the limit seems to be about 50 pages, and if I try to access more I get this error.

Hansyvea commented 4 years ago

> The exception is AttributeError: 'NoneType' object has no attribute 'html'
>
> For me, the limit seems to be about 50 pages, and if I try to access more I get this error.

Same problem here. Have you solved it somehow?

joshfelm commented 4 years ago

> Same problem here. Have you solved it somehow?

I actually did manage to find a workaround: wrap the body of the _find_and_search function in facebook_scraper.py in a try/except block. For example:

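def _find_and_search(article, selector, pattern, cast=str):
    try:
        # container is None when the selector matches nothing in the article
        container = article.find(selector, first=True)
        match = pattern.search(container.html)
        return match and cast(match.groups()[0])
    except AttributeError:
        # Missing element: report it and skip the field instead of aborting
        print("error occurred")
        return None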
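
An alternative to catching the exception, sketched here under assumptions, is to check explicitly for the missing element, since a None container is what triggers the AttributeError reported above. The names find_and_search_safe, FakeElement and FakeArticle are hypothetical: the two classes are stand-ins for the objects the scraper passes in, so the snippet runs without network access, and they are not part of facebook_scraper.py.

```python
import re


def find_and_search_safe(article, selector, pattern, cast=str):
    """Return the first captured group cast with `cast`, or None if absent."""
    container = article.find(selector, first=True)
    if container is None:
        # The selector matched nothing, so there is no .html to search
        return None
    match = pattern.search(container.html)
    if match is None:
        return None
    return cast(match.group(1))


class FakeElement:
    """Hypothetical stand-in for a parsed HTML element."""

    def __init__(self, html):
        self.html = html


class FakeArticle:
    """Hypothetical stand-in for the article object handed to the scraper."""

    def __init__(self, elements):
        self._elements = elements

    def find(self, selector, first=False):
        # Assumption: like the real call, this yields None on no match
        return self._elements.get(selector)


article = FakeArticle({"footer": FakeElement("<a>12 Comments</a>")})
comments = find_and_search_safe(article, "footer", re.compile(r"(\d+) Comments"), int)
missing = find_and_search_safe(article, "header", re.compile(r"(\d+)"), int)
```

With this version a post that lacks the element yields None for that field, and unrelated errors still surface instead of being silently swallowed.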

neon-ninja commented 3 years ago

This doesn't seem to be a problem anymore.