kevinzg / facebook-scraper

Scrape Facebook public pages without an API key
MIT License
2.47k stars 635 forks

problems extracting comments #667

Open Riccardofal opened 2 years ago

Riccardofal commented 2 years ago

First of all, thank you for your work. My code is updated to the latest changes, but I am having problems extracting comments from public pages. Over the past few days I could extract only some of the comments on a post, and now it returns posts without any comments at all. What could be the problem? @neon-ninja

Below is how I call get_posts:

for post in get_posts('namepage', pages=3, extra_info=True, timeout=600, options={"comments": True, "reactors": True, "allow_extra_requests": True, "posts_per_page": 10}):

neon-ninja commented 2 years ago

What page are you having this problem with?

Riccardofal commented 2 years ago

I have tried several public pages. It rarely extracts comments for me, and if I restart it, it extracts only posts. The last page I tried, for example, was OnePiece.it: it did not extract any comments, only posts, until at some point it raised this exception on a post that has a long text.

Traceback (most recent call last):
  File "C:\Path\Scraper\facebook_scraper\facebook_scraper.py", line 991, in _generic_get_posts
    post = extract_post_fn(post_element, options=options, request_fn=self.get)
  File "C:\Path\Scraper\facebook_scraper\extractors.py", line 33, in extract_post
    return PostExtractor(raw_post, options, request_fn, full_post_html).extract_post()
  File "C:\Path\Scraper\facebook_scraper\extractors.py", line 198, in extract_post
    if has_more and self.full_post_html:
  File "C:\Path\Scraper\facebook_scraper\extractors.py", line 1267, in full_post_html
    response = self.request(url)
  File "C:\Path\Scraper\facebook_scraper\facebook_scraper.py", line 816, in get
    raise exceptions.LoginRequired(
facebook_scraper.exceptions.LoginRequired: A login (cookies) is required to see this page

This exception is raised only when the post has a long text.

My questions are:

  1. Does the scraper have a limit on the number of characters it can extract from a post?
  2. Is there a limit on the number of comments that the scraper or Facebook returns? (So far, fewer than 50 comments have been extracted across all my tests.)

neon-ninja commented 2 years ago

I see. Try passing cookies as per the README. No, the scraper does not impose a limit, but Facebook's servers might.
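As a rough sketch of that suggestion, the original call could be given a cookies file and wrapped so that a `LoginRequired` error is reported instead of ending the run with a traceback. This assumes `get_posts` accepts a `cookies` argument pointing to an exported cookies file, as the README describes. The file name `cookies.txt`, the helper names `comment_options` and `scrape_with_cookies`, and the `comments_full` key in the usage comment are illustrative assumptions, not something confirmed in this thread.

```python
from typing import Any, Dict, Iterator


def comment_options(posts_per_page: int = 10) -> Dict[str, Any]:
    """Return the same options used in the original call (hypothetical helper)."""
    return {
        "comments": True,
        "reactors": True,
        "allow_extra_requests": True,
        "posts_per_page": posts_per_page,
    }


def scrape_with_cookies(
    page: str, cookies_file: str, pages: int = 3
) -> Iterator[Dict[str, Any]]:
    """Yield posts from `page`, sending cookies with every request.

    Hypothetical wrapper: `cookies_file` is assumed to be a cookies file
    exported from a logged-in browser session.
    """
    # Imported here so comment_options() works without the package installed.
    from facebook_scraper import exceptions, get_posts

    try:
        for post in get_posts(
            page,
            pages=pages,
            cookies=cookies_file,  # assumption: path to an exported cookies file
            extra_info=True,
            timeout=600,
            options=comment_options(),
        ):
            yield post
    except exceptions.LoginRequired as error:
        # Raised when Facebook asks for a login, e.g. while fetching the
        # full HTML of a long post as in the traceback above.
        print(f"Login required, check the cookies file: {error}")


# Example usage (not run here, needs network access and valid cookies):
# for post in scrape_with_cookies("namepage", "cookies.txt"):
#     print(post.get("post_id"), len(post.get("comments_full") or []))
```

Counting the comments per post, as in the commented usage lines, may help show whether the number returned changes once cookies are supplied.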