kevinzg / facebook-scraper

Scrape Facebook public pages without an API key
MIT License

UnexpectedResponse: Your request couldn't be processed #864

Open jeffsnack opened 1 year ago

jeffsnack commented 1 year ago

Here is my code and the error message.

I think maybe I was making requests too frequently, so Facebook blocked me for a while?

Any suggestions?

from facebook_scraper import get_posts, set_cookies
import pandas as pd
from http import cookiejar
import requests

# Flatten one comment (or reply) dict into a flat row for the output table
def format_comment(c):
    obj = {
        "comment_id": c["comment_id"],
        "comment_text": c["comment_text"],
        "comment_reaction_count": c["comment_reaction_count"] or 0,
        "reply_count": len(c["replies"]) if "replies" in c else 0,
        "comment_time": c["comment_time"]
    }
    if c["comment_reactions"]:
        obj.update(c["comment_reactions"])
    return obj

fb_comments = []

file = 'facebook.com_cookies.txt'
cookie = cookiejar.MozillaCookieJar()
cookie.load(file)
cookies = requests.utils.dict_from_cookiejar(cookie)
set_cookies(cookies)  # this call raises the error below

posts = get_posts(group='atomyprobiotics', pages=2, options={"comments": True, "posts_per_page": 4})
for post in posts:
    # print(post['comments_full'])
    if post['comments_full']:
        for comment in post['comments_full']:
            fb_comments.append(format_comment(comment))
            for reply in comment.get('replies', []):  # guard against comments with no replies
                fb_comments.append(format_comment(reply))
print(fb_comments)

pd.DataFrame(fb_comments).to_csv('fbcomments.csv', index=False)
UnexpectedResponse                        Traceback (most recent call last)
<ipython-input-13-cae0d8291e05> in <module>
     21 cookie.load(file)
     22 cookies = requests.utils.dict_from_cookiejar(cookie)
---> 23 set_cookies(cookies)
     24 
     25 posts = get_posts(group='pythontw', pages=2, options={"comments":True,"posts_per_page":4})

~\Anaconda3\lib\site-packages\facebook_scraper\__init__.py in set_cookies(cookies)
     48             raise exceptions.InvalidCookies(f"Missing cookies with name(s): {missing_cookies}")
     49         _scraper.session.cookies.update(cookies)
---> 50         if not _scraper.is_logged_in():
     51             raise exceptions.InvalidCookies(f"Cookies are not valid")
     52 

~\Anaconda3\lib\site-packages\facebook_scraper\facebook_scraper.py in is_logged_in(self)
    979     def is_logged_in(self) -> bool:
    980         try:
--> 981             self.get('https://m.facebook.com/settings')
    982             return True
    983         except exceptions.LoginRequired:

~\Anaconda3\lib\site-packages\facebook_scraper\facebook_scraper.py in get(self, url, **kwargs)
    897                     raise exceptions.NotFound(title.text)
    898                 elif title.text.lower() == "error":
--> 899                     raise exceptions.UnexpectedResponse("Your request couldn't be processed")
    900                 elif title.text.lower() in temp_ban_titles:
    901                     raise exceptions.TemporarilyBanned(title.text)

UnexpectedResponse: Your request couldn't be processed
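
In case it helps, one thing I'm considering is wrapping set_cookies in a retry with a long pause, using the exception classes visible in the traceback. The retry count and wait time below are just guesses, and set_cookies_with_retry is my own helper, not part of the library:

import time
from facebook_scraper import set_cookies, exceptions

def set_cookies_with_retry(cookies, attempts=3, wait_seconds=600):
    # Hypothetical helper: retry a few times with a long pause, in case the
    # "Your request couldn't be processed" block is only temporary.
    for attempt in range(attempts):
        try:
            set_cookies(cookies)
            return
        except (exceptions.UnexpectedResponse, exceptions.TemporarilyBanned) as err:
            print(f"Attempt {attempt + 1}/{attempts} failed: {err}; sleeping {wait_seconds}s")
            time.sleep(wait_seconds)
    raise RuntimeError("set_cookies still failing after retries; cookies may be invalid or the account blocked")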
joshcbrown commented 1 year ago

Also experiencing this. I only downloaded the package very recently and the issue started on my first request, so I don't think it's a frequency thing.

joshcbrown commented 1 year ago

Have you tried this?

jeffsnack commented 1 year ago

Have you tried this?

Yes, I tried it.

I updated the package and it worked yesterday.

But it can't scrape anything today. I don't know what's going on...

tdxius commented 1 year ago

Same thing happening to me. I've been using the scraper for about a year and was all good until a few days ago.

Some details of my scraping job:

I was using v0.2.48 and have now updated it to v0.2.58. Same thing happening in both versions.

LuciaIllari commented 1 year ago

I am also getting the UnexpectedResponse: Your request couldn't be processed message. I pull posts from the same group of public pages about every two weeks using cookies. Currently using version 0.2.56. Also, when I try to get the page info for any page, I'm not getting anything back anymore (example):

print(get_page_info("Nintendo"))
{'reviews': <generator object FacebookScraper.get_page_reviews at 0x00000282339A9CF0>}

even for pages I know I was previously getting results for, and which I have verified are still up and not private.
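
In case it helps anyone debug this, turning on debug logging shows what the scraper is actually requesting. This only uses the standard library; I'm assuming the package logs under its own name:

import logging
from facebook_scraper import get_page_info

# Show debug output from the scraper (assumes its logger is named "facebook_scraper")
logging.basicConfig(level=logging.DEBUG)
logging.getLogger("facebook_scraper").setLevel(logging.DEBUG)

print(get_page_info("Nintendo"))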

timsayshey commented 1 year ago

Yep, same issue here. Was grabbing posts from public pages. Worked for almost a year but it's busted now. Guess Facebook caught on.

neon-ninja commented 1 year ago

Try the latest master branch
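
For example, something like pip install --upgrade git+https://github.com/kevinzg/facebook-scraper.git should pull it straight from GitHub (assuming pip and git are available).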

timsayshey commented 1 year ago

Thanks @neon-ninja. However, after updating I am now getting the following response: Facebook says 'Unsupported Browser'

neon-ninja commented 1 year ago

That's just a warning, not an error

tdxius commented 1 year ago

I tried v0.2.59 and works like a charm 🥳 Thank you!

timsayshey commented 1 year ago

I thought this was fixed for me but it's back. I'm on the latest code btw.

Here's my error:

sys:1: UserWarning: A low page limit (<=2) might return no results, try increasing the limit
Traceback (most recent call last):
  File "/mnt/storage/Dropbox/Apps/Instamemes/instameme.py", line 35, in <module>
    for post in get_posts(pageID, pages=1, cookies='/mnt/storage/Dropbox/Apps/Instamemes/cookies.txt'):
  File "/home/tim/.local/lib/python3.9/site-packages/facebook_scraper/facebook_scraper.py", line 1114, in _generic_get_posts
    for i, page in zip(counter, iter_pages_fn()):
  File "/home/tim/.local/lib/python3.9/site-packages/facebook_scraper/page_iterators.py", line 87, in generic_iter_pages
    response = request_fn(next_url)
  File "/home/tim/.local/lib/python3.9/site-packages/facebook_scraper/facebook_scraper.py", line 927, in get
    raise exceptions.UnexpectedResponse("Your request couldn't be processed")
facebook_scraper.exceptions.UnexpectedResponse: Your request couldn't be processed

And here is my code:

import json
import time
from facebook_scraper import get_posts, set_user_agent

set_user_agent("Mozilla/5.0 (Windows NT 10.0; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/103.0.0.0 Safari/537.36")

pages = [
    '205332729555617',
    '195392310760',
]

for pageID in pages:
    for post in get_posts(pageID, pages=1, cookies='/mnt/storage/Dropbox/Apps/Instamemes/cookies.txt'):
        print(json.dumps(post, indent=4, sort_keys=True, default=str))
    # avoid getting banned
    time.sleep(2)

Any ideas?

Thanks!

jeffsnack commented 1 year ago

I thought this was fixed for me but it's back. I'm on the latest code btw. [...] Any ideas?

Maybe if you set pages to more than 2, I think it will work.
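
For example, the same call as in your snippet above, but with pages above the warning threshold (3 here is an arbitrary value; same page ID and cookie path as before):

import json
from facebook_scraper import get_posts

# pages > 2 stays above the "low page limit (<=2) might return no results" warning
for post in get_posts('205332729555617', pages=3, cookies='/mnt/storage/Dropbox/Apps/Instamemes/cookies.txt'):
    print(json.dumps(post, indent=4, sort_keys=True, default=str))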

timsayshey commented 1 year ago

@jeffsnack Yeah, that's what fixed it. I never needed more than one page in the past, but glad it's working now.