hhsm95 / FacebookPostsScraper

Scraper for posts in Facebook user profiles, pages and groups
MIT License
140 stars 55 forks

KeyError: 'story_fbid' #8

Open Rahulsunny11 opened 4 years ago

Rahulsunny11 commented 4 years ago

Hey, I ran your script with python main.py and got the error below:

  File "main.py", line 25, in main
    data = fps.get_posts_from_list(profiles)
  File "C:\Users\ADMIN\Downloads\Compressed\FacebookPostsScraper-master\FacebookPostsScraper.py", line 133, in get_posts_from_list
    posts = self.get_posts_from_profile(profile)
  File "C:\Users\ADMIN\Downloads\Compressed\FacebookPostsScraper-master\FacebookPostsScraper.py", line 193, in get_posts_from_profile
    post_url = f'{p_url.scheme}://{p_url.hostname}{p_url.path}?story_fbid={qs["story_fbid"][0]}&id={qs["id"][0]}'
KeyError: 'story_fbid'

Also, can you tell me where this expression came from?

f'{p_url.scheme}://{p_url.hostname}{p_url.path}?story_fbid={qs["story_fbid"][0]}&id={qs["id"][0]}'

It would be a great help if you also included columns for reactions, likes, haha, sad, comments, and shares.

Thank you.

rsorma04 commented 4 years ago

I ran into the same issue. In the library's file (FacebookPostsScraper.py), the code that cleans the post link (lines 185 to 197) only handles the story_fbid case. Some post links have no story_fbid parameter, only an fbid, and that case needs to be handled too.
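To make the failure concrete, here is a minimal sketch using only the standard library. The two links are made-up examples of the two URL shapes described above, and the ID values are placeholders, not real posts; query_keys is a hypothetical helper written for this illustration.

```python
from urllib.parse import urlparse, parse_qs

# Hypothetical post links; the IDs are placeholders, not real posts.
story_link = "https://www.facebook.com/story.php?story_fbid=111&id=222"
photo_link = "https://www.facebook.com/photo.php?fbid=333&id=222"


def query_keys(url):
    """Return the query-string parameter names of a URL, sorted."""
    return sorted(parse_qs(urlparse(url).query))


print(query_keys(story_link))  # ['id', 'story_fbid']
print(query_keys(photo_link))  # ['fbid', 'id']

# Indexing a key that is not in the parsed query raises the KeyError
# seen in the traceback.
qs = parse_qs(urlparse(photo_link).query)
try:
    qs["story_fbid"][0]
except KeyError as err:
    print(f"KeyError: {err}")  # KeyError: 'story_fbid'
```

parse_qs returns a plain dict mapping each parameter name to a list of values, so any link without story_fbid fails on the qs["story_fbid"] lookup.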

I altered the script by adding a variable called id_key that captures the first key name returned by the parse_qs call, and added another condition, which you can see in the elif block. This fixed the issue for me. There may be other cases that are still not handled, but it worked well for this scenario:

# Clean the post link
if post_url is not None:
    post_url = post_url.get('href', '')
    if len(post_url) > 0:
        post_url = f'https://www.facebook.com{post_url}'
        p_url = urlparse(post_url)
        qs = parse_qs(p_url.query)
        # Name of the first query parameter, e.g. 'story_fbid' or 'fbid'
        id_key = next(iter(qs))
        if not is_group and id_key == 'story_fbid':
            post_url = f'{p_url.scheme}://{p_url.hostname}{p_url.path}?story_fbid={qs["story_fbid"][0]}&id={qs["id"][0]}'
        elif not is_group and id_key == 'fbid':
            post_url = f'{p_url.scheme}://{p_url.hostname}{p_url.path}?fbid={qs["fbid"][0]}&id={qs["id"][0]}'
        else:
            post_url = f'{p_url.scheme}://{p_url.hostname}{p_url.path}/permalink/{qs["id"][0]}/'
else:
    post_url = ''
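
For anyone who wants something a bit more defensive, the same idea can be sketched as a standalone function. This is not the library's code: clean_post_url is a hypothetical helper, and it makes two assumptions that differ from the snippet above. It checks whether a key is present instead of relying on which key comes first, and it returns the link unchanged when it finds nothing it recognizes, rather than raising.

```python
from urllib.parse import urlparse, parse_qs


def clean_post_url(href, is_group=False):
    """Build a canonical post URL from a relative href (hypothetical helper)."""
    if not href:
        return ""
    full_url = f"https://www.facebook.com{href}"
    p_url = urlparse(full_url)
    qs = parse_qs(p_url.query)
    base = f"{p_url.scheme}://{p_url.hostname}{p_url.path}"

    # Every rebuilt form needs the "id" parameter; without it, keep the link.
    if "id" not in qs:
        return full_url

    if is_group:
        return f"{base}/permalink/{qs['id'][0]}/"

    # Try the known post-ID parameters in order of preference.
    for key in ("story_fbid", "fbid"):
        if key in qs:
            return f"{base}?{key}={qs[key][0]}&id={qs['id'][0]}"

    # Assumption: an unrecognized link is passed through untouched.
    return full_url


print(clean_post_url("/story.php?story_fbid=111&id=222&refid=17"))
print(clean_post_url("/photo.php?fbid=333&id=222"))
```

Checking membership also avoids the StopIteration that next(iter(qs)) raises when a link has no query string at all.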
NicoSerranoP commented 3 years ago

As of January 2020, @rsorma04's solution worked for me! Thank you :D