kevinzg / facebook-scraper

Scrape Facebook public pages without an API key
MIT License

The order of photos is wrong. #304

Closed: gaoyunzhi closed this issue 3 years ago

gaoyunzhi commented 3 years ago

Thank you for the library, I really like it.

One small issue: the order of the photos is wrong.

Example: https://www.facebook.com/shityoushouldcareabout/posts/923928528449438

Currently the library returns the photos in the following order: [0, 1, 2, 3, 9, 8, 7, 6, 5, 4]
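
To make the reported order concrete, here is a minimal sketch that treats each number as the photo's position in the original post (an assumption about the notation above). The helper `restore_order` and the placeholder photo names are hypothetical and not part of facebook-scraper.

```python
# Order reported in this issue; each value is assumed to be the photo's
# position in the original post.
reported_order = [0, 1, 2, 3, 9, 8, 7, 6, 5, 4]

# The first four photos are in place; the remaining six come back reversed.
assert reported_order == list(range(4)) + list(reversed(range(4, 10)))


def restore_order(returned, order):
    """Put each returned item back at its original position (hypothetical helper)."""
    restored = [None] * len(returned)
    for item, position in zip(returned, order):
        restored[position] = item
    return restored


# Placeholder names stand in for the image URLs the library returns.
returned_photos = [f"photo_{i}" for i in reported_order]
print(restore_order(returned_photos, reported_order))
```

This only illustrates the symptom; it is not a workaround that knows the correct order for an arbitrary post.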

neon-ninja commented 3 years ago

This commit (https://github.com/kevinzg/facebook-scraper/commit/4e4ca3d32842d0134dfd8725c3ebf6651c31ad4b) should fix the order for this kind of post and also make the extraction a bit more efficient. I tried extracting image IDs in the hope that they were incremental, but it seems they aren't.

import pprint
from facebook_scraper import get_posts

pprint.pprint(list(get_posts(
    post_urls=["https://www.facebook.com/shityoushouldcareabout/posts/923928528449438"]
)))
[{'available': True,
  'comments': 43,
  'comments_full': None,
  'factcheck': None,
  'image': 'https://scontent.fhlz2-1.fna.fbcdn.net/v/t1.6435-9/fr/cp0/e15/q65/191555842_923928365116121_3211975000799774074_n.jpg?_nc_cat=1&ccb=1-3&_nc_sid=9e2e56&_nc_ohc=uXVGIV7hA9UAX8FyGB9&_nc_ht=scontent.fhlz2-1.fna&tp=14&oh=6ee777700865a0111c5d6a8544051af0&oe=60DB7311',
  'image_id': '923928338449457',
  'image_ids': ['923928338449457',
                '923928345116123',
                '923928335116124',
                '923928358449455',
                '923928328449458',
                '923928348449456',
                '923928331782791',
                '923928351782789',
                '923928341782790',
                '923928355116122'],
  'image_lowquality': 'https://scontent.fhlz2-1.fna.fbcdn.net/v/t1.6435-0/cp0/e15/q65/s320x320/191555842_923928365116121_3211975000799774074_n.jpg?_nc_cat=1&ccb=1-3&_nc_sid=9e2e56&_nc_ohc=uXVGIV7hA9UAX8FyGB9&_nc_ht=scontent.fhlz2-1.fna&tp=9&oh=286dce76c4a741fd15a3ae4e6780420c&oe=60DBA796',
  'images': ['https://scontent.fhlz2-1.fna.fbcdn.net/v/t1.6435-9/fr/cp0/e15/q65/191555842_923928365116121_3211975000799774074_n.jpg?_nc_cat=1&ccb=1-3&_nc_sid=9e2e56&_nc_ohc=uXVGIV7hA9UAX8FyGB9&_nc_ht=scontent.fhlz2-1.fna&tp=14&oh=6ee777700865a0111c5d6a8544051af0&oe=60DB7311',
             'https://scontent.fhlz2-1.fna.fbcdn.net/v/t1.6435-9/fr/cp0/e15/q65/193475566_923928385116119_5631179111776189144_n.jpg?_nc_cat=1&ccb=1-3&_nc_sid=9e2e56&_nc_ohc=Wnu_lhALdCcAX-t-FQK&_nc_ht=scontent.fhlz2-1.fna&tp=14&oh=ea1d9775831d6e769899b5837e12ee19&oe=60DBC9CA',
             'https://scontent.fhlz2-1.fna.fbcdn.net/v/t1.6435-9/fr/cp0/e15/q65/193137803_923928391782785_6467264571047202409_n.jpg?_nc_cat=1&ccb=1-3&_nc_sid=9e2e56&_nc_ohc=GfEMwyudSA8AX9X_02J&_nc_ht=scontent.fhlz2-1.fna&tp=14&oh=f7acbc3c729cc4744f1cdc2506a6d498&oe=60DCC3F8',
             'https://scontent.fhlz2-1.fna.fbcdn.net/v/t1.6435-9/fr/cp0/e15/q65/193030557_923928388449452_3897708753366394193_n.jpg?_nc_cat=1&ccb=1-3&_nc_sid=9e2e56&_nc_ohc=bRsYt4NeWEUAX-9AiAf&_nc_ht=scontent.fhlz2-1.fna&tp=14&oh=4399528975601861d52f8af0f646f874&oe=60DBF92C',
             'https://scontent.fhlz2-1.fna.fbcdn.net/v/t1.6435-9/fr/cp0/e15/q65/193647203_923928361782788_4761461811123052454_n.jpg?_nc_cat=108&ccb=1-3&_nc_sid=9e2e56&_nc_ohc=ZNqWAkKe5z0AX975dqf&_nc_ht=scontent.fhlz2-1.fna&tp=14&oh=c269d136c620c92bca0e0e988840d4cf&oe=60DA6FC2',
             'https://scontent.fhlz2-1.fna.fbcdn.net/v/t1.6435-9/fr/cp0/e15/q65/193137803_923928381782786_7703709365348489027_n.jpg?_nc_cat=107&ccb=1-3&_nc_sid=9e2e56&_nc_ohc=UwvNq7GFImoAX9QivtS&_nc_ht=scontent.fhlz2-1.fna&tp=14&oh=65b9bc653454f6db38a95e0fcf685217&oe=60DA6063',
             'https://scontent.fhlz2-1.fna.fbcdn.net/v/t1.6435-9/fr/cp0/e15/q65/193922156_923928368449454_6267852204955330474_n.jpg?_nc_cat=104&ccb=1-3&_nc_sid=9e2e56&_nc_ohc=e95KvE5eljQAX_T-e7B&_nc_ht=scontent.fhlz2-1.fna&tp=14&oh=66667ddaea53f0eb593b30ee7d420c30&oe=60D9D907',
             'https://scontent.fhlz2-1.fna.fbcdn.net/v/t1.6435-9/fr/cp0/e15/q65/194399080_923928378449453_7885271252971789457_n.jpg?_nc_cat=107&ccb=1-3&_nc_sid=9e2e56&_nc_ohc=mX00rhxOiyUAX836mYG&_nc_ht=scontent.fhlz2-1.fna&tp=14&oh=98315407d88e0a894c80e1d2a74befd6&oe=60DAFADE',
             'https://scontent.fhlz2-1.fna.fbcdn.net/v/t1.6435-9/fr/cp0/e15/q65/193525502_923928371782787_3810577533418732593_n.jpg?_nc_cat=107&ccb=1-3&_nc_sid=9e2e56&_nc_ohc=HFmXPw8LUYEAX_w-ujj&_nc_ht=scontent.fhlz2-1.fna&tp=14&oh=84552ae07fb89f2c91e3db81c580a91f&oe=60DAE30E',
             'https://scontent.fhlz2-1.fna.fbcdn.net/v/t1.6435-9/fr/cp0/e15/q65/193922156_923928375116120_2686384336859608625_n.jpg?_nc_cat=108&ccb=1-3&_nc_sid=9e2e56&_nc_ohc=QC1-hzF2s9gAX9scnbu&_nc_ht=scontent.fhlz2-1.fna&tp=14&oh=79a158b7bbe24a7c58b9e0ddad267620&oe=60DA8E2D'],
  'images_description': ['May be an image of text that says "A CLOSER LOOK '
                         'REMAINS OF 215 CHILDREN ARE FOUND AT BC RESIDENTIAL '
                         'SCHOOL CONTENT WARNING THIS POST DISCUSSES WRONGFUL '
                         'DEATH ABUSE SUFFERED UNDER RESIDENTIAL SCHOOLS '
                         '#OnCanadaProject Follow @OnCanadaProject for '
                         'Critical, Credible, and Compassionate"',
                         'May be an image of text',
                         'May be an image of text',
                         'May be an image of 1 person and text',
                         'May be an image of 1 person and text',
                         'May be a Twitter screenshot of text',
                         'May be a Twitter screenshot of 1 person and text',
                         'May be an image of text',
                         'May be an image of text',
                         'May be an image of text'],
  'images_lowquality': ['https://scontent.fhlz2-1.fna.fbcdn.net/v/t1.6435-0/cp0/e15/q65/s320x320/191555842_923928365116121_3211975000799774074_n.jpg?_nc_cat=1&ccb=1-3&_nc_sid=9e2e56&_nc_ohc=uXVGIV7hA9UAX8FyGB9&_nc_ht=scontent.fhlz2-1.fna&tp=9&oh=286dce76c4a741fd15a3ae4e6780420c&oe=60DBA796',
                        'https://scontent.fhlz2-1.fna.fbcdn.net/v/t1.6435-0/cp0/e15/q65/s130x130/193475566_923928385116119_5631179111776189144_n.jpg?_nc_cat=1&ccb=1-3&_nc_sid=9e2e56&_nc_ohc=Wnu_lhALdCcAX-t-FQK&_nc_ht=scontent.fhlz2-1.fna&tp=9&oh=26cc0e41d21fb81a3fad670f6914df04&oe=60DBF62C',
                        'https://scontent.fhlz2-1.fna.fbcdn.net/v/t1.6435-0/cp0/e15/q65/s130x130/193137803_923928391782785_6467264571047202409_n.jpg?_nc_cat=1&ccb=1-3&_nc_sid=9e2e56&_nc_ohc=GfEMwyudSA8AX9X_02J&_nc_ht=scontent.fhlz2-1.fna&tp=9&oh=694df23c7d9461a2a1b5cf20c9c2a5f7&oe=60D9CE96',
                        'https://scontent.fhlz2-1.fna.fbcdn.net/v/t1.6435-0/cp0/e15/q65/s130x130/193030557_923928388449452_3897708753366394193_n.jpg?_nc_cat=1&ccb=1-3&_nc_sid=9e2e56&_nc_ohc=bRsYt4NeWEUAX-9AiAf&_nc_ht=scontent.fhlz2-1.fna&tp=9&oh=4f41264e17b11eb30f57e9c5335b4973&oe=60D9A84A'],
  'images_lowquality_description': ['May be an image of text that says "A '
                                    'CLOSER LOOK REMAINS OF 215 CHILDREN ARE '
                                    'FOUND AT BC RESIDENTIAL SCHOOL CONTENT '
                                    'WARNING THIS POST DISCUSSES WRONGFUL '
                                    'DEATH ABUSE SUFFERED UNDER RESIDENTIAL '
                                    'SCHOOLS #OnCanadaProject Follow '
                                    '@OnCanadaProject for Critical, Credible, '
                                    'and Compassionate"',
                                    'May be an image of text',
                                    'May be an image of text',
                                    'May be an image of 1 person and text'],
  'is_live': False,
  'likes': None,
  'link': 'https://www.irsss.ca/about-us',
  'original_request_url': 'https://www.facebook.com/shityoushouldcareabout/posts/923928528449438',
  'post_id': '923928528449438',
  'post_text': 'In Kamloops, Canada, the remains of 215 children, some as '
               'young as three years old, have been found at the site of a '
               'former residential school for indigenous children.\n'
               '\n'
               'The children were students at the Kamloops Indian Residential '
               'School in British Columbia that closed in 1978, according to '
               "the Tk'emlúps te Secwépemc Nation, which said the remains "
               'were found with the help of a ground penetrating radar '
               'specialist.\n'
               '\n'
               'These slides are from @oncanadaproject who you can follow for '
               'more info xx',
  'post_url': 'https://facebook.com/story.php?story_fbid=923928528449438&id=262108697964761',
  'reaction_count': None,
  'reactions': None,
  'reactors': None,
  'shared_post_id': None,
  'shared_post_url': None,
  'shared_text': '',
  'shared_time': None,
  'shared_user_id': None,
  'shared_username': None,
  'shares': 4946,
  'text': 'In Kamloops, Canada, the remains of 215 children, some as young as '
          'three years old, have been found at the site of a former '
          'residential school for indigenous children.\n'
          '\n'
          'The children were students at the Kamloops Indian Residential '
          'School in British Columbia that closed in 1978, according to the '
          "Tk'emlúps te Secwépemc Nation, which said the remains were found "
          'with the help of a ground penetrating radar specialist.\n'
          '\n'
          'These slides are from @oncanadaproject who you can follow for more '
          'info xx',
  'time': datetime.datetime(2021, 5, 30, 21, 50, 51),
  'user_id': '262108697964761',
  'user_url': 'https://facebook.com/shityoushouldcareabout/?__tn__=C-R',
  'username': 'shit you should care about',
  'video': None,
  'video_duration_seconds': None,
  'video_height': None,
  'video_id': None,
  'video_quality': None,
  'video_size_MB': None,
  'video_thumbnail': None,
  'video_watches': None,
  'video_width': None,
  'w3_fb_url': None}]
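
The remark above that image IDs are not incremental can be checked against the `image_ids` in this output. This is a minimal sketch; `is_increasing` and `order_by_id` are hypothetical names introduced for illustration, not library functions.

```python
# image_ids copied from the output above, in the order the library returned them.
image_ids = [
    "923928338449457",
    "923928345116123",
    "923928335116124",
    "923928358449455",
    "923928328449458",
    "923928348449456",
    "923928331782791",
    "923928351782789",
    "923928341782790",
    "923928355116122",
]


def is_increasing(ids):
    """Return True if the numeric IDs strictly increase from one photo to the next."""
    numbers = [int(i) for i in ids]
    return all(a < b for a, b in zip(numbers, numbers[1:]))


# Positions the photos would take if they were sorted by numeric ID.
order_by_id = sorted(range(len(image_ids)), key=lambda i: int(image_ids[i]))

print(is_increasing(image_ids))  # False
print(order_by_id)               # [4, 6, 2, 0, 8, 1, 5, 7, 9, 3]
```

If the order in this output matches the post, sorting by ID would scramble the photos rather than restore their order.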
gaoyunzhi commented 3 years ago

Thank you!