streamcon closed this issue 9 months ago
@streamcon the app uses m.facebook.com by default and that doesn't work anymore; we are using mbasic for everything now. Please call get_posts like this, paying attention to the base_url and start_url arguments:
from facebook_scraper import get_posts

for post in get_posts('nintendo', base_url="https://mbasic.facebook.com", start_url="https://mbasic.facebook.com/nintendo?v=timeline", pages=1):
    print(post['text'][:50])
or add
re.findall(r"share_id\":([\d:\"]*)", self.element.attrs["data-store"])
in def extract_post_id in extractors.py
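For reference, a minimal sketch of what that pattern captures. The data-store value below is a made-up sample (an assumption, not a real Facebook payload), and find_share_ids is a hypothetical helper name, not part of facebook_scraper:

```python
import re

# Same pattern as quoted above: matches `share_id":` and captures the
# following run of digits, colons and double quotes.
SHARE_ID_PATTERN = r"share_id\":([\d:\"]*)"

def find_share_ids(data_store: str) -> list:
    """Return every share_id value found in a data-store attribute string."""
    return re.findall(SHARE_ID_PATTERN, data_store)

# Hypothetical attribute value, for illustration only.
sample_data_store = '{"share_id":123456789012345,"other":"x"}'
print(find_share_ids(sample_data_store))
```

If the value is quoted in the attribute, the capture includes the quotes, so it would need to be stripped before use.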
@streamcon that only works for m.facebook.com, and we don't know exactly how it interacts with the other post types (groups, single posts and so on). The easiest way to get the repo working again is to use the mbasic arguments. Besides, the regular FB website changes based on the cookies you are using, which does not seem to be the case for mbasic, so I think we should stick with it for now.
but
from facebook_scraper import get_posts, set_user_agent
set_user_agent("Mozilla/5.0 (Linux; Android 6.0.1; Nexus 5X Build/MMB29P) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/41.0.2272.96 Mobile Safari/537.36 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)")
for post in get_posts('nintendo', base_url="https://mbasic.facebook.com", start_url="https://mbasic.facebook.com/nintendo?v=timeline", pages=3, cookies='cookie2.json'):
    print(post['text'][:50])
    print(post)
output
Thank you to all for enjoying The Legend of Zelda:
{'post_id': None, 'text': 'Thank you to all for enjoying The Legend of Zelda: Tears of the Kingdom and voting for it at The Game Awards this year!', 'post_text': 'Thank you to all for enjoying The Legend of Zelda: Tears of the Kingdom and voting for it at The Game Awards this year!', 'shared_text': '', 'original_text': None, 'time': datetime.datetime(2023, 12, 8, 9, 32), 'timestamp': None, 'image': 'https://scontent-arn2-1.xx.fbcdn.net/v/t39.30808-6/406464103_739497268205878_6182010635183516327_n.jpg?stp=cp0_dst-jpg_e15_fr_q65&_nc_cat=104&ccb=1-7&_nc_sid=ab7367&efg=eyJpIjoidCJ9&_nc_ohc=n-lrDah-bq8AX-7FHsB&_nc_ht=scontent-arn2-1.xx&oh=00_AfDbrpaMe8Y6DOF-JO8BpZMv0n28I2R9Nro7y4TShKmLTw&oe=657A70BE&manual_redirect=1', 'image_lowquality': 'https://scontent-arn2-1.xx.fbcdn.net/v/t39.30808-6/406464103_739497268205878_6182010635183516327_n.jpg?stp=cp0_dst-jpg_e15_q65_s1080x2048&_nc_cat=104&ccb=1-7&_nc_sid=ab7367&efg=eyJpIjoiYiJ9&_nc_ohc=n-lrDah-bq8AX-7FHsB&_nc_ht=scontent-arn2-1.xx&oh=00_AfDvtioHQrAozlXs7B6he1imqokRmJfftRXN1ncIH0Q7XQ&oe=657A70BE', 'images': ['https://scontent-arn2-1.xx.fbcdn.net/v/t39.30808-6/406464103_739497268205878_6182010635183516327_n.jpg?stp=cp0_dst-jpg_e15_fr_q65&_nc_cat=104&ccb=1-7&_nc_sid=ab7367&efg=eyJpIjoidCJ9&_nc_ohc=n-lrDah-bq8AX-7FHsB&_nc_ht=scontent-arn2-1.xx&oh=00_AfDbrpaMe8Y6DOF-JO8BpZMv0n28I2R9Nro7y4TShKmLTw&oe=657A70BE&manual_redirect=1'], 'images_description': ["May be a graphic of \u200etext that says '\u200eTHEGAME GAME AWARDS WINNER BEST ACTION/ADVENTURE GAME ZELDA ۔ KINGDOM ÛHELEED ELDA TEARS\u200e'\u200e"], 'images_lowquality': ['https://scontent-arn2-1.xx.fbcdn.net/v/t39.30808-6/406464103_739497268205878_6182010635183516327_n.jpg?stp=cp0_dst-jpg_e15_q65_s1080x2048&_nc_cat=104&ccb=1-7&_nc_sid=ab7367&efg=eyJpIjoiYiJ9&_nc_ohc=n-lrDah-bq8AX-7FHsB&_nc_ht=scontent-arn2-1.xx&oh=00_AfDvtioHQrAozlXs7B6he1imqokRmJfftRXN1ncIH0Q7XQ&oe=657A70BE'], 'images_lowquality_description': ["May be a graphic of \u200etext that says '\u200eTHEGAME GAME 
AWARDS WINNER BEST ACTION/ADVENTURE GAME ZELDA ۔ KINGDOM ÛHELEED ELDA TEARS\u200e'\u200e"], 'video': None, 'video_duration_seconds': None, 'video_height': None, 'video_id': None, 'video_quality': None, 'video_size_MB': None, 'video_thumbnail': None, 'video_watches': None, 'video_width': None, 'likes': 10000, 'comments': 563, 'shares': 1600, 'post_url': 'https://facebook.com/story.php?story_fbid=pfbid02WVPVEeJ7jDRMr4WwSSMMb7PRqNfqqBJZs4bB4aBJ3XvgMiTxrsHKZVTaBqukkJ8Ll&id=100064368354094', 'link': None, 'links': [{'link': '/story.php?story_fbid=pfbid02WVPVEeJ7jDRMr4WwSSMMb7PRqNfqqBJZs4bB4aBJ3XvgMiTxrsHKZVTaBqukkJ8Ll&id=100064368354094&eav=AfaFP0sp3WXHigA2Xieq7BQRoCEIqyLRMXWpQRoCfxAevayQPgZCN95cnUlonILq7_c&m_entstream_source=timeline&refid=17&paipv=0', 'text': ''}, {'link': 'https://mbasic.facebook.com/photo.php?fbid=739497278205877&id=100064368354094&set=a.604604228361850&eav=AfbOXv5LyoT2YNADccGFlF2NO0mw-g-5D5xLOYto3MKnM00-P2sDakgyz1cYmlZ4r8c&paipv=0&source=48&refid=17', 'text': ''}], 'user_id': None, 'username': 'Nintendo of America', 'user_url': 'https://facebook.com/NintendoAmerica/?lst=61553879472338%3A100064368354094%3A1702151109&eav=AfZnk4h6lYaEnb7oa3qnX-3rgBwgbXkiWxWlUyk2m-bjAsGyQZfZLSNaguhIqY6QBZ0&refid=17&paipv=0', 'is_live': False, 'factcheck': None, 'shared_post_id': None, 'shared_time': None, 'shared_user_id': None, 'shared_username': None, 'shared_user_url': None, 'shared_post_url': None, 'available': True, 'comments_full': None, 'reactors': None, 'w3_fb_url': None, 'reactions': None, 'reaction_count': 10000, 'with': None, 'page_id': None, 'sharers': None, 'translated_text': '', 'image_id': '739497278205877', 'image_ids': ['739497278205877'], 'was_live': False}
The Super Smash Bros. amiibo of Kingdom Hearts’ So
{'post_id': None, 'text': 'The Super Smash Bros. amiibo of Kingdom Hearts’ Sora will be released on February 16th 2024!', 'post_text': 'The Super Smash Bros. amiibo of Kingdom Hearts’ Sora will be released on February 16th 2024!', 'shared_text': '', 'original_text': None, 'time': datetime.datetime(2023, 12, 6, 17, 7), 'timestamp': None, 'image': None, 'image_lowquality': 'https://scontent-arn2-1.xx.fbcdn.net/v/t15.5256-10/408311684_322940160667403_2458471000675836479_n.jpg?stp=cp0_dst-jpg_e15_p720x720_q65&_nc_cat=104&ccb=1-7&_nc_sid=f3b36a&efg=eyJpIjoiYiJ9&_nc_ohc=N71w-AFx96UAX_aKd6Z&tn=7WMAHMjd06AW0FGV&_nc_ht=scontent-arn2-1.xx&oh=00_AfBYaih_YakmXASj3Ad_g-eYNfMDEogie2Vkj60hhTmv_g&oe=65792611', 'images': [], 'images_description': [], 'images_lowquality': ['https://scontent-arn2-1.xx.fbcdn.net/v/t15.5256-10/408311684_322940160667403_2458471000675836479_n.jpg?stp=cp0_dst-jpg_e15_p720x720_q65&_nc_cat=104&ccb=1-7&_nc_sid=f3b36a&efg=eyJpIjoiYiJ9&_nc_ohc=N71w-AFx96UAX_aKd6Z&tn=7WMAHMjd06AW0FGV&_nc_ht=scontent-arn2-1.xx&oh=00_AfBYaih_YakmXASj3Ad_g-eYNfMDEogie2Vkj60hhTmv_g&oe=65792611'], 'images_lowquality_description': [None], 'video': 'https://scontent-arn2-1.xx.fbcdn.net/v/t42.1790-2/407540787_382017990845112_4737937475628955811_n.mp4?_nc_cat=105&ccb=1-7&_nc_sid=55d0d3&efg=eyJybHIiOjY2MiwicmxhIjo1MTIsInZlbmNvZGVfdGFnIjoic3ZlX3NkIn0%3D&_nc_ohc=xB5Xt9_W-8oAX_OY3Te&_nc_rml=0&_nc_ht=scontent-arn2-1.xx&oh=00_AfCsAoRLuk244UQGHqrlANW-kAlY4kJf1smqBgrLeiiAFg&oe=65793E2D', 'video_duration_seconds': None, 'video_height': None, 'video_id': '709312260916437', 'video_quality': None, 'video_size_MB': None, 'video_thumbnail': None, 'video_watches': None, 'video_width': None, 'likes': 2600, 'comments': 279, 'shares': 754, 'post_url': 'https://facebook.com/story.php?story_fbid=pfbid01GEJp4CX1PGoqXNZponB2wFqQviS86NS7dwHzRbb8Zhghu1CxYJKt3XoKbWSachpl&id=100064368354094', 'link': None, 'links': [{'link': 
'/story.php?story_fbid=pfbid01GEJp4CX1PGoqXNZponB2wFqQviS86NS7dwHzRbb8Zhghu1CxYJKt3XoKbWSachpl&id=100064368354094&eav=Afb2UR5DHWHROPOE9ku0ltfM7c4hmYogVKFN_EEGRrctG47_ut8XIHLz26f5BTavCL4&m_entstream_source=timeline&refid=17&paipv=0', 'text': ''}], 'user_id': None, 'username': 'Nintendo of America', 'user_url': 'https://facebook.com/NintendoAmerica/?lst=61553879472338%3A100064368354094%3A1702151109&eav=AfZnk4h6lYaEnb7oa3qnX-3rgBwgbXkiWxWlUyk2m-bjAsGyQZfZLSNaguhIqY6QBZ0&refid=17&paipv=0', 'is_live': False, 'factcheck': None, 'shared_post_id': None, 'shared_time': None, 'shared_user_id': None, 'shared_username': None, 'shared_user_url': None, 'shared_post_url': None, 'available': True, 'comments_full': None, 'reactors': None, 'w3_fb_url': None, 'reactions': None, 'reaction_count': 2600, 'with': None, 'page_id': None, 'sharers': None, 'translated_text': '', 'image_id': None, 'image_ids': [], 'was_live': False}
post_id still unavailable
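As a possible stopgap while post_id is None: the post_url in the output above still carries a story_fbid query parameter, which could be read with the standard library. This is only a sketch. The pfbid token is not the numeric post_id, and story_fbid_from_url is a hypothetical helper, not part of facebook_scraper:

```python
from urllib.parse import urlparse, parse_qs

def story_fbid_from_url(post_url):
    """Return the story_fbid query parameter of a post URL, or None if absent."""
    if not post_url:
        return None
    query = parse_qs(urlparse(post_url).query)
    # parse_qs maps each key to a list of values; take the first one.
    return query.get("story_fbid", [None])[0]

# post_url copied from the first post in the output above.
example_url = (
    "https://facebook.com/story.php?story_fbid="
    "pfbid02WVPVEeJ7jDRMr4WwSSMMb7PRqNfqqBJZs4bB4aBJ3XvgMiTxrsHKZVTaBqukkJ8Ll"
    "&id=100064368354094"
)
print(story_fbid_from_url(example_url))
```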
@streamcon I tried it just now and it worked, but I did find an issue: using 'nintendo' automatically redirects to NintendoAmerica, which might not return the right result. Try again with 'NintendoAmerica' instead of just 'nintendo'. If that doesn't help, I will ask you to get new cookies.
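To keep the account name and start_url from drifting apart when switching from 'nintendo' to 'NintendoAmerica', the two mbasic arguments could be built from one name. A small sketch; mbasic_kwargs is a hypothetical helper, not part of facebook_scraper, and the URL layout is simply the one used in the examples in this thread:

```python
MBASIC_BASE_URL = "https://mbasic.facebook.com"

def mbasic_kwargs(account: str) -> dict:
    """Build the base_url/start_url keyword arguments for one account name."""
    return {
        "base_url": MBASIC_BASE_URL,
        "start_url": f"{MBASIC_BASE_URL}/{account}?v=timeline",
    }

print(mbasic_kwargs("NintendoAmerica"))

# Intended usage (requires facebook_scraper to be installed):
#   from facebook_scraper import get_posts
#   account = "NintendoAmerica"
#   for post in get_posts(account, pages=1, **mbasic_kwargs(account)):
#       print(post["post_id"], post["text"][:50])
```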
@moda20 I ran this example:
for post in scraper.get_posts('NintendoAmerica', base_url="https://mbasic.facebook.com", start_url="https://mbasic.facebook.com/NintendoAmerica?v=timeline", pages=1):
    print(post)
And I didn't get anything. I am using the latest version from the repo.
@chelishchev Please check your cookies and try other pages to see if it's still an issue.
@chelishchev I am closing this issue as it seems to be fixed
I can't get post_id when parsing
My code:
output: