kevinzg / facebook-scraper

Scrape Facebook public pages without an API key
MIT License

When extracting comments, there are some bugs I found #220

Closed (fashan7 closed this issue 3 years ago)

fashan7 commented 3 years ago

The Facebook post facebook.com/3246777818757987 currently has 4 comments, but when I scrape it with @kevinzg's project, the result contains other comments too. The JSON the script gave me is attached below.

Note: visiting some of the returned comments by comment ID redirects to a different post. Also, the script doesn't scrape replies. Please look into this bug, @kevinzg, @neon-ninja and @danijoo.

{
  "post_id":"3246777818757987",
  "text":"TNM SUPER LEAGUE UPDATE\n\nLeague leaders, Silver Strikers will host Chitipa United at Silver Stadium on Sunday. The Bankers top the standings with 28 points from 12 games.\n\nChitipa United, who lost 1-0 to Kamuzu Barracks on Saturday at Civo Stadium, are on thirteenth position with 13 points from 14 games.\n\nBlue Eagles, who are third from the bottom on the 16 member-log-table with 12 points from 13 games, will host Mafco FC at Nankhaka Stadium.\n\nMafco are on 4th position with 21 games from 13 games as well. #MBCNewsLive",
  "post_text":"TNM SUPER LEAGUE UPDATE\n\nLeague leaders, Silver Strikers will host Chitipa United at Silver Stadium on Sunday. The Bankers top the standings with 28 points from 12 games.\n\nChitipa United, who lost 1-0 to Kamuzu Barracks on Saturday at Civo Stadium, are on thirteenth position with 13 points from 14 games.\n\nBlue Eagles, who are third from the bottom on the 16 member-log-table with 12 points from 13 games, will host Mafco FC at Nankhaka Stadium.\n\nMafco are on 4th position with 21 games from 13 games as well. #MBCNewsLive",
  "shared_text":"",
  "time":"2021:04:24 23:12:57",
  "image":"None",
  "images":"None",
  "video":"None",
  "video_thumbnail":"None",
  "video_id":"None",
  "likes":32,
  "comments":4,
  "shares":0,
  "post_url":"https://facebook.com/mbctv.malawi/posts/3246777818757987",
  "link":"None",
  "user_id":"315802248522240",
  "username":"MBC Malawi",
  "is_live":false,
  "factcheck":"None",
  "shared_post_id":"None",
  "shared_time":"None",
  "shared_user_id":"None",
  "shared_username":"None",
  "shared_post_url":"None",
  "available":true,
  "comments_full":[
    { "comment_id":"3246783635424072", "commenter_url":"None", "commenter_name":"Sebastian Moses Moyo", "commenter_meta":"None", "comment_text":"You should be editing your things before posting", "comment_time":"2021:04:24 23:36:23" },
    { "comment_id":"3246800085422427", "commenter_url":"None", "commenter_name":"Tsegula Time", "commenter_meta":"None", "comment_text":"Mafco 21 games from 13 games what do u mean?", "comment_time":"2021:04:24 23:36:23" },
    { "comment_id":"3246798338755935", "commenter_url":"None", "commenter_name":"Westom Jaguar Bika", "commenter_meta":"None", "comment_text":"21 points from 13 games osati zanuzo", "comment_time":"2021:04:24 23:36:23" },
    { "comment_id":"3246908292078273", "commenter_url":"None", "commenter_name":"Lloyd Sato", "commenter_meta":"None", "comment_text":"Bulets ma poits angat", "comment_time":"2021:04:25 00:36:23" },
    { "comment_id":"3246783635424072", "commenter_url":"None", "commenter_name":"Sebastian Moses Moyo", "commenter_meta":"None", "comment_text":"You should be editing your things before posting", "comment_time":"2021:04:24 23:36:23" },
    { "comment_id":"3246800085422427", "commenter_url":"None", "commenter_name":"Tsegula Time", "commenter_meta":"None", "comment_text":"Mafco 21 games from 13 games what do u mean?", "comment_time":"2021:04:24 23:36:23" },
    { "comment_id":"3246798338755935", "commenter_url":"None", "commenter_name":"Westom Jaguar Bika", "commenter_meta":"None", "comment_text":"21 points from 13 games osati zanuzo", "comment_time":"2021:04:24 23:36:23" },
    { "comment_id":"3246908292078273", "commenter_url":"None", "commenter_name":"Lloyd Sato", "commenter_meta":"None", "comment_text":"Bulets ma poits angat", "comment_time":"2021:04:25 00:36:23" },
    { "comment_id":"3246907608745008", "commenter_url":"None", "commenter_name":"Ganizani Jona Mkombaphala Masiye", "commenter_meta":"None", "comment_text":"Zomachemelela anthu osatha mpira apa kut Meke azawayitane ku AFCON ife takana", "comment_time":"2021:04:25 00:36:24" },
    { "comment_id":"3246705708765198", "commenter_url":"None", "commenter_name":"Aaron Msamba", "commenter_meta":"None", "comment_text":"Congratulations", "comment_time":"2021:04:24 22:36:24" },
    { "comment_id":"3246758842093218", "commenter_url":"None", "commenter_name":"Andy Bregger Chimasula", "commenter_meta":"None", "comment_text":"To me P. Sambani amayenera kukhara Man-of-the-Matc\nh", "comment_time":"2021:04:24 23:36:24" },
    { "comment_id":"3246747165427719", "commenter_url":"None", "commenter_name":"Harry Kaliati", "commenter_meta":"None", "comment_text":"Congratulations\nidana", "comment_time":"2021:04:24 23:36:24" },
    { "comment_id":"3246714642097638", "commenter_url":"None", "commenter_name":"Bless Mlera", "commenter_meta":"None", "comment_text":"Precious sambani deserved to be man of the match..", "comment_time":"2021:04:24 23:36:24" },
    { "comment_id":"3246702122098890", "commenter_url":"None", "commenter_name":"Isaac Kayira", "commenter_meta":"None", "comment_text":"Tell us all man of the match ,of today's games", "comment_time":"2021:04:24 22:36:24" },
    { "comment_id":"3246828178752951", "commenter_url":"None", "commenter_name":"Bonfce J Nior Mbozanani", "commenter_meta":"None", "comment_text":"Ndix more fire man chimzy 🎊🎊", "comment_time":"2021:04:24 23:36:24" },
    { "comment_id":"3246724915429944", "commenter_url":"None", "commenter_name":"Finley Nanlaku", "commenter_meta":"None", "comment_text":"Mwantani sadyeka?", "comment_time":"2021:04:24 23:36:24" },
    { "comment_id":"3246719025430533", "commenter_url":"None", "commenter_name":"None", "commenter_meta":"None", "comment_text":"Mac Donald Kazingatchire", "comment_time":"2021:04:24 23:36:24" },
    { "comment_id":"3246743388761430", "commenter_url":"None", "commenter_name":"Willy Mulenga", "commenter_meta":"None", "comment_text":"Kod amene amalemba nkani pa tsamba iri ndiwa bullets?", "comment_time":"2021:04:24 23:36:24" },
    { "comment_id":"3246710558764713", "commenter_url":"None", "commenter_name":"Msungwi Deacon Justin", "commenter_meta":"None", "comment_text":"Amuna", "comment_time":"2021:04:24 23:36:24" },
    { "comment_id":"3246750582094044", "commenter_url":"None", "commenter_name":"Dennis Mbotwa", "commenter_meta":"None", "comment_text":"Imeneyi akuti ndi staliyo osati style 😂😂😂🤣", "comment_time":"2021:04:24 23:36:24" },
    { "comment_id":"3246892872079815", "commenter_url":"None", "commenter_name":"Hemes Dzanjo", "commenter_meta":"None", "comment_text":"Katundu", "comment_time":"2021:04:25 00:36:24" },
    { "comment_id":"3246737785428657", "commenter_url":"None", "commenter_name":"Diston Wys", "commenter_meta":"None", "comment_text":"Koma zalazo mwati che Idana ndichani?", "comment_time":"2021:04:24 23:36:24" },
    { "comment_id":"3247009135401522", "commenter_url":"None", "commenter_name":"Chifuniro Mula", "commenter_meta":"None", "comment_text":"Zampira idana", "comment_time":"2021:04:25 01:36:24" },
    { "comment_id":"2901603796759004", "commenter_url":"None", "commenter_name":"Yaokon Akundemeka", "commenter_meta":"None", "comment_text":"Watching from Dubai", "comment_time":"2021:04:24 22:36:24" },
    { "comment_id":"2901665066752877", "commenter_url":"None", "commenter_name":"Gift Therm T Maloya", "commenter_meta":"None", "comment_text":"crazy", "comment_time":"2021:04:24 23:36:24" },
    { "comment_id":"2901685373417513", "commenter_url":"None", "commenter_name":"Mchere Watha", "commenter_meta":"None", "comment_text":"Crazy watching from Lebanon", "comment_time":"2021:04:24 23:36:24" }
  ],
  "reactors":"None",
  "w3_fb_url":"None"
}

neon-ninja commented 3 years ago

Looks like this problem only occurs when unauthenticated, i.e., when not passing cookies. The reason appears to be that, in that case, Facebook shows other "related" posts ("Recent post by page") on the page, so their comments presumably get scraped too.

https://github.com/kevinzg/facebook-scraper/pull/225 should fix this.
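
For results that were already scraped without cookies, the symptom can be checked after the fact. This is only a rough sketch, assuming each post is a dict shaped like the JSON in this issue (`comments`, `comments_full`, `comment_id`). The helper names `dedupe_comments` and `has_extra_comments` are hypothetical and not part of facebook-scraper, and the check is a heuristic rather than the fix from the PR.

```python
def dedupe_comments(comments):
    """Drop entries whose comment_id was already seen, keeping first-seen order."""
    seen = set()
    unique = []
    for comment in comments:
        cid = comment["comment_id"]
        if cid not in seen:
            seen.add(cid)
            unique.append(comment)
    return unique


def has_extra_comments(post):
    """Heuristic: True when more unique comments were scraped than the
    post's own "comments" counter reports, which hints that comments
    from other posts were picked up."""
    unique = dedupe_comments(post.get("comments_full") or [])
    return len(unique) > post.get("comments", 0)


# Trimmed-down stand-in for the post in this issue (made-up IDs and text).
post = {
    "comments": 2,
    "comments_full": [
        {"comment_id": "1", "comment_text": "a"},
        {"comment_id": "2", "comment_text": "b"},
        {"comment_id": "1", "comment_text": "a"},  # verbatim duplicate
        {"comment_id": "3", "comment_text": "c"},  # not counted by the post
    ],
}

print(len(dedupe_comments(post["comments_full"])))  # 3
print(has_extra_comments(post))  # True
```

Deduplicating only removes the repeated entries. It cannot tell which of the remaining comments belong to a "related" post, so the count comparison is just a signal that a post should be re-scraped with cookies.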