kevinzg / facebook-scraper

Scrape Facebook public pages without an API key
MIT License
2.4k stars, 627 forks

Found an issue when scraping. [Comment Count] #237

fashan7 closed this issue 3 years ago

fashan7 commented 3 years ago

When a post has fewer than 1k comments, the scraper returns a comment count of 0:

{"comments":0, "shares":0 }

I don't know why the shares count is also 0.

https://m.facebook.com/story.php?story_fbid=473186744004675&id=1732816076985345

Please check this, @kevinzg and @neon-ninja.

neon-ninja commented 3 years ago

Working fine for me:

{'available': True,
 'comments': 15,
 'comments_full': None,
 'factcheck': None,
 'image': None,
 'image_lowquality': None,
 'images': [],
 'is_live': False,
 'likes': 26,
 'link': None,
 'post_id': '473186744004675',
 'post_text': "PRESIDENT OF ZIMBABWE, CDE EMMERSON DAMBUDZO MNANGAGWA'S "
              'WORKERS DAY MESSAGE.',
 'post_url': 'https://facebook.com/zanupfparty/posts/473186744004675',
 'reactors': None,
 'shared_post_id': None,
 'shared_post_url': None,
 'shared_text': '',
 'shared_time': None,
 'shared_user_id': None,
 'shared_username': None,
 'shares': 1,
 'text': "PRESIDENT OF ZIMBABWE, CDE EMMERSON DAMBUDZO MNANGAGWA'S WORKERS DAY "
         'MESSAGE.',
 'time': datetime.datetime(2021, 5, 1, 12, 4),
 'user_id': '1732816076985345',
 'user_url': 'https://facebook.com/zanupfparty/?__tn__=C-R',
 'username': 'ZANU PF Party',
 'video': 'https://scontent.fakl1-2.fna.fbcdn.net/v/t66.36281-6/10000000_532000111128016_8256657312909646488_n.mp4?_nc_cat=106&ccb=1-3&_nc_sid=985c63&efg=eyJ2ZW5jb2RlX3RhZyI6Im9lcF9zZCJ9&_nc_ohc=5tVOyWAA8bAAX8RdqQW&_nc_ht=scontent.fakl1-2.fna&oh=55f6f20fe37fe9904b4384d01250b8ce&oe=60B471EF',
 'video_id': '473186744004675',
 'video_thumbnail': 'https://scontent.fakl1-3.fna.fbcdn.net/v/t15.13418-10/cp0/e15/q65/p173x172/127253555_774751493230375_8606612908728745694_n.jpg?_nc_cat=103&ccb=1-3&_nc_sid=ccf8b3&efg=eyJpIjoidCJ9&_nc_ohc=4bL5JvJGvKEAX_qyvDX&_nc_ht=scontent.fakl1-3.fna&tp=3&oh=2d4107205eea1aba1b425ae4fff09638&oe=60B2FC90',
 'w3_fb_url': None}

fashan7 commented 3 years ago

I set cookies and tested again, but it still doesn't work for me.

neon-ninja commented 3 years ago

What version are you using? Can you post the code you're using?

fashan7 commented 3 years ago

The latest version.

neon-ninja commented 3 years ago

With the code:


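from facebook_scraper import get_posts

# Fetch the first two pages of posts from the page's timeline.
posts = list(get_posts(
    "zanupfparty",
    page_limit=2,
    cookies="cookies.txt"
))
# Print post ID, time, likes, comments and shares for each post.
for post in posts:
    print(post["post_id"], post["time"], post["likes"], post["comments"], post["shares"])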

I get

2852284198371855 2021-05-03 00:16:18.114829 26 4 9
2852125795054362 2021-05-02 17:16:23.552286 3 1 1
480783186375409 2021-05-01 13:03:00 15 1 6
473186744004675 2021-05-01 12:04:00 27 15 1
384540912842384 2021-05-01 11:29:00 11 0 6

Do you get the same?

fashan7 commented 3 years ago

With the code:

from facebook_scraper import get_posts

posts = list(get_posts(
    post_urls=["https://m.facebook.com/story.php?story_fbid=473186744004675&id=1732816076985345"],
    timeout=60,
    cookies="cookies.txt",
    options={'comments': True, 'reactors': True}
))

for post in posts:
    print(post["post_id"], post["time"], post["likes"], post["comments"], post["shares"])

I get:

473186744004675 2021-05-01 15:34:00 25 0 1

neon-ninja commented 3 years ago

I see. When a post is accessed directly like that, FB doesn't provide the comment count (open https://m.facebook.com/story.php?story_fbid=473186744004675&id=1732816076985345 in a browser to see this). However, since you're extracting comments anyway, you can just take the len of comments_full, e.g.:

from facebook_scraper import get_posts

posts = list(get_posts(
    post_urls=["https://m.facebook.com/story.php?story_fbid=473186744004675&id=1732816076985345"],
    timeout=60,
    cookies="cookies.txt",
    options={'comments': True, 'reactors': True}
))

# Count the extracted comments and reactors instead of relying on post["comments"].
for post in posts:
    print(post["post_id"], post["time"], post["likes"], len(post["comments_full"]), len(post["reactors"]), post["reactions"], post["shares"])

I get:

473186744004675 2021-05-01 22:04:00 25 13 27 {'like': 25, 'love': 2} 1

fashan7 commented 3 years ago

Thanks @neon-ninja

neon-ninja commented 3 years ago

https://github.com/kevinzg/facebook-scraper/commit/6574e9682ffc0e41093ba2a52fc90155676472fd might be useful here
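
The workaround discussed in this thread could be wrapped in a small helper. This is only a minimal sketch: `effective_comment_count` is a hypothetical name and not part of facebook-scraper, and it assumes `comments_full` is either None or a list, as in the outputs shown above. The sample dicts are hand-written stand-ins for `get_posts()` results, so the snippet runs without network access.

```python
from typing import Any, Mapping


def effective_comment_count(post: Mapping[str, Any]) -> int:
    """Hypothetical helper: prefer the reported count, fall back to len(comments_full)."""
    reported = post.get("comments") or 0
    # comments_full is None when comments were not requested (assumed list otherwise).
    extracted = post.get("comments_full") or []
    # A post fetched directly via post_urls may report 0 comments, so use the
    # number of extracted comments whenever it exceeds the reported count.
    return max(reported, len(extracted))


# Hand-written stand-ins for post dicts; the comment entries are made up.
direct_post = {
    "post_id": "473186744004675",
    "comments": 0,
    "comments_full": [{"comment_id": "1"}, {"comment_id": "2"}],
}
page_post = {"post_id": "473186744004675", "comments": 15, "comments_full": None}

print(effective_comment_count(direct_post))  # 2
print(effective_comment_count(page_post))    # 15
```

Taking the larger of the two values keeps the page-level count when FB does provide it, since the extracted list can be shorter than the reported total (13 extracted versus 15 reported in the outputs above).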