fashan7 closed this issue 3 years ago
@neon-ninja can we scrape a post that has more than 10k comments?
@neon-ninja I passed cookies, but the comments come back as None:
{
"original_request_url":"https://m.facebook.com/story.php?story_fbid=2565983860377395&id=1932987740343680&_rdr",
"post_url":"https://facebook.com/story.php?story_fbid=2565983860377395&id=1932987740343680",
"post_id":"2565983860377395",
"text":"#AfriqueduSud : Des signaux positifs commencent à apparaître sur certains secteurs comme celui de l’#immobilier. Il affiche de solides performances sur la Bourse de Johannesburg.\nEclairage https://bit.ly/3hmiXMG\n\nECOMNEWSAFRIQUE.COM\nAfrique du Sud : Le secteur immobilier sud-africain retrouve des couleurs aux yeux des investisseurs - Ecomnews Afrique",
"post_text":"#AfriqueduSud : Des signaux positifs commencent à apparaître sur certains secteurs comme celui de l’#immobilier. Il affiche de solides performances sur la Bourse de Johannesburg.\nEclairage https://bit.ly/3hmiXMG",
"shared_text":"ECOMNEWSAFRIQUE.COM\nAfrique du Sud : Le secteur immobilier sud-africain retrouve des couleurs aux yeux des investisseurs - Ecomnews Afrique",
"time":datetime.datetime(2021, 5, 16, 14, 0),
"image":"https://ecomnewsafrique.com/app/uploads/sites/4/2021/05/immobilier.png",
"image_lowquality":"https://external-lax3-2.xx.fbcdn.net/safe_image.php?d=AQHjc-CTQRcKK_-e&w=476&h=249&url=https%3A%2F%2Fecomnewsafrique.com%2Fapp%2Fuploads%2Fsites%2F4%2F2021%2F05%2Fimmobilier.png&cfs=1&jq=75&ext=jpg&ccb=3-5&_nc_hash=AQHAR9DX5PvoBi6e",
"images":[
"https://ecomnewsafrique.com/app/uploads/sites/4/2021/05/immobilier.png"
],
"images_description":[
],
"images_lowquality":[
"https://external-lax3-2.xx.fbcdn.net/safe_image.php?d=AQHjc-CTQRcKK_-e&w=476&h=249&url=https%3A%2F%2Fecomnewsafrique.com%2Fapp%2Fuploads%2Fsites%2F4%2F2021%2F05%2Fimmobilier.png&cfs=1&jq=75&ext=jpg&ccb=3-5&_nc_hash=AQHAR9DX5PvoBi6e"
],
"images_lowquality_description":[
"None"
],
"video":"None",
"video_duration_seconds":"None",
"video_height":"None",
"video_id":"None",
"video_quality":"None",
"video_size_MB":"None",
"video_thumbnail":"None",
"video_watches":"None",
"video_width":"None",
"likes":909,
"comments":0,
"shares":2,
"link":"https://bit.ly/3hmiXMG?fbclid=IwAR0k6JYf5lmdT4YRG-I2hrLhjXwd3hP9pSD-I32W-adzn8QVqNAbYjIXNEk",
"user_id":"1932987740343680",
"username":"Ecomnews Afrique",
"user_url":"https://facebook.com/EcomnewsAfrique/?refid=52&__tn__=C-R",
"is_live":false,
"factcheck":"None",
"shared_post_id":"None",
"shared_time":"None",
"shared_user_id":"None",
"shared_username":"None",
"shared_post_url":"None",
"available":true,
"comments_full":"None",
"reactors":[
<..SNIP...>
],
"w3_fb_url":"https://www.facebook.com/story.php?story_fbid=2565983860377395&id=1932987740343680",
"reactions":{
"like":909,
"love":4,
"care":2
},
"fetched_time":datetime.datetime(2021, 5, 17, 12, 52, 2, 79405)
}
Edit: removed long reactors output
@neon-ninja can we scrape a post that has more than 10k comments?
Yes, it just might take a long time. I've added support for tqdm progress bars, which might be useful; see the README.
Try this commit - https://github.com/kevinzg/facebook-scraper/commit/009b6a73fd4bbae7ca5e874f0e9fd2cc5544ae2f - it should add support for strings like "last Tue"
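For anyone wondering what handling a string like "last Tue" involves, here is a minimal sketch using only the standard library. It is not the code from the commit above; `parse_last_weekday` and `WEEKDAYS` are hypothetical names, and resolving to midnight of the most recent matching weekday before the fetch time is an assumption about the intended behaviour.

```python
from datetime import datetime, timedelta

# Three-letter weekday prefixes in datetime.weekday() order (Monday == 0).
WEEKDAYS = ["mon", "tue", "wed", "thu", "fri", "sat", "sun"]


def parse_last_weekday(text, now):
    """Hypothetical helper: resolve 'last Tue' style strings.

    Returns midnight of the most recent such weekday strictly before
    `now`, or None if the string does not have the form 'last <weekday>'.
    """
    parts = text.lower().split()
    if len(parts) != 2 or parts[0] != "last":
        return None
    prefix = parts[1][:3]
    if prefix not in WEEKDAYS:
        return None
    target = WEEKDAYS.index(prefix)
    # Days to step back; if today is the target weekday, go back a full week.
    days_back = (now.weekday() - target) % 7 or 7
    day = now - timedelta(days=days_back)
    return day.replace(hour=0, minute=0, second=0, microsecond=0)


# 2021-05-17 was a Monday, so "last Tue" resolves to 2021-05-11.
print(parse_last_weekday("last Tue", datetime(2021, 5, 17, 12, 52)))
# -> 2021-05-11 00:00:00
```

Facebook only gives the weekday for such labels, so the time of day cannot be recovered; the sketch sets it to 00:00.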
@neon-ninja it's not returning the correct time, right?
https://www.facebook.com/YuliaTymoshenko/posts/4133922966645775?comment_id=4141437682560970
datetime.datetime(2021,5,14,0,0)
Is this because FB only shows "3d" for it?
import pprint

from facebook_scraper import get_posts

posts = list(get_posts(
    post_urls=[4141437682560970],
    options={"comments": True},
    timeout=60,
    cookies="cookies.txt",
))
# Print only the comment in question
for comment in posts[0]["comments_full"]:
    if comment["comment_id"] == "4141437682560970":
        pprint.pprint(comment)
gives
{'comment_id': '4141437682560970',
 'comment_text': 'Дякуємо!',
 'comment_time': datetime.datetime(2021, 5, 14, 0, 0),
 'comment_url': 'https://facebook.com/4141437682560970',
 'commenter_meta': None,
 'commenter_name': 'Ніна Гончар',
 'commenter_url': 'https://facebook.com/profile.php?id=100026655403956&fref=nf&rc=p&refid=52&__tn__=R',
 'replies': [{'comment_id': '4151703778201027',
              'comment_text': 'wow',
              'comment_time': datetime.datetime(2021, 5, 18, 13, 22, 48, 348928),
              'comment_url': 'https://facebook.com/4151703778201027',
              'commenter_meta': None,
              'commenter_name': 'Muhammadh Fashaan',
              'commenter_url': 'https://facebook.com/fashanzak?fref=nf&rc=p&__tn__=R'}]}
It looks correct to me
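To illustrate why a label like "3d" ends up as a midnight timestamp, here is a rough sketch. It is an assumption inferred from the output above, not the scraper's actual code: `resolve_days_ago` is a hypothetical helper, and the fetch date of 2021-05-17 is borrowed from the `fetched_time` in the earlier dump.

```python
import re
from datetime import datetime, timedelta


def resolve_days_ago(label, fetched):
    """Hypothetical helper: turn a label such as '3d' into a datetime.

    The label carries day-level precision only, so the result is set
    to midnight of the day that many days before `fetched`.
    Returns None for labels that are not of the form '<number>d'.
    """
    match = re.fullmatch(r"(\d+)d", label.strip())
    if match is None:
        return None
    day = fetched - timedelta(days=int(match.group(1)))
    return day.replace(hour=0, minute=0, second=0, microsecond=0)


# Assumed fetch time: 2021-05-17 12:52:02
print(resolve_days_ago("3d", datetime(2021, 5, 17, 12, 52, 2)))
# -> 2021-05-14 00:00:00
```

Under those assumptions, 00:00 on 2021-05-14 is the most precise value the "3d" label can give, while a comment posted moments ago (like the "wow" reply) gets a full timestamp.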
Comment extraction for 2565983860377395 worked fine for me too
Hi @neon-ninja, I have passed cookies and got JSON in which the commenter's time is None