kevinzg / facebook-scraper

Scrape Facebook public pages without an API key
MIT License

Commenter's time is None #265

Closed: fashan7 closed this issue 3 years ago

fashan7 commented 3 years ago

Hi @neon-ninja, I passed cookies and got the output below, in which every comment's time (comment_time) is None:

`{
   "original_request_url":"https://facebook.com/875365599247603/posts/3855875407863259",
   "post_url":"https://facebook.com/story.php?story_fbid=3855875407863259&id=875365599247603",
   "post_id":"3855875407863259",
   "text":"من #رسائكم\nالسلام عليكم اخي هل هناك مخبر يقوم بفحص Pcr يوم الخميس الأمر مستعجل جدا و شكرا لانني مسافرة يوم السبت ولازم يكون جديد اقل من 72 ولهذا الفحص ضروري في المطار\nاو عاونوني كيفاه ندير خاوتي\n#khaled",
   "post_text":"من #رسائكم\nالسلام عليكم اخي هل هناك مخبر يقوم بفحص Pcr يوم الخميس الأمر مستعجل جدا و شكرا لانني مسافرة يوم السبت ولازم يكون جديد اقل من 72 ولهذا الفحص ضروري في المطار\nاو عاونوني كيفاه ندير خاوتي\n#khaled",
   "shared_text":"",
   "time":datetime.datetime(2021, 5, 11, 6, 12),
   "image":"None",
   "image_lowquality":"None",
   "images":[

   ],
   "images_description":[

   ],
   "images_lowquality":[

   ],
   "images_lowquality_description":[

   ],
   "video":"None",
   "video_duration_seconds":"None",
   "video_height":"None",
   "video_id":"None",
   "video_quality":"None",
   "video_size_MB":"None",
   "video_thumbnail":"None",
   "video_watches":"None",
   "video_width":"None",
   "likes":32,
   "comments":12,
   "shares":0,
   "link":"None",
   "user_id":"875365599247603",
   "username":"Voix de biskra صوت بسكرة",
   "user_url":"https://facebook.com/Voix.de.biskra/?__tn__=C-R",
   "is_live":false,
   "factcheck":"None",
   "shared_post_id":"None",
   "shared_time":"None",
   "shared_user_id":"None",
   "shared_username":"None",
   "shared_post_url":"None",
   "available":true,
   "comments_full":[
      {
         "comment_id":"3856604677790332",
         "comment_url":"https://facebook.com/3856604677790332",
         "commenter_url":"https://facebook.com/nejma.nuorg?fref=nf&rc=p&refid=52&__tn__=R-R",
         "commenter_name":"Jõūdā Līnē",
         "commenter_meta":"None",
         "comment_text":"مخبر رحومة لي حذا الجامعة",
         "comment_time":"None"
      },
      {
         "comment_id":"3856787874438679",
         "comment_url":"https://facebook.com/3856787874438679",
         "commenter_url":"https://facebook.com/Dr.mina.la?fref=nf&rc=p&refid=52&__tn__=R-R",
         "commenter_name":"Mina La",
         "commenter_meta":"None",
         "comment_text":"مخبر ومان تخرج في يومها لكن المشكلة ممكن ما يخدموش الخميس العيد",
         "comment_time":"None"
      },
      {
         "comment_id":"3855928394524627",
         "comment_url":"https://facebook.com/3855928394524627",
         "commenter_url":"https://facebook.com/hadjer.jojo.169?fref=nf&rc=p&refid=52&__tn__=R-R",
         "commenter_name":"Hadjer La Fleure",
         "commenter_meta":"None",
         "comment_text":"مخبر رحومة في العالية",
         "comment_time":"None"
      },
      {
         "comment_id":"3855890487861751",
         "comment_url":"https://facebook.com/3855890487861751",
         "commenter_url":"https://facebook.com/fleur.desable.9615566?fref=nf&rc=p&refid=52&__tn__=R-R",
         "commenter_name":"Mohamed Mohamed",
         "commenter_meta":"None",
         "comment_text":"مخبر ومان روح صباح بكري قبل 10 وتهزو عشية",
         "comment_time":"None"
      },
      {
         "comment_id":"3858321430951990",
         "comment_url":"https://facebook.com/3858321430951990",
         "commenter_url":"https://facebook.com/ramzi.hecini?fref=nf&rc=p&refid=52&__tn__=R-R",
         "commenter_name":"Ramzi Hecini",
         "commenter_meta":"None",
         "comment_text":"رحت صباح على 8 هزيتو على 11.00 مخبر ومان",
         "comment_time":"None"
      },
      {
         "comment_id":"3855966327854167",
         "comment_url":"https://facebook.com/3855966327854167",
         "commenter_url":"https://facebook.com/ounisoumia?fref=nf&rc=p&refid=52&__tn__=R-R",
         "commenter_name":"Sou Mia",
         "commenter_meta":"None",
         "comment_text":".",
         "comment_time":"None"
      },
      {
         "comment_id":"3855942797856520",
         "comment_url":"https://facebook.com/3855942797856520",
         "commenter_url":"https://facebook.com/anouar.brinis?fref=nf&rc=p&refid=52&__tn__=R-R",
         "commenter_name":"Ānøü Rįtā",
         "commenter_meta":"None",
         "comment_text":".",
         "comment_time":"None"
      },
      {
         "comment_id":"3855942777856522",
         "comment_url":"https://facebook.com/3855942777856522",
         "commenter_url":"https://facebook.com/anouar.brinis?fref=nf&rc=p&refid=52&__tn__=R-R",
         "commenter_name":"Ānøü Rįtā",
         "commenter_meta":"None",
         "comment_text":".",
         "comment_time":"None"
      },
      {
         "comment_id":"3855942804523186",
         "comment_url":"https://facebook.com/3855942804523186",
         "commenter_url":"https://facebook.com/anouar.brinis?fref=nf&rc=p&refid=52&__tn__=R-R",
         "commenter_name":"Ānøü Rįtā",
         "commenter_meta":"None",
         "comment_text":".",
         "comment_time":"None"
      },
      {
         "comment_id":"3855942784523188",
         "comment_url":"https://facebook.com/3855942784523188",
         "commenter_url":"https://facebook.com/anouar.brinis?fref=nf&rc=p&refid=52&__tn__=R-R",
         "commenter_name":"Ānøü Rįtā",
         "commenter_meta":"None",
         "comment_text":".",
         "comment_time":"None"
      },
      {
         "comment_id":"3855942867856513",
         "comment_url":"https://facebook.com/3855942867856513",
         "commenter_url":"https://facebook.com/anouar.brinis?fref=nf&rc=p&refid=52&__tn__=R-R",
         "commenter_name":"Ānøü Rįtā",
         "commenter_meta":"None",
         "comment_text":".",
         "comment_time":"None"
      },
      {
         "comment_id":"3855942871189846",
         "comment_url":"https://facebook.com/3855942871189846",
         "commenter_url":"https://facebook.com/anouar.brinis?fref=nf&rc=p&refid=52&__tn__=R-R",
         "commenter_name":"Ānøü Rįtā",
         "commenter_meta":"None",
         "comment_text":".",
         "comment_time":"None"
      }
   ],
   "reactors":[
      {
         "name":"Voix de biskra صوت بسكرة",
         "link":"https://facebook.com/Voix.de.biskra/?fref=pb",
         "type":"None"
      },
      {
         "name":"Rä Chä",
         "link":"https://facebook.com/profile.php?id=100066010340973&fref=pb",
         "type":"Like"
      },
      {
         "name":"Sįm Ŕaň",
         "link":"https://facebook.com/profile.php?id=100061839220762&fref=pb",
         "type":"Like"
      },
      {
         "name":"Papillon D'or",
         "link":"https://facebook.com/profile.php?id=100057774697305&fref=pb",
         "type":"Like"
      },
      {
         "name":"Han Chi",
         "link":"https://facebook.com/han.chi.10236?fref=pb",
         "type":"Like"
      },
      {
         "name":"الاء الرحمان البتول",
         "link":"https://facebook.com/profile.php?id=100055339550070&fref=pb",
         "type":"Like"
      },
      {
         "name":"Taha Aidoudi",
         "link":"https://facebook.com/taha.aidoudi.982?fref=pb",
         "type":"Like"
      },
      {
         "name":"Ôkba Ķínğ",
         "link":"https://facebook.com/okba.king.714?fref=pb",
         "type":"Like"
      },
      {
         "name":"يونس لبصايرة",
         "link":"https://facebook.com/sahra.ziban?fref=pb",
         "type":"Like"
      },
      {
         "name":"Chabane Chenni",
         "link":"https://facebook.com/chabane.chenni.96?fref=pb",
         "type":"Like"
      },
      {
         "name":"Ānøü Rįtā",
         "link":"https://facebook.com/anouar.brinis?fref=pb",
         "type":"Like"
      },
      {
         "name":"Kassimo Segga",
         "link":"https://facebook.com/marchal.dititch.56?fref=pb",
         "type":"Like"
      },
      {
         "name":"Àdîlø Mìgnõn",
         "link":"https://facebook.com/adil.ricous?fref=pb",
         "type":"Like"
      },
      {
         "name":"Adjal Adjal",
         "link":"https://facebook.com/youness.zendjabil?fref=pb",
         "type":"Like"
      },
      {
         "name":"Da Vinci",
         "link":"https://facebook.com/fax.Davinci.5?fref=pb",
         "type":"Like"
      },
      {
         "name":"ام مجدي بهاء",
         "link":"https://facebook.com/profile.php?id=100025446709161&fref=pb",
         "type":"Like"
      },
      {
         "name":"Mizzo Amine",
         "link":"https://facebook.com/Mozza.mizzo?fref=pb",
         "type":"Like"
      },
      {
         "name":"Rabiaa Zaa",
         "link":"https://facebook.com/profile.php?id=100020976791617&fref=pb",
         "type":"Like"
      },
      {
         "name":"Adouane Charefeddine",
         "link":"https://facebook.com/profile.php?id=100020005092329&fref=pb",
         "type":"Like"
      },
      {
         "name":"ﺍﺣﻼﻡ ﻣﺤﺒﻮﺑﺔ",
         "link":"https://facebook.com/profile.php?id=100019168722288&fref=pb",
         "type":"Like"
      },
      {
         "name":"Fares Bouslit",
         "link":"https://facebook.com/fares.bousslit?fref=pb",
         "type":"Like"
      },
      {
         "name":"Sweeter Smile",
         "link":"https://facebook.com/hana.hanou.9828?fref=pb",
         "type":"Like"
      },
      {
         "name":"صعب المنال",
         "link":"https://facebook.com/profile.php?id=100010346714885&fref=pb",
         "type":"Like"
      },
      {
         "name":"Khaled Bousseria",
         "link":"https://facebook.com/profile.php?id=100010283580174&fref=pb",
         "type":"Like"
      },
      {
         "name":"Âm Īñé",
         "link":"https://facebook.com/profile.php?id=100010247440125&fref=pb",
         "type":"Like"
      },
      {
         "name":"Houssem Kara",
         "link":"https://facebook.com/houssem.lmiringi.5?fref=pb",
         "type":"Like"
      },
      {
         "name":"Mū Řa NĞ",
         "link":"https://facebook.com/profile.php?id=100008575771645&fref=pb",
         "type":"Like"
      },
      {
         "name":"Hayat Rafife",
         "link":"https://facebook.com/kisathayat?fref=pb",
         "type":"Like"
      },
      {
         "name":"Princess Amoula",
         "link":"https://facebook.com/profile.php?id=100006327885105&fref=pb",
         "type":"Like"
      },
      {
         "name":"Ziga El Khloui",
         "link":"https://facebook.com/profile.php?id=100005359542031&fref=pb",
         "type":"Like"
      },
      {
         "name":"Abdou Boukhalfi",
         "link":"https://facebook.com/abdou.boukhalfi.5?fref=pb",
         "type":"Like"
      },
      {
         "name":"ابو منذر",
         "link":"https://facebook.com/fathi.barsa?fref=pb",
         "type":"Like"
      }
   ],
   "w3_fb_url":"https://www.facebook.com/story.php?story_fbid=3855875407863259&id=875365599247603",
   "reactions":{
      "like":32
   },
   "fetched_time":datetime.datetime(2021, 5, 17, 4, 49, 54, 58057)
}`
fashan7 commented 3 years ago

@neon-ninja, can we scrape a post that has more than 10k comments?

fashan7 commented 3 years ago

@neon-ninja, I passed cookies, but comments_full is None for this post:

{
   "original_request_url":"https://m.facebook.com/story.php?story_fbid=2565983860377395&id=1932987740343680&_rdr",
   "post_url":"https://facebook.com/story.php?story_fbid=2565983860377395&id=1932987740343680",
   "post_id":"2565983860377395",
   "text":"#AfriqueduSud : Des signaux positifs commencent à apparaître sur certains secteurs comme celui de l’#immobilier. Il affiche de solides performances sur la Bourse de Johannesburg.\nEclairage https://bit.ly/3hmiXMG\n\nECOMNEWSAFRIQUE.COM\nAfrique du Sud : Le secteur immobilier sud-africain retrouve des couleurs aux yeux des investisseurs - Ecomnews Afrique",
   "post_text":"#AfriqueduSud : Des signaux positifs commencent à apparaître sur certains secteurs comme celui de l’#immobilier. Il affiche de solides performances sur la Bourse de Johannesburg.\nEclairage https://bit.ly/3hmiXMG",
   "shared_text":"ECOMNEWSAFRIQUE.COM\nAfrique du Sud : Le secteur immobilier sud-africain retrouve des couleurs aux yeux des investisseurs - Ecomnews Afrique",
   "time":datetime.datetime(2021, 5, 16, 14, 0),
   "image":"https://ecomnewsafrique.com/app/uploads/sites/4/2021/05/immobilier.png",
   "image_lowquality":"https://external-lax3-2.xx.fbcdn.net/safe_image.php?d=AQHjc-CTQRcKK_-e&w=476&h=249&url=https%3A%2F%2Fecomnewsafrique.com%2Fapp%2Fuploads%2Fsites%2F4%2F2021%2F05%2Fimmobilier.png&cfs=1&jq=75&ext=jpg&ccb=3-5&_nc_hash=AQHAR9DX5PvoBi6e",
   "images":[
      "https://ecomnewsafrique.com/app/uploads/sites/4/2021/05/immobilier.png"
   ],
   "images_description":[

   ],
   "images_lowquality":[
      "https://external-lax3-2.xx.fbcdn.net/safe_image.php?d=AQHjc-CTQRcKK_-e&w=476&h=249&url=https%3A%2F%2Fecomnewsafrique.com%2Fapp%2Fuploads%2Fsites%2F4%2F2021%2F05%2Fimmobilier.png&cfs=1&jq=75&ext=jpg&ccb=3-5&_nc_hash=AQHAR9DX5PvoBi6e"
   ],
   "images_lowquality_description":[
      "None"
   ],
   "video":"None",
   "video_duration_seconds":"None",
   "video_height":"None",
   "video_id":"None",
   "video_quality":"None",
   "video_size_MB":"None",
   "video_thumbnail":"None",
   "video_watches":"None",
   "video_width":"None",
   "likes":909,
   "comments":0,
   "shares":2,
   "link":"https://bit.ly/3hmiXMG?fbclid=IwAR0k6JYf5lmdT4YRG-I2hrLhjXwd3hP9pSD-I32W-adzn8QVqNAbYjIXNEk",
   "user_id":"1932987740343680",
   "username":"Ecomnews Afrique",
   "user_url":"https://facebook.com/EcomnewsAfrique/?refid=52&__tn__=C-R",
   "is_live":false,
   "factcheck":"None",
   "shared_post_id":"None",
   "shared_time":"None",
   "shared_user_id":"None",
   "shared_username":"None",
   "shared_post_url":"None",
   "available":true,
   "comments_full":"None",
   "reactors":[
<..SNIP...>
   ],
   "w3_fb_url":"https://www.facebook.com/story.php?story_fbid=2565983860377395&id=1932987740343680",
   "reactions":{
      "like":909,
      "love":4,
      "care":2
   },
   "fetched_time":datetime.datetime(2021, 5, 17, 12, 52, 2, 79405)
}

Edit: removed long reactors output
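
For anyone checking their own output for the two symptoms above (comment_time missing on every comment, or comments_full missing entirely), here is a minimal sketch. The helper name `missing_comment_times` is hypothetical and not part of facebook-scraper; it also assumes missing values may appear either as a real None or as the string "None", as in the pasted output.

```python
def missing_comment_times(post):
    """Hypothetical helper: list the IDs of comments that have no usable time.

    Returns None when the post has no extracted comments at all
    (comments_full is None or the string "None").
    """
    comments = post.get("comments_full")
    if comments in (None, "None"):
        return None
    return [
        comment["comment_id"]
        for comment in comments
        # Treat both a real None and the stringified "None" as missing
        if comment.get("comment_time") in (None, "None")
    ]


if __name__ == "__main__":
    sample = {
        "comments_full": [
            {"comment_id": "3856604677790332", "comment_time": "None"},
            {"comment_id": "3856787874438679", "comment_time": "None"},
        ]
    }
    print(missing_comment_times(sample))
```

A None return separates "no comments were extracted" from "comments were extracted but lack times", which are the two different cases reported here.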

neon-ninja commented 3 years ago

> @neon-ninja can we scrape if we have more than 10k comments

Yes, it just might take a long time. I've added support for tqdm progress bars, which might be useful; see the README.

neon-ninja commented 3 years ago

Try this commit - https://github.com/kevinzg/facebook-scraper/commit/009b6a73fd4bbae7ca5e874f0e9fd2cc5544ae2f - it should add support for strings like "last Tue"
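
For illustration only, here is a minimal sketch of what resolving a string like "last Tue" could involve. The helper `parse_last_weekday` and its rules (most recent matching weekday strictly before the reference time, truncated to midnight) are assumptions made for this example, not the code in the commit above.

```python
from datetime import datetime, timedelta

WEEKDAYS = ["mon", "tue", "wed", "thu", "fri", "sat", "sun"]


def parse_last_weekday(text, now):
    """Hypothetical helper: resolve 'last <weekday>' relative to `now`.

    Returns midnight of the most recent matching weekday strictly
    before `now`, or None if the text is not of that form.
    """
    parts = text.strip().lower().split()
    if len(parts) != 2 or parts[0] != "last":
        return None
    key = parts[1][:3]  # accept "Tue", "Tues", "Tuesday"
    if key not in WEEKDAYS:
        return None
    target = WEEKDAYS.index(key)
    # Days to step back; 0 means "same weekday", which maps to a full week
    days_back = (now.weekday() - target) % 7 or 7
    day = now - timedelta(days=days_back)
    return day.replace(hour=0, minute=0, second=0, microsecond=0)


if __name__ == "__main__":
    # 2021-05-17 was a Monday, so "last Tue" resolves to 2021-05-11
    print(parse_last_weekday("last Tue", datetime(2021, 5, 17, 4, 49)))
```

Passing the reference time in explicitly keeps the helper deterministic, so the same label always resolves to the same date for a given fetch time.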

fashan7 commented 3 years ago

@neon-ninja, it's not returning the correct time, right?

https://www.facebook.com/YuliaTymoshenko/posts/4133922966645775?comment_id=4141437682560970

datetime.datetime(2021,5,14,0,0)

Is this because FB shows the comment's age as "3d"?

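neon-ninja commented 3 years ago

import pprint

from facebook_scraper import get_posts

# Request the item by ID with comment extraction enabled
posts = list(get_posts(
    post_urls=[4141437682560970],
    options={"comments": True},
    timeout=60,
    cookies="cookies.txt",
))

# Print only the comment whose time was questioned
for comment in posts[0]["comments_full"]:
    if comment["comment_id"] == "4141437682560970":
        pprint.pprint(comment)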

gives

{'comment_id': '4141437682560970',
 'comment_text': 'Дякуємо!',
 'comment_time': datetime.datetime(2021, 5, 14, 0, 0),
 'comment_url': 'https://facebook.com/4141437682560970',
 'commenter_meta': None,
 'commenter_name': 'Ніна Гончар',
 'commenter_url': 'https://facebook.com/profile.php?id=100026655403956&fref=nf&rc=p&refid=52&__tn__=R',
 'replies': [{'comment_id': '4151703778201027',
              'comment_text': 'wow',
              'comment_time': datetime.datetime(2021, 5, 18, 13, 22, 48, 348928),
              'comment_url': 'https://facebook.com/4151703778201027',
              'commenter_meta': None,
              'commenter_name': 'Muhammadh Fashaan',
              'commenter_url': 'https://facebook.com/fashanzak?fref=nf&rc=p&__tn__=R'}]}

It looks correct to me.
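
A possible explanation for the midnight value, offered as an assumption rather than a confirmed description of the library: a label like "3d" carries only day precision, so whatever parses it has to pick a time of day. The sketch below uses a hypothetical helper, `parse_relative_age`, that truncates day- and week-level ages to midnight.

```python
import re
from datetime import datetime, timedelta

# Map the short unit letters to timedelta keyword names
UNITS = {"m": "minutes", "h": "hours", "d": "days", "w": "weeks"}


def parse_relative_age(text, now):
    """Hypothetical helper: turn '3d', '2h', '5m', '1w' into a datetime.

    Day- and week-level ages are truncated to midnight because the
    label itself says nothing about the time of day.
    """
    match = re.fullmatch(r"(\d+)\s*([mhdw])", text.strip().lower())
    if match is None:
        return None
    amount, unit = int(match.group(1)), match.group(2)
    when = now - timedelta(**{UNITS[unit]: amount})
    if unit in ("d", "w"):
        when = when.replace(hour=0, minute=0, second=0, microsecond=0)
    return when


if __name__ == "__main__":
    now = datetime(2021, 5, 17, 9, 30)
    print(parse_relative_age("3d", now))  # 2021-05-14 00:00:00
    print(parse_relative_age("2h", now))  # 2021-05-17 07:30:00
```

Under this assumption, a 00:00 time on an older comment reflects the precision of the label Facebook displays rather than a parsing failure.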

neon-ninja commented 3 years ago

Comment extraction for post 2565983860377395 worked fine for me too.