shaikhsajid1111 / facebook_page_scraper

Scrapes facebook's pages front end with no limitations & provides a feature to turn data into structured JSON or CSV
https://pypi.org/project/facebook-page-scraper/
MIT License
239 stars 66 forks

Only likes and loves are scraped properly #48

Open pitzmoni opened 1 year ago

pitzmoni commented 1 year ago

Hi,

Great tool, congrats! I am using the following code:

from facebook_page_scraper import Facebook_scraper
import os
import re  # needed by the bootstrap message handler below
import stem.process

SOCKS_PORT = 9050
TOR_PATH = os.path.normpath(os.getcwd()+"\\Tor\\tor\\tor.exe")

# Launch a local Tor process and print only its bootstrap progress lines
tor_process = stem.process.launch_tor_with_config(
  config = {
    'SocksPort': str(SOCKS_PORT),
  },
  init_msg_handler = lambda line: print(line) if re.search('Bootstrapped', line) else False,
  tor_cmd = TOR_PATH
)

# Scraper settings
page_name = "metaai"
posts_count = 10
browser = "firefox"
proxy = "socks5://127.0.0.1:9050"  # route traffic through the Tor SOCKS port
timeout = 600
headless = False

meta_ai = Facebook_scraper(page_name, posts_count, browser, proxy=proxy, timeout=timeout, headless=headless)
json_data = meta_ai.scrap_to_json()
print(json_data)

The numbers of likes and loves are correct, but shares and other reactions always seem to be zero, and for comments I am getting a different (lower) number than the one shown on the page.

The same happens without the proxy.

I am using the tool from Europe with the language set to English (UK), although I am not sure about the correct way to select the language without using authentication.

image

I would appreciate any advice you may have for me.

shaikhsajid1111 commented 1 year ago

Thanks for raising the issue. The selector was outdated; I pushed some fixes, so it should work fine now.

pitzmoni commented 1 year ago

Thanks a lot for fixing shares and comments, it works fine now :)

Regarding the issue 'only likes and loves are correct', I realized it's actually not likes and loves but the top 3 reactions, which are visible without clicking on "See who reacted to this":

image

while other reactions are missing and only visible after clicking:

image

On the other hand, you can only click that button after logging in. So, as I understand it, it is not possible to get all the reactions without authentication, but I may use the top 3 as an estimate.
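
A minimal sketch of that top-3 estimate, for anyone who wants to compute it from the JSON output. It assumes each post carries a "reactions" object with per-type counts; the field names and the sample data below are hypothetical and not confirmed in this thread. The sum is only a lower bound, since reactions outside the visible top 3 come back as zero.

```python
import json

# Hypothetical sample shaped like the scraper's JSON output. The field names
# ("reactions", "likes", "loves", ...) are assumptions, not confirmed here.
sample_json = json.dumps({
    "post_1": {"reactions": {"likes": 120, "loves": 30, "wow": 0, "cares": 5,
                             "sad": 0, "angry": 0, "haha": 0}},
    "post_2": {"reactions": {"likes": 0, "loves": 0, "wow": 0, "cares": 0,
                             "sad": 0, "angry": 0, "haha": 0}},
})


def top_reactions_estimate(reactions, top_n=3):
    """Return (visible, lower_bound): the top_n non-zero reactions and their sum."""
    nonzero = {name: count for name, count in reactions.items() if count > 0}
    ranked = sorted(nonzero.items(), key=lambda item: item[1], reverse=True)
    visible = dict(ranked[:top_n])
    return visible, sum(visible.values())


posts = json.loads(sample_json)
estimates = {
    post_id: top_reactions_estimate(post["reactions"])
    for post_id, post in posts.items()
}
for post_id, (visible, lower_bound) in estimates.items():
    print(post_id, visible, lower_bound)
```

With real output, `json.loads(json_data)` would take the place of the sample string, provided the structure matches the assumption above.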

ihabpalamino commented 1 year ago

Hello guys, if I want to restrict scraping to a specific date range, how do I do it? For example, if I want to scrape from 2023/1/1 until 2023/4/13.
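
Nothing in this thread shows a date-range option on the scraper itself, so one possible workaround is to scrape first and filter afterwards. The sketch below assumes each post in the JSON output has a "posted_on" field holding an ISO 8601 timestamp; that field name, its format, and the sample data are assumptions, and `filter_posts_by_date` is a hypothetical helper, not part of the library.

```python
from datetime import date, datetime


def filter_posts_by_date(posts, start, end):
    """Keep posts whose 'posted_on' date falls within [start, end], inclusive."""
    kept = {}
    for post_id, post in posts.items():
        try:
            posted = datetime.fromisoformat(post["posted_on"]).date()
        except (KeyError, TypeError, ValueError):
            continue  # skip posts with a missing or unparseable timestamp
        if start <= posted <= end:
            kept[post_id] = post
    return kept


# Hypothetical sample data; real output may be structured differently.
sample_posts = {
    "a": {"posted_on": "2022-12-31T23:59:00"},
    "b": {"posted_on": "2023-01-01T08:00:00"},
    "c": {"posted_on": "2023-04-13T21:30:00"},
    "d": {"posted_on": "2023-04-14T00:00:01"},
    "e": {"posted_on": None},
}

in_range = filter_posts_by_date(sample_posts, date(2023, 1, 1), date(2023, 4, 13))
print(sorted(in_range))
```

Because the filter runs after scraping, posts_count would need to be large enough to reach back to the start date.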