dragosrotaru / ppeforfree

Collective sensemaking for mutual aid groups manufacturing PPE during COVID.
https://ppeforfree.org
GNU General Public License v3.0

Scraping Facebook Groups Posts #7

Open dragosrotaru opened 4 years ago

dragosrotaru commented 4 years ago

Note on FB Scraping, Data Privacy, Future Roadmap

See #5

Pre-requisite: Seed Data

See #6

Requirements

Scraping Facebook Groups Posts

We will use this data to build a news aggregator and to watch for additional data that could help with coalition building.

I started a script in scripts/facebook-group-posts-scraper using this library: https://github.com/kevinzg/facebook-scraper

It works OK, but not with 100% consistency, so you will have to troubleshoot and possibly edit the script.
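Since the scraper does not succeed every time, one option is to wrap each scrape in a small retry helper. The sketch below is only an illustration, not the existing script: `fetch_with_retry`, its parameters, and the linear backoff are hypothetical names and choices, and `fetch` stands for any zero-argument callable that returns an iterable of post dicts.

```python
import time
from typing import Callable, Iterable, List


def fetch_with_retry(
    fetch: Callable[[], Iterable[dict]],
    attempts: int = 3,
    delay_seconds: float = 5.0,
    sleep: Callable[[float], None] = time.sleep,
) -> List[dict]:
    """Call `fetch` until it succeeds or `attempts` tries are used up."""
    last_error = None
    for attempt in range(1, attempts + 1):
        try:
            # Materialize the iterable so lazily raised errors surface here.
            return list(fetch())
        except Exception as error:  # broad on purpose: failure modes are unknown
            last_error = error
            if attempt < attempts:
                # Linear backoff (an assumption): wait longer after each failure.
                sleep(delay_seconds * attempt)
    raise RuntimeError(f"scrape failed after {attempts} attempts") from last_error
```

With facebook-scraper, `fetch` could be something like `lambda: get_posts(group=GROUP_ID, pages=2)`, where `GROUP_ID` is a placeholder for a group from the seed data. Catching every exception is deliberately coarse; once the common failures are known, it would be better to retry only on those.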

How your script will store and normalize the data

The database will be MongoDB.

Schema

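// Assumed aliases: the issue does not define UUID or TimeStamp.
type UUID = string;
type TimeStamp = Date;

type Post = {
  id: UUID,
  createdAt: TimeStamp,
  text: string,
  link: URL,
  likes: number,
  shares: number,
  comments: number,
  groupID: UUID,
  scrapedAt: TimeStamp,
  scrapeID: UUID,
}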
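To make the normalization step concrete, here is a minimal sketch that maps one raw scraped post onto the schema above. It assumes the raw dict uses the keys `time`, `text`, `post_url`, `likes`, `shares`, and `comments` (this should be checked against what the scraper actually returns), and `normalize_post` and `_count` are hypothetical helper names. Missing or malformed counters are stored as 0, which is a design choice rather than a requirement.

```python
import uuid
from datetime import datetime, timezone
from typing import Any, Dict, Optional


def _count(value: Any) -> int:
    """Coerce a possibly missing or malformed counter to a non-negative int."""
    try:
        return max(int(value), 0)
    except (TypeError, ValueError):
        return 0


def normalize_post(
    raw: Dict[str, Any],
    group_id: str,
    scrape_id: str,
    scraped_at: Optional[datetime] = None,
) -> Dict[str, Any]:
    """Build a document shaped like the Post schema from one raw post."""
    return {
        "id": str(uuid.uuid4()),                  # fresh ID for this record
        "createdAt": raw.get("time"),             # assumed key: post timestamp
        "text": (raw.get("text") or "").strip(),
        "link": raw.get("post_url"),              # assumed key: post permalink
        "likes": _count(raw.get("likes")),
        "shares": _count(raw.get("shares")),
        "comments": _count(raw.get("comments")),
        "groupID": group_id,
        "scrapedAt": scraped_at or datetime.now(timezone.utc),
        "scrapeID": scrape_id,
    }
```

Because every document carries its own `scrapedAt` and `scrapeID`, the same post scraped twice yields two records, which keeps a history of how likes, shares, and comments change between scrapes.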

Misc

Another library I found: https://github.com/ParvJain/Facebook-Group-Scraper (please look through it)