brutalsavage / facebook-post-scraper

Facebook Post Scraper 🕵️🖱️
GNU General Public License v3.0
324 stars 116 forks source link

Added additional data collection capabilities and fixed bugs in scraper.py #27

Open diehl opened 4 years ago

diehl commented 4 years ago

Additional data elements that are now collected per post:

Fixed a bug in the collection of comment threads. In the previous implementation, comment text was saved in dictionaries that were indexed by the comment author. This would result in dropped content when the same FB user would post multiple times in the comment thread.

The code has been refactored a bit as well to allow the contents of the web scraping to be read from disk and parsed. The contents of the web scraping is saved to disk prior to parsing in case there's an error downstream. This allows for subsequent debugging.