Data scraper for Facebook Pages, and also code accompanying the blog post How to Scrape Data From Facebook Page Posts for Statistical Analysis
Rearchitect Script for 14.4x Speedup in Reactions Scraping #10
Closed
minimaxir closed 7 years ago
Scraping reactions is relatively slow for large pages (15 minutes for CNN's FB page) and will get worse as time goes by.
For example, when scraping 100 Statuses:
Current Architecture
The query occurs during processing of the post so no extra data manipulation is necessary.
Better Architecture
The Reaction output from each of the 6 vectors must be mapped to the corresponding post.
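One way the mapping could look, as a minimal sketch rather than the script's actual code: assume each of the 6 reaction queries has already been reduced to a dict of `{status_id: count}`, and fold those dicts into one record per status. The function name `merge_reactions`, the `counts_by_type` shape, and the reaction-type labels are all hypothetical, chosen for illustration.

```python
# Assumed labels for the 6 reaction types; the real API field names may differ.
REACTION_TYPES = ("like", "love", "wow", "haha", "sad", "angry")


def merge_reactions(status_ids, counts_by_type):
    """Map per-reaction-type results back onto each status.

    status_ids: iterable of status ids, in output order.
    counts_by_type: {reaction_type: {status_id: count}} (hypothetical shape,
        one inner dict per reaction query).
    Returns {status_id: {reaction_type: count}}, with 0 for missing entries.
    """
    merged = {}
    for status_id in status_ids:
        merged[status_id] = {
            reaction: counts_by_type.get(reaction, {}).get(status_id, 0)
            for reaction in REACTION_TYPES
        }
    return merged


if __name__ == "__main__":
    counts = {"like": {"s1": 30, "s2": 4}, "love": {"s1": 2}}
    for status_id, row in merge_reactions(["s1", "s2"], counts).items():
        print(status_id, row)
```

Keying everything by status id keeps the merge to a single readable comprehension, and a status that is absent from a given reaction query simply gets a count of 0.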
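That is 101 requests (1 for the statuses plus 1 reactions query per status) against 7 (1 for the statuses plus 1 per reaction type): 101/7 ≈ 14.4x fewer HTTP requests, and HTTP is the bottleneck.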
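The arithmetic can be checked with a short sketch. It assumes the request breakdown implied by the 101 and 7 figures (one request for the batch of statuses, then either one reactions request per status or one per reaction type); `request_counts` is a hypothetical helper, not part of the script.

```python
def request_counts(num_statuses, num_reaction_types=6):
    """Return (per_status_requests, batched_requests, ratio).

    Assumes 1 request fetches the batch of statuses in both designs.
    """
    per_status = 1 + num_statuses        # current: 1 reactions query per status
    batched = 1 + num_reaction_types     # better: 1 query per reaction type
    return per_status, batched, per_status / batched


if __name__ == "__main__":
    per_status, batched, ratio = request_counts(100)
    print(f"{per_status} vs {batched} requests: {ratio:.1f}x fewer")
```

The ratio counts requests only; the actual wall-clock speedup also depends on how long each batched reactions query takes.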
The challenge is implementing the mapping in a way that is easy to read. Tracking progress with this issue.