isaacmg / fb_scraper

FBLYZE is a Facebook scraping and analysis system.
Apache License 2.0

Replace shelve method #18

Closed isaacmg closed 7 years ago

isaacmg commented 7 years ago

This is a large task, but in the end shelve is just not working the way it should. It is causing the following error:

    scrape(page_id, from_time, useKafka, useES)
      File "/fb_scraper/fb_scrapper.py", line 60, in scrape
        pageStamp = get_tstamp(page_id, tstamp, "save_times")
      File "/fb_scraper/fb_scrapper.py", line 19, in get_tstamp
        with shelve.open(path) as d:
      File "/opt/conda/lib/python3.6/shelve.py", line 243, in open
        return DbfilenameShelf(filename, flag, protocol, writeback)
      File "/opt/conda/lib/python3.6/shelve.py", line 227, in __init__
        Shelf.__init__(self, dbm.open(filename, flag), protocol, writeback)
      File "/opt/conda/lib/python3.6/dbm/__init__.py", line 94, in open
        return mod.open(file, flag, mode)
    AttributeError: module 'dbm.gnu' has no attribute 'open'

Moreover, it makes it impossible to scale containers without rescraping. Rescraping is likely to happen whenever the image is re-pulled and the shelved file is destroyed.
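For illustration only, here is a minimal sketch of one way the shelve-backed timestamp store could be replaced, using the standard-library `sqlite3` module, which does not depend on `dbm`. This is not necessarily how the issue was resolved. The `db_path` parameter, the `set_tstamp` helper, the table layout, and the changed `get_tstamp` signature are all assumptions, not the project's actual code.

```python
import sqlite3
from contextlib import closing


def _connect(db_path):
    """Open the database and make sure the (hypothetical) save_times table exists."""
    conn = sqlite3.connect(db_path)
    conn.execute(
        "CREATE TABLE IF NOT EXISTS save_times ("
        "page_id TEXT PRIMARY KEY, tstamp TEXT NOT NULL)"
    )
    return conn


def get_tstamp(db_path, page_id, default_tstamp):
    """Return the stored timestamp for page_id, or default_tstamp if none exists."""
    with closing(_connect(db_path)) as conn:
        row = conn.execute(
            "SELECT tstamp FROM save_times WHERE page_id = ?", (page_id,)
        ).fetchone()
    return row[0] if row else default_tstamp


def set_tstamp(db_path, page_id, tstamp):
    """Insert or overwrite the timestamp recorded for page_id."""
    with closing(_connect(db_path)) as conn:
        with conn:  # commits on success, rolls back on exception
            conn.execute(
                "INSERT OR REPLACE INTO save_times (page_id, tstamp) VALUES (?, ?)",
                (page_id, tstamp),
            )
```

If `db_path` points at a mounted volume rather than a location inside the image, the file should survive an image re-pull. Sharing state across several scaled containers would probably still need an external store.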

isaacmg commented 7 years ago

Finally resolved!