justinlittman / fb-ad-archive-scraper

Scraper for Facebook's Archive of Ads with Political Content
MIT License

CSV coming up empty #2

Closed CaliHoya closed 6 years ago

CaliHoya commented 6 years ago

It looks like the problem is on line 191: this line (msg = json.loads(entry['message'])) returns JSON that does not contain the fields the scraper needs to extract the message info.

Here's an example of the JSON I got:

{'message': {'method': 'Network.loadingFinished', 'params': {'blockedCrossSiteDocument': False, 'encodedDataLength': 0, 'requestId': 'DD950D5430F09C4BFEB8424ED5BED7E4', 'timestamp': 818170.436236}}, 'webview': '2D14C6BECC26CE08D2FF9A73AFE80232'}

I think this is the root of why the CSV output is coming up empty.
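For anyone else debugging this, here is a minimal sketch of how to check which kinds of messages the log entries actually contain. It assumes each entry is a dict whose 'message' value is a JSON string, as the json.loads call on line 191 suggests. The helper name messages_with_method and the second sample entry are made up for illustration; they are not part of scraper.py.

```python
import json


def messages_with_method(entries, method):
    """Return the decoded inner messages whose 'method' equals `method`.

    `entries` is assumed to be a list of dicts, each with a 'message' key
    holding a JSON string (the shape implied by json.loads(entry['message'])).
    """
    matches = []
    for entry in entries:
        msg = json.loads(entry['message'])
        inner = msg.get('message', {})
        if inner.get('method') == method:
            matches.append(inner)
    return matches


# The first entry mirrors the JSON from this report; the second is a
# hypothetical entry added only to show the filter skipping other methods.
sample_entries = [
    {'message': json.dumps({
        'message': {
            'method': 'Network.loadingFinished',
            'params': {
                'blockedCrossSiteDocument': False,
                'encodedDataLength': 0,
                'requestId': 'DD950D5430F09C4BFEB8424ED5BED7E4',
                'timestamp': 818170.436236,
            },
        },
        'webview': '2D14C6BECC26CE08D2FF9A73AFE80232',
    })},
    {'message': json.dumps({
        'message': {
            'method': 'Network.responseReceived',
            'params': {'requestId': 'hypothetical-request-id'},
        },
        'webview': '2D14C6BECC26CE08D2FF9A73AFE80232',
    })},
]

finished = messages_with_method(sample_entries, 'Network.loadingFinished')
print(len(finished), finished[0]['params']['requestId'])
```

If the method the scraper expects never shows up in the real log, that would be consistent with the empty CSV.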

justinlittman commented 6 years ago

FB completely changed the way the async calls are made. Working on a fix.

gabefried commented 6 years ago

Thanks so much for making this scraper! Are there any updates on this? I will probably have to make my own scraper or pay someone to make one in the next couple days if you think this will take a while to fix.

Thanks!

justinlittman commented 6 years ago

Just made fixes. Let me know if you encounter any further problems.

gabefried commented 6 years ago

Unfortunately I'm getting a new error. Specifically, the code fails before it even creates a CSV. The error appears to come from somewhere in Selenium, although it's hard to tell exactly where. Any idea what's happening? Thanks!

"selenium.common.exceptions.NoSuchElementException: Message: no such element: Unable to locate element: {"method":"link text","selector":"See Ad Performance"} (Session info: headless chrome=68.0.3440.106) (Driver info: chromedriver=2.41.578706 (5f725d1b4f0a4acbf5259df887244095596231db),platform=Mac OS X 10.13.6 x86_64)"

The full stack trace is here:

  File "scraper.py", line 321, in <module>
    headless=not args.headed, wait=args.wait)
  File "scraper.py", line 195, in main
    process_ad_divs(new_ad_divs, len(processed_ad_divs), driver, dirname, ad_limit, wait=wait))
  File "scraper.py", line 76, in process_ad_divs
    ad_div.find_element_by_link_text('See Ad Performance').click()
  File "/Users/gbankmanfried/miniconda3/envs/civis/lib/python3.5/site-packages/selenium/webdriver/remote/webelement.py", line 241, in find_element_by_link_text
    return self.find_element(by=By.LINK_TEXT, value=link_text)
  File "/Users/gbankmanfried/miniconda3/envs/civis/lib/python3.5/site-packages/selenium/webdriver/remote/webelement.py", line 653, in find_element
    {"using": by, "value": value})['value']
  File "/Users/gbankmanfried/miniconda3/envs/civis/lib/python3.5/site-packages/selenium/webdriver/remote/webelement.py", line 628, in _execute
    return self._parent.execute(command, params)
  File "/Users/gbankmanfried/miniconda3/envs/civis/lib/python3.5/site-packages/selenium/webdriver/remote/webdriver.py", line 314, in execute
    self.error_handler.check_response(response)
  File "/Users/gbankmanfried/miniconda3/envs/civis/lib/python3.5/site-packages/selenium/webdriver/remote/errorhandler.py", line 242, in check_response
    raise exception_class(message, screen, stacktrace)
selenium.common.exceptions.NoSuchElementException: Message: no such element: Unable to locate element: {"method":"link text","selector":"See Ad Performance"}
  (Session info: headless chrome=68.0.3440.106)
  (Driver info: chromedriver=2.41.578706 (5f725d1b4f0a4acbf5259df887244095596231db),platform=Mac OS X 10.13.6 x86_64)
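As a possible stopgap while the cause is tracked down, the lookup on line 76 could be wrapped so that an ad without a "See Ad Performance" link is skipped instead of crashing the run. This is only a sketch, not the fix in scraper.py: the helper name click_link_if_present is hypothetical, and the exception class is passed in as a parameter so the snippet runs without Selenium installed.

```python
def click_link_if_present(element, link_text, missing_exc):
    """Click the link with the given text inside `element`, if it exists.

    `element` is assumed to expose find_element_by_link_text(), the call
    that appears in the traceback. `missing_exc` is the exception type
    raised when no such link is found.

    Returns True if the link was found and clicked, False otherwise.
    """
    try:
        link = element.find_element_by_link_text(link_text)
    except missing_exc:
        # No such link inside this element: report it and let the caller
        # decide whether to skip this ad or stop.
        return False
    link.click()
    return True


# Hypothetical usage inside process_ad_divs (requires Selenium):
#
#   from selenium.common.exceptions import NoSuchElementException
#
#   if not click_link_if_present(ad_div, 'See Ad Performance',
#                                NoSuchElementException):
#       continue  # skip ads that have no performance link
```

Skipping only hides the symptom. If no ad has the link, the page markup has probably changed and the selector itself needs updating.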