rugantio / fbcrawl

A Facebook crawler
Apache License 2.0
660 stars · 229 forks

The crawler has stopped working! #70

Open adigoswami opened 3 years ago

adigoswami commented 3 years ago

INFO: Going through the "save-device" checkpoint
2020-10-14 18:15:57 [fb] INFO: Scraping facebook page https://mbasic.facebook.com/DonaldTrump
2020-10-14 18:16:00 [scrapy.core.scraper] ERROR: Spider error processing <GET https://mbasic.facebook.com/DonaldTrump> (referer: https://mbasic.facebook.com/?_rdr)
Traceback (most recent call last):
  File "/home/aditya/.local/lib/python3.6/site-packages/twisted/internet/defer.py", line 1418, in _inlineCallbacks
    result = g.send(result)
StopIteration: <200 https://mbasic.facebook.com/DonaldTrump>

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/home/aditya/.local/lib/python3.6/site-packages/scrapy/utils/defer.py", line 55, in mustbe_deferred
    result = f(*args, **kw)
  File "/home/aditya/.local/lib/python3.6/site-packages/scrapy/core/spidermw.py", line 58, in process_spider_input
    return scrape_func(response, request, spider)
  File "/home/aditya/.local/lib/python3.6/site-packages/scrapy/core/scraper.py", line 149, in call_spider
    warn_on_generator_with_return_value(spider, callback)
  File "/home/aditya/.local/lib/python3.6/site-packages/scrapy/utils/misc.py", line 245, in warn_on_generator_with_return_value
    if is_generator_with_return_value(callable):
  File "/home/aditya/.local/lib/python3.6/site-packages/scrapy/utils/misc.py", line 230, in is_generator_with_return_value
    tree = ast.parse(dedent(inspect.getsource(callable)))
  File "/usr/lib/python3.6/ast.py", line 35, in parse
    return compile(source, filename, mode, PyCF_ONLY_AST)
  File "<unknown>", line 1
    def parse_page(self, response):
    ^
IndentationError: unexpected indent
2020-10-14 18:16:00 [scrapy.core.engine] INFO: Closing spider (finished)
2020-10-14 18:16:00 [scrapy.statscollectors] INFO: Dumping Scrapy stats:
{'downloader/request_bytes': 3872,
 'downloader/request_count': 6,
 'downloader/request_method_count/GET': 4,
 'downloader/request_method_count/POST': 2,
 'downloader/response_bytes': 38512,
 'downloader/response_count': 6,
 'downloader/response_status_count/200': 4,
 'downloader/response_status_count/302': 2,
 'elapsed_time_seconds': 18.234373,
 'finish_reason': 'finished',
 'finish_time': datetime.datetime(2020, 10, 14, 22, 16, 0, 857701),
 'log_count/ERROR': 1,
 'log_count/INFO': 12,
 'memusage/max': 57266176,
 'memusage/startup': 57266176,
 'request_depth_max': 3,
 'response_received_count': 4,
 'scheduler/dequeued': 6,
 'scheduler/dequeued/memory': 6,
 'scheduler/enqueued': 6,
 'scheduler/enqueued/memory': 6,
 'spider_exceptions/IndentationError': 1,
 'start_time': datetime.datetime(2020, 10, 14, 22, 15, 42, 623328)}
2020-10-14 18:16:00 [scrapy.core.engine] INFO: Spider closed (finished)
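Not part of the original log, but a minimal sketch of the failure mode the traceback points at: Scrapy's is_generator_with_return_value runs ast.parse(dedent(inspect.getsource(callback))), and textwrap.dedent only strips whitespace that is common to every non-blank line. If the callback's source contains a column-0 line (for example inside a triple-quoted string), nothing gets stripped, ast.parse sees an indented def, and this IndentationError is raised. The method body below is a made-up example, not fbcrawl's actual parse_page:

```python
import ast
import textwrap

# What inspect.getsource would return for a method: source still indented
# by the class-body offset. The column-0 lines inside the triple-quoted
# string defeat textwrap.dedent, which only strips whitespace common to
# ALL non-blank lines.
src = (
    "    def parse_page(self, response):\n"
    "        query = '''\n"
    "SELECT 1\n"   # column-0 line inside the string literal
    "'''\n"
    "        yield response\n"
)

print(textwrap.dedent(src) == src)  # True: dedent stripped nothing

try:
    ast.parse(textwrap.dedent(src))
except IndentationError as e:
    print("IndentationError:", e.msg)  # IndentationError: unexpected indent
```

If that matches your spider, moving any column-0 text out of the callback avoids the crash; as far as I can tell, newer Scrapy releases also made this check non-fatal.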

adigoswami commented 3 years ago

Running the following command:

scrapy crawl fb -a email="EMAILTOLOGIN" -a password="PASSWORDTOLOGIN" -a page="NAMEOFTHEPAGETOCRAWL" -a date="2018-01-01" -a lang="it" -o DUMPFILE.csv

results in the error posted above.
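Not from the thread, but a quick way to check whether a given spider callback is what trips Scrapy's check is to repeat the exact parse step from the traceback yourself (check_callback is a made-up helper name, not a Scrapy API):

```python
import ast
import inspect
import textwrap

def check_callback(callback):
    """Repeat the parse step from scrapy.utils.misc.is_generator_with_return_value
    to see whether a given callback would trigger the IndentationError."""
    src = textwrap.dedent(inspect.getsource(callback))
    try:
        ast.parse(src)
    except IndentationError as exc:
        return f"would crash Scrapy's check: {exc.msg}"
    return "parses fine"

# Example against a stdlib function (any importable callable works,
# e.g. your spider's parse_page method):
import json
print(check_callback(json.dumps))  # parses fine
```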

scsdev-cyber commented 3 years ago

Similar issue here; the crawler is not working for me either.