Open ghost opened 7 years ago
Hey!
I've tried to run it a few times, but it fails to log in. My account uses my company's SSO engine, so the crawl fails with this error message:
2017-07-10 16:38:19 [SafariBooks] ERROR: Failed login
2017-07-10 16:38:19 [scrapy.core.engine] INFO: Closing spider (finished)
2017-07-10 16:38:19 [scrapy.utils.signal] ERROR: Error caught on signal handler: <function close at 0x7f5165725b18>
Traceback (most recent call last):
  File "/usr/local/lib/python2.7/dist-packages/twisted/internet/defer.py", line 150, in maybeDeferred
    result = f(*args, **kw)
  File "/usr/local/lib/python2.7/dist-packages/pydispatch/robustapply.py", line 55, in robustApply
    return receiver(*arguments, **named)
  File "/usr/local/lib/python2.7/dist-packages/scrapy/spiders/__init__.py", line 104, in close
    return closed(reason)
  File "/home/hpprszui/safaribooks/safaribook/spiders/safaribooks.py", line 106, in closed
    shutil.move(self.book_name + '.zip', self.book_name + '.epub')
  File "/usr/lib/python2.7/shutil.py", line 302, in move
    copy2(src, real_dst)
  File "/usr/lib/python2.7/shutil.py", line 130, in copy2
    copyfile(src, dst)
  File "/usr/lib/python2.7/shutil.py", line 82, in copyfile
    with open(src, 'rb') as fsrc:
IOError: [Errno 2] No such file or directory: '.zip'
2017-07-10 16:38:19 [scrapy.statscollectors] INFO: Dumping Scrapy stats:
{'downloader/request_bytes': 2482,
 'downloader/request_count': 4,
 'downloader/request_method_count/GET': 3,
 'downloader/request_method_count/POST': 1,
 'downloader/response_bytes': 22191,
 'downloader/response_count': 4,
 'downloader/response_status_count/200': 2,
 'downloader/response_status_count/302': 2,
 'finish_reason': 'finished',
 'finish_time': datetime.datetime(2017, 7, 10, 14, 38, 19, 632179),
 'log_count/DEBUG': 5,
 'log_count/ERROR': 2,
 'log_count/INFO': 7,
 'memusage/max': 50954240,
 'memusage/startup': 50954240,
 'request_depth_max': 1,
 'response_received_count': 2,
 'scheduler/dequeued': 4,
 'scheduler/dequeued/memory': 4,
 'scheduler/enqueued': 4,
 'scheduler/enqueued/memory': 4,
 'start_time': datetime.datetime(2017, 7, 10, 14, 38, 13, 71128)}
Do you have any idea how to override that?
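A side note on the traceback: the `IOError` looks like a secondary failure. The login failed, so `self.book_name` was apparently still empty when `closed` ran, and `shutil.move` was asked to rename a file called just `'.zip'`. Below is a minimal sketch of a guard that would skip the rename in that case. The helper name `finalize_epub` is hypothetical and not part of the project; only the `.zip` to `.epub` rename is taken from the traceback. This does not address the SSO login itself, it only avoids the noisy second error.

```python
import os
import shutil


def finalize_epub(book_name):
    """Rename '<book_name>.zip' to '<book_name>.epub' if the archive exists.

    Hypothetical helper: returns the .epub path on success, or None when
    there is nothing to move (e.g. login failed and no book was fetched).
    """
    # An empty or missing name means the spider never got as far as a book.
    if not book_name:
        return None

    src = book_name + '.zip'
    # Skip the rename when the archive was never written.
    if not os.path.isfile(src):
        return None

    dst = book_name + '.epub'
    shutil.move(src, dst)
    return dst
```

Assuming the spider keeps its current structure, `closed` could call something like `finalize_epub(self.book_name)` in place of the bare `shutil.move(...)` shown at line 106 of the traceback.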
Hi, sorry, I did not see this until now.
Could you let me know whether the company login page is https://www.safaribooksonline.com/enterprise/ or the old Safari login page?
https://www.safaribooksonline.com/enterprise/