Closed. codmania closed this issue 7 years ago
Do you have any more details I can use to debug the issue? I'm not having any problems with the library on my end. Can you post some code snippets?
The response from dollargeneral.com is as follows.
And here is my code snippet.
def _extract_page_tree(self):
    for i in range(3):
        try:
            with IncapSession() as s:
                response = s.get(self.START_URL, headers=self.DEFAULT_HEADERS, bypass_crack=True)
                response = s.get(self.product_page_url, headers=self.DEFAULT_HEADERS)
                if self.lh:
                    self.lh.add_log('status_code', response.status_code)
                if response.ok:
                    content = response.text
                    self.tree_html = html.fromstring(content)
                    return
                else:
                    self.ERROR_RESPONSE['failure_type'] = response.status_code
        except Exception as e:
            print 'ERROR EXTRACTING PAGE TREE', self.product_page_url, e
    self.is_timeout = True  # return failure
OK, I have "solved" the issue.
The problem was that dollargeneral.com was serving a captcha that wasn't being detected. The captcha is now detected, and s.get() will raise an IncapBlocked exception, which you can catch.
Should be good to go now.
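For anyone adapting the retry loop above, here is a minimal sketch of how the new exception might be handled. It is not the library's own code. The IncapBlocked class below is a local stand-in so the snippet runs on its own, and in real use you would import the exception from the library instead. The fetch_with_retries helper and its fetch argument are hypothetical names. The idea is that a captcha block will not clear on retry, so the loop stops at once, while other errors are retried.

```python
class IncapBlocked(Exception):
    """Local stand-in for the library's exception (assumption: swap in the real import)."""


def fetch_with_retries(fetch, url, attempts=3):
    """Call fetch(url) up to `attempts` times.

    `fetch` is any callable, e.g. a wrapper around s.get().
    Returns a (result, status) tuple where status is 'ok', 'blocked' or 'failed'.
    """
    for _ in range(attempts):
        try:
            return fetch(url), 'ok'
        except IncapBlocked:
            # A captcha was served; retrying will not help, so give up now.
            return None, 'blocked'
        except Exception:
            # Any other error (timeout, connection reset, ...) gets another try.
            continue
    return None, 'failed'
```

In the snippet posted earlier, this would mean adding an "except IncapBlocked" branch before the generic "except Exception" one, so a captcha can be recorded as its own failure type rather than as a timeout.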
Thank you, Mark
This cracker does not work for dollargeneral.com, which uses an Incapsula captcha.