Open m1ngle opened 4 years ago
I'm facing the same problem. Did you solve it?
No, frustratingly I still get it. The project is on hold until I can get back to it.
Disclaimer: I'm very new to Scrapy
I tried scraping Amazon.com (following a YouTube video) and encountered the same error you guys faced. However, when I tried it on Quotes to Scrape, it worked fine. Could Amazon be what triggers the error?
You might want to try it on another website to confirm whether this package runs correctly or gives the same error.
Same for me: it works perfectly on other sites but not on Amazon.
I noticed Amazon now includes an epoch timestamp in the URL of their pages indicating when the page was requested. I'm not sure if this is related to the error.
The latest commit, which was supposed to "upgrade" policy.py, is what's causing the error. Some sites work, some don't. Since the error is intermittent (or, more likely, something I don't understand), reverting to version 0.1.7 will make it work again, as that version does not validate response.text:
https://github.com/hyan15/scrapy-proxy-pool/blob/master/scrapy_proxy_pool/policy.py#L15
For me, it worked. I figure it raises the exception because the response is empty or something. As far as I can tell, there's really nothing I can do about it, so I just reverted to the earlier version. Hope it'll be fixed!
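For anyone who wants to see why reading response.text can blow up: Scrapy's base Response class raises AttributeError("Response content isn't text") when .text is accessed on a response that was not decoded as text. Below is a minimal sketch of a ban check that guards against that. It is not the package's actual policy code. The names FakeBinaryResponse, FakeTextResponse, body_is_empty and looks_banned are hypothetical, the stand-in classes only mimic Scrapy's behaviour so the snippet runs without Scrapy installed, and the set of "OK" statuses is an assumption.

```python
class FakeBinaryResponse:
    """Hypothetical stand-in for a non-text Scrapy response (.text raises)."""

    def __init__(self, body=b"", status=200):
        self.body = body
        self.status = status

    @property
    def text(self):
        # Same message the base scrapy.http.Response raises.
        raise AttributeError("Response content isn't text")


class FakeTextResponse:
    """Hypothetical stand-in for a text response (.text is a str)."""

    def __init__(self, text="", status=200):
        self.text = text
        self.body = text.encode("utf-8")
        self.status = status


def body_is_empty(response):
    """Return True if the response carries no content, text or binary."""
    try:
        return len(response.text) == 0
    except AttributeError:
        # Non-text response: fall back to the raw bytes instead of crashing.
        return len(response.body) == 0


def looks_banned(response, ok_statuses=(200, 301, 302)):
    """Treat unexpected statuses and empty 200 responses as a ban.

    ok_statuses is an assumed default, not taken from the package.
    """
    if response.status not in ok_statuses:
        return True
    return response.status == 200 and body_is_empty(response)


if __name__ == "__main__":
    print(looks_banned(FakeBinaryResponse(b"\x89PNG", 200)))
    print(looks_banned(FakeTextResponse("", 200)))
```

With a guard like this, a binary or otherwise undecoded response would be judged on its raw body rather than raising inside the middleware. Whether that matches what the maintainers intend is a separate question.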
When I add PROXY_POOL_ENABLED = True and DOWNLOADER_MIDDLEWARES = {
...
} to my settings.py file, I get the following error:
AttributeError: Response content isn't text
I tried this on the website I wanted to scrape and also on the demo site http://quotes.toscrape.com/, and I get the same error each time. I don't think I'm trying to scrape non-text content from this website. Have you ever encountered this?