Open JuanCarlosCamara opened 3 years ago
same error today
Facing the exact same issue...the proxies are dead
yes all the proxies are dead, removing the middleware seems to ban you in minutes
@David-Carrasco is this expected or a temporary issue? Are we doing something wrong? Thanks
Any idea on how to fix this?
I've tested the scraper again as @JuanCarlosCamara said. It seems Idealista has added some check that blocks requests from all the proxies listed on https://free-proxy-list.net/
The idea is to provide another list of proxies to be used by the scraper declared in:
It should work, since I've tested the scraper with my own IP and it works. It's a proxy issue.
I am still stuck with the same problem. Has anyone got a valid proxy list?
hello, what about this?
https://github.com/clarketm/proxy-list/blob/master/proxy-list.txt
We'd still need to modify the code though
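As a rough idea of what that code change could involve, here is a minimal sketch that turns a plain-text proxy list into proxy URLs. It assumes each usable line begins with an `ip:port` pair and that anything else (headers, blank lines, trailing flags) can be ignored; the exact layout of the linked file is not confirmed here. `parse_proxy_list` is a hypothetical helper, not something that exists in the scraper.

```python
import re

# Matches an IPv4 address followed by a port at the start of a line,
# e.g. "203.0.113.7:8080". Anything after the port is ignored.
PROXY_RE = re.compile(r"(\d{1,3}(?:\.\d{1,3}){3}):(\d{1,5})\b")


def parse_proxy_list(text, scheme="http"):
    """Return a proxy URL for every line that starts with a valid ip:port pair.

    Hypothetical helper: the input format is an assumption, not taken
    from the scraper or from any particular proxy list.
    """
    proxies = []
    for line in text.splitlines():
        match = PROXY_RE.match(line.strip())
        if match is None:
            continue  # header, blank line, or other non-proxy text
        host, port = match.groups()
        octets_ok = all(0 <= int(octet) <= 255 for octet in host.split("."))
        port_ok = 0 < int(port) <= 65535
        if octets_ok and port_ok:
            proxies.append(f"{scheme}://{host}:{port}")
    return proxies


if __name__ == "__main__":
    # Made-up sample using documentation-only IP ranges.
    sample = "Some header line\n\n203.0.113.7:8080 ES-N\n198.51.100.23:3128\n"
    print(parse_proxy_list(sample))
```

The resulting list would still have to be wired into wherever the scraper declares its proxies, and, as noted further down the thread, a healthy list does not help if the site is rejecting the requests anyway.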
@IsmaPons doesn't seem to be working either. I'm curious to know what the result was back in your first attempt. I'm getting DEAD proxies all over the place. Just now testing this one out.
Apparently the problem is not the proxies themselves (several other lists which are healthy don't work either). The issue here seems to be a 403 returned because the bot is detected 🤖. So the proxy middleware keeps retrying with no success, but it's really the HTTP 403 that's blocking you. Even when all settings and variables are correctly set, Google Analytics apparently has an anti-bot feature 🧐.
Long story short, try
Useful links: SO answer, Splash, User-Agent.
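To make the distinction above concrete, here is a small sketch that separates "the proxy never answered" from "the site answered with 403". `classify_outcome` and `summarize` are hypothetical names used only for illustration; they are not part of the scraper or of its proxy middleware, and the sketch assumes you can record the HTTP status of each attempt (or `None` when no response arrived).

```python
from collections import Counter


def classify_outcome(status_code):
    """Label one request attempt made through a proxy.

    status_code is the HTTP status of the response, or None when no
    response arrived at all (timeout, connection refused, ...).
    """
    if status_code is None:
        return "dead_proxy"  # nothing came back, so the proxy itself is suspect
    if status_code == 403:
        return "blocked"     # a response was relayed, but the request was refused
    if 200 <= status_code < 300:
        return "ok"
    return "http_error"


def summarize(status_codes):
    """Count outcomes over many attempts to see which failure dominates."""
    return Counter(classify_outcome(code) for code in status_codes)


if __name__ == "__main__":
    # Made-up attempt log: mostly 403s, one timeout.
    print(summarize([403, 403, None, 403]))
```

If most attempts come back as `blocked` rather than `dead_proxy`, swapping in yet another proxy list is unlikely to change much.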
@mikemajara @David-Carrasco did you test with the proposed solution on 7th November? Is it working?
I did to no avail. It did work once. But it's hard to get around it.
Hello,
I'm trying to run the default scrape for Carabanchel and I'm getting errors about dead proxies.
I have tried raising the DOWNLOAD_TIMEOUT parameter in settings.py from 10 to 20 seconds, as I have read in other issues, and it still returns the same error.
Do you have any idea what could be happening? I don't know whether Idealista has added some kind of check or security measure to prevent scraping since October.
Thanks a lot for your help. Best regards, Juan Carlos Cámara