GiulioGiorcelli closed this issue 6 years ago
Hi Giulio,
I was using this code to access Zillow for a while and would run into a similar issue. As ChrisMuir points out, scraping is against Zillow's ToS, so they throw a CAPTCHA to keep bots like this one from scraping content. I haven't tried to defeat a CAPTCHA yet; the whole point is that bots can't beat it.
Basically, you look suspicious because of how much searching you're doing and the way the bot interacts with the web page: it's very non-human. Using multiple computers (for example, spinning up a bunch of Linux virtual machines) spreads that activity out. I don't know how Zillow tracks this, but some googling would give you an idea.
Easy solutions:
You can try manually monitoring the machine and interceding when a CAPTCHA appears - manually click around for a while and the site will figure out you are a person. You'd probably have to add code to track how far the bot got in its search before getting stuck.
Use multiple computers and/or IP addresses to try and fool Zillow
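For the "track how far the bot got" part of the first suggestion, here is a minimal sketch of one way to do it, not code from this repo. The names `load_progress`, `save_progress`, `run_searches` and the `search_fn` callback are all hypothetical, and it assumes the search is a list of terms (zip codes, say) that can each be retried independently.

```python
import json
import os
import tempfile

def load_progress(path):
    """Return the set of search terms already completed, or an empty set."""
    if not os.path.exists(path):
        return set()
    with open(path, "r", encoding="utf-8") as f:
        return set(json.load(f))

def save_progress(path, done):
    """Write to a temp file, then rename, so a crash cannot corrupt the file."""
    directory = os.path.dirname(os.path.abspath(path))
    fd, tmp = tempfile.mkstemp(dir=directory, suffix=".tmp")
    with os.fdopen(fd, "w", encoding="utf-8") as f:
        json.dump(sorted(done), f)
    os.replace(tmp, path)

def run_searches(terms, path, search_fn):
    """Run search_fn (hypothetical) on each term not yet in the progress file."""
    done = load_progress(path)
    for term in terms:
        if term in done:
            continue  # finished in an earlier run
        search_fn(term)  # may raise if the bot gets stuck
        done.add(term)
        save_progress(path, done)  # checkpoint after every completed term
    return done
```

If the run dies partway through, calling `run_searches` again with the same progress file skips whatever already finished and picks up at the term that failed.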
Hi @GiulioGiorcelli, I don't have any good answers for you on this. I honestly haven't had much interest in this project/repo for a while now, so when I added the CAPTCHA code I didn't test it much... I think I recall what you described happening to me once? And I didn't investigate it at the time. For me, almost all of the instances of CAPTCHA were easy to manually handle (code pauses, I beat the CAPTCHA once, it goes away, code resumes).
The short answer is that once the CAPTCHA appears, it's out of my hands. I have no interest in developing the current CAPTCHA code beyond what it currently is, which is simply to pause code execution indefinitely until the CAPTCHA has been manually handled.
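To illustrate what "pause indefinitely until manually handled" means in practice, here is a rough sketch of that pattern. This is not the repo's actual code: `wait_until_cleared` and the `is_blocked` callback are hypothetical names, and how you detect that a CAPTCHA page is showing is left entirely to the caller.

```python
import time

def wait_until_cleared(is_blocked, poll_seconds=5.0, sleep=time.sleep):
    """Block until is_blocked() returns False; return how many polls it took.

    is_blocked is a hypothetical caller-supplied function that reports
    whether the CAPTCHA page is still showing. There is deliberately no
    timeout: the loop waits for a person to handle the page by hand.
    """
    polls = 0
    while is_blocked():
        sleep(poll_seconds)  # give the human time to act
        polls += 1
    return polls
```

The `sleep` parameter is only there so the loop can be exercised without real waiting, e.g. `wait_until_cleared(check, sleep=lambda s: None)`.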
Hi @wwetzel, thanks for jumping in with your input and info!
Hi there!
I'm using your software for a personal project, and when Zillow throws up a CAPTCHA it takes a really long time and dozens of iterations to get rid of it. I basically complete the CAPTCHA and the page reloads a new one. It goes on for about 10 to 15 minutes no matter how many times I do it. Do you know why this is happening? Is there a workaround for this issue?
Thanks, Giulio