ChrisMuir / Zillow

Zillow Scraper for Python using Selenium

CAPTCHA is immediate and impossible to solve #15

Open lionhive opened 6 years ago

lionhive commented 6 years ago

The crawler runs for me, but the CAPTCHA comes up immediately, and it's very hard to solve. It almost seems like they're using a CAPTCHA that is designed to just waste time and not be solvable. Anyone else seeing this problem? Occasionally I can pass the CAPTCHA and get some data, but this is very hard to achieve.

ChrisMuir commented 6 years ago

Hi @lionhive

Please see issues #9 and #13. I've seen this in the past, where the CAPTCHA would just reload continuously and never disappear, but it was quite rare. Are you saying this is happening to you most/all of the times the CAPTCHA appears?

laphlaw commented 5 years ago

I too see this as immediate. I tried solving all of them (which took a long time), but after I verified it, it immediately threw up another CAPTCHA. I think it's getting smarter, unfortunately :(

MathrewLing commented 4 years ago

I have the same problem: once I verified it, it immediately threw up another CAPTCHA. Have you solved this problem?

ChrisMuir commented 4 years ago

@MathrewLing This is a known issue that I'm not planning on trying to fix. I've essentially walked away from this repo. The top of the README includes a note indicating this.

MathrewLing commented 4 years ago

@ChrisMuir Thanks for your reply. I'll keep trying. Thank you anyway.

madkins23 commented 3 years ago

This is happening to me when I hit the site via Chrome (91.0.4472.114 (Official Build) (64-bit)). I am not running a scraper, this is regular old manual access. I was doing fine for 10-20 minutes, just poking around looking at what was available, and then it just locked up on me.

Brought up Firefox and it seems to work OK. No CAPTCHA issues. Weird.

quentinjs commented 1 year ago

SOLVED!!!!!!!!!

You need to make sure cookies can be saved. This got me past the CAPTCHA. The user-data-dir has to be a fully qualified path or Chrome complains.

[Example]

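    import os
    from selenium import webdriver
    from selenium.webdriver.chrome.options import Options

    # Fully qualified path to a profile directory where Chrome can save cookies.
    sel_path = os.path.join(os.getcwd(), 'selenium')
    chrome_options = Options()
    chrome_options.add_argument("--user-data-dir=" + sel_path)
    # Newer Selenium releases take `options=` in place of `chrome_options=`.
    driver = webdriver.Chrome(options=chrome_options)
    driver.get(zillow_path)  # zillow_path is the target URL, defined elsewhere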
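
Since the fix depends on the profile path being fully qualified, it may help to resolve and check the path before handing it to Chrome. Below is a minimal sketch of that step only. The helper name `user_data_dir_argument` is hypothetical and not part of this repo or of Selenium; it just builds the argument string and makes sure the directory exists.

```python
import os


def user_data_dir_argument(profile_dir="selenium"):
    """Return a Chrome --user-data-dir argument with an absolute path.

    Hypothetical helper for illustration: it resolves profile_dir against
    the current working directory and creates it if it does not exist.
    """
    abs_path = os.path.abspath(profile_dir)
    os.makedirs(abs_path, exist_ok=True)
    return "--user-data-dir=" + abs_path


# Intended usage with the snippet above (assumes Selenium is installed):
#   chrome_options.add_argument(user_data_dir_argument("selenium"))
```

Reusing the same directory on every run should let Chrome reload the cookies it saved earlier, which is what the comment above credits for getting past the CAPTCHA.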