thekingofkings / embedding

Dynamic graph embedding
MIT License

Zillow crawler stopped working #9

Closed thekingofkings closed 6 years ago

thekingofkings commented 6 years ago

Zillow crawler cannot crawl data anymore

Issue

The crawler now gets back a page asking for a reCAPTCHA, because Zillow recognizes that the Python requests calls do not come from a browser.

Solution

Mimic a browser request by adding a User-Agent to the request headers?
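A minimal sketch of that idea, not the crawler's actual code. It uses the standard-library `urllib.request` so it runs without extra dependencies; with the requests library the same dict would be passed as `requests.get(url, headers=BROWSER_HEADERS)`. The User-Agent string, the `Accept-Language` value, and the names `BROWSER_HEADERS`, `build_request` and `fetch` are all assumptions made for illustration, and there is no guarantee that these headers alone get past the reCAPTCHA page.

```python
import urllib.request

# Hypothetical browser-like headers; any current browser's
# User-Agent string could be substituted here.
BROWSER_HEADERS = {
    "User-Agent": (
        "Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 "
        "(KHTML, like Gecko) Chrome/120.0 Safari/537.36"
    ),
    "Accept-Language": "en-US,en;q=0.9",
}


def build_request(url):
    """Return a Request object that carries the browser-like headers."""
    return urllib.request.Request(url, headers=BROWSER_HEADERS)


def fetch(url, timeout=10):
    """Download a page and return its body as text."""
    with urllib.request.urlopen(build_request(url), timeout=timeout) as resp:
        return resp.read().decode("utf-8", errors="replace")
```

Building the request separately from sending it makes it easy to inspect the headers before any traffic goes out.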

thekingofkings commented 6 years ago

Zillow also tracks the request frequency and blocks any IP that makes requests too densely (it is deemed a robot).

Solution: add a time.sleep() between requests to lower the request rate.
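A rough sketch of such throttling, assuming a loop over a list of URLs. The function name `crawl_politely`, the 2 to 5 second delay range, and the random jitter are assumptions, not values taken from the crawler or from Zillow; the thresholds Zillow actually uses are unknown. The `fetch` argument is whatever function downloads a single page.

```python
import random
import time


def crawl_politely(urls, fetch, min_delay=2.0, max_delay=5.0, sleep=time.sleep):
    """Fetch each URL in turn, pausing between consecutive requests.

    The delay bounds are guesses; `sleep` is injectable so the
    pacing can be checked without actually waiting.
    """
    results = []
    for i, url in enumerate(urls):
        if i > 0:
            # Pause for a random interval before every request except the
            # first, so requests are spaced out and not perfectly regular.
            sleep(random.uniform(min_delay, max_delay))
        results.append(fetch(url))
    return results
```

A fixed `time.sleep(3)` inside the loop would follow the suggestion just as well; the randomized interval is only one possible variation.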