wang-ye / redfin-scraper

Redfin scraper using filters and proxies
MIT License
24 stars, 9 forks

Example with proxybroker? #1

Closed ivnle closed 5 years ago

ivnle commented 5 years ago

Hi Wang Ye, do you have a working example of how to use redfin-scraper with proxybroker? I generated 50 proxies and saved the IP addresses and ports in a CSV, leaving the user and password columns blank. For example:

ip_addr port user password
64.33.171.19 8080 NaN NaN
143.0.176.108 60521 NaN NaN

After running python redfin_crawler.py proxies.csv https://www.redfin.com/city/11203/CA/Los-Angeles/, the LISTINGS and LISTING_DETAILS tables are empty. Only the URLS table is getting populated with the redfin_base_url.

Appreciate your help.

wang-ye commented 5 years ago

Hi Lee, I updated the README to describe the proxy format. If you do not have proxy auth, you can just remove the user and password columns. However, free proxies often cause "proxy cannot connect" and "proxy blocked" errors. Premium proxies ($10-$20 per month) give much better performance. If you still want to use free proxies, make sure you validate them first with tools/proxy_checker.py.
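
For anyone who already has a proxy file with blank auth columns, here is a minimal sketch of stripping them. It assumes a comma-separated file with the column names shown in the question (ip_addr, port, user, password); the helper name drop_empty_auth_columns is hypothetical and not part of redfin-scraper, so check the README for the exact format the crawler expects.

```python
import csv
import io

# Assumed column names, taken from the example in the question above.
AUTH_COLUMNS = ("user", "password")


def _is_blank(value):
    """Treat missing values, empty strings, and a literal 'NaN' as blank."""
    return value is None or value.strip() == "" or value.strip().lower() == "nan"


def drop_empty_auth_columns(src, dst):
    """Copy a proxy CSV from src to dst, dropping auth columns that are blank in every row.

    src and dst are open text file objects. Returns the list of kept column names.
    """
    reader = csv.DictReader(src)
    rows = list(reader)
    fieldnames = reader.fieldnames or []
    unused = [
        col for col in AUTH_COLUMNS
        if col in fieldnames and all(_is_blank(row.get(col)) for row in rows)
    ]
    kept = [col for col in fieldnames if col not in unused]
    writer = csv.DictWriter(dst, fieldnames=kept, extrasaction="ignore", lineterminator="\n")
    writer.writeheader()
    writer.writerows(rows)
    return kept


if __name__ == "__main__":
    sample = (
        "ip_addr,port,user,password\n"
        "64.33.171.19,8080,,\n"
        "143.0.176.108,60521,NaN,NaN\n"
    )
    out = io.StringIO()
    drop_empty_auth_columns(io.StringIO(sample), out)
    print(out.getvalue())
```

If any row does carry credentials, the auth columns are kept untouched, so the same helper is safe to run on a mixed file.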

As for the empty LISTINGS and LISTING_DETAILS tables, that is because you ran the default mode, "pages". Run the program in "properties" mode instead. Example:

python redfin_crawler.py good_proxies.csv https://www.redfin.com/city/1362/CA/Belmont \
    --property_prefix https://www.redfin.com/city/1362/CA/Belmont --type properties