Closed by ddxv 1 year ago
Try a proxy with proxy=, also you can use tryhomeharvest.com
I see, I don't have a proxy on hand at the moment, but just curious, do you think changing the UA that homeharvest is using would help? What UA is it currently set to? I have only used the library once a couple days ago, so would be surprised if the blocking rule is based on IP only.
The headers are located here at the bottom: https://github.com/ZacharyHampton/HomeHarvest/blob/master/homeharvest/core/scrapers/zillow/__init__.py
Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/116.0.0.0 Safari/537.36
It may be the cookie as well. When I made it, the requests wouldn't work without the cookie. Any insights, @ZacharyHampton?
It's definitely the cookie. I changed it to an updated one from my browser and it works now on every request. The old cookie only worked on certain IPs. It may be time-based though, so I'm not sure it's a long-term solution.
I believe we can fix this by fetching the cookies on an initial request and dynamically setting a fresh cookie every time for the backend endpoints.
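For anyone reading later, here is a rough sketch of that idea: make one initial request to collect cookies, then reuse them on the backend request. This is not HomeHarvest's actual code. It uses only the Python standard library (the scraper itself may use a different HTTP client), and the function names `build_opener_with_cookies` and `fetch_with_fresh_cookies` are made up for illustration. Whether a fresh cookie alone is enough to avoid the 403 is an assumption based on the comments above.

```python
import http.cookiejar
import urllib.request

# User-Agent string quoted earlier in this thread.
USER_AGENT = (
    "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 "
    "(KHTML, like Gecko) Chrome/116.0.0.0 Safari/537.36"
)


def build_opener_with_cookies(user_agent=USER_AGENT):
    """Return an (opener, jar) pair; the jar stores cookies the server sets."""
    jar = http.cookiejar.CookieJar()
    opener = urllib.request.build_opener(
        urllib.request.HTTPCookieProcessor(jar)
    )
    # Replace the default "Python-urllib/x.y" User-Agent.
    opener.addheaders = [("User-Agent", user_agent)]
    return opener, jar


def fetch_with_fresh_cookies(landing_url, backend_url, timeout=10.0):
    """Hypothetical helper: fetch cookies first, then call the backend URL."""
    opener, _jar = build_opener_with_cookies()

    # Initial request: any Set-Cookie headers in the response land in the jar.
    with opener.open(landing_url, timeout=timeout) as resp:
        resp.read()

    # Follow-up request: the cookie processor attaches the stored cookies.
    # A 403 response raises urllib.error.HTTPError here.
    with opener.open(backend_url, timeout=timeout) as resp:
        return resp.read()
```

Because a new jar is created on every call, no stale cookie is ever hardcoded, which is the point of the suggestion above.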
Ah, perfect, that is the only thing I would have suggested as well. Thanks for the link above to __init__.py. Will also check the PR to see where fixes were made. Cheers
Should be working now; you can upgrade with pip to get the latest changes. Let me know.
Yes, pulled newest changes and checked that last commit. I'm far from an expert here, but looks good to me. Also checked and working again. Thanks for the help!
Python 3.10.11 Versions tested: 0.2.13
What I tried to do:
Output:
requests.exceptions.HTTPError: 403 Client Error: Forbidden for url: https://www.zillow.com/homes/for_rent/85281_rb/
Testing this URL in browser works OK.