Open fdiotalevi opened 3 weeks ago
Thanks for bringing this up. It seems Chrono24 has implemented Cloudflare’s anti-scraping feature to block non-browser requests. If the issue persists, I can explore using Selenium as a workaround. While it could be effective, it may be more cumbersome to use.
Makes sense. Is there a workaround that would let me keep using the library in the meantime?
Not at the moment, unfortunately. If you can get hold of the HTML content yourself, you should be able to extract listings using these private methods:
import chrono24
# listing_html is a beautifulsoup4 object built from the listing HTML you retrieved
standard_listing_dict = chrono24.query._get_standard_listing_as_json(listing_html)
detailed_listing_dict = chrono24.query._get_detailed_listing_as_json(listing_html)
Thanks to @davidiola for suggesting a potential solution to the Cloudflare problems.
Use FlareSolverr, an open-source proxy. Spin up the Docker container as described in their documentation, route your requests through it, and retrieve the relevant HTML.
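For anyone who wants to try this, here is a rough sketch of what routing a request through FlareSolverr might look like. It assumes the container is listening on its default local endpoint (`http://localhost:8191/v1`) and uses the `request.get` command from the FlareSolverr README. The helper names (`build_payload`, `extract_html`, `fetch_html`) are made up for illustration and are not part of this library.

```python
import json
import urllib.request

# Assumption: FlareSolverr is running locally on its default port.
FLARESOLVERR_URL = "http://localhost:8191/v1"


def build_payload(url, max_timeout_ms=60000):
    """Build the JSON body for a FlareSolverr `request.get` command."""
    return {"cmd": "request.get", "url": url, "maxTimeout": max_timeout_ms}


def extract_html(reply):
    """Return the page HTML from a FlareSolverr reply, or raise on failure."""
    if reply.get("status") != "ok":
        raise RuntimeError(f"FlareSolverr error: {reply.get('message', 'unknown')}")
    return reply["solution"]["response"]


def fetch_html(url, endpoint=FLARESOLVERR_URL):
    """POST the command to FlareSolverr and return the solved page's HTML."""
    body = json.dumps(build_payload(url)).encode("utf-8")
    request = urllib.request.Request(
        endpoint, data=body, headers={"Content-Type": "application/json"}
    )
    # Hypothetical client-side timeout, kept above the 60 s maxTimeout sent to FlareSolverr.
    with urllib.request.urlopen(request, timeout=90) as response:
        return extract_html(json.load(response))
```

Calling `fetch_html` with a Chrono24 listing URL should, in principle, return HTML that you can parse with beautifulsoup4 and pass to the private methods mentioned earlier in this thread.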
I haven’t tested this yet. It may work as a stopgap. A more permanent fix is under consideration.
If anyone tries it, post your feedback here.
I have been using the library for a few months without issues, but it stopped working about a week ago. Even the test script in the README no longer works.
How to reproduce