David-Carrasco / Scrapy-Idealista

Scraping data from the real estate site www.idealista.com
GNU General Public License v2.0
158 stars, 62 forks

Still works? #7

Closed: staskbieto closed this issue 4 years ago

staskbieto commented 4 years ago

When I run the script, idealista seems to be unreachable: it always responds with a 403 Forbidden.

It seems that the owners have changed something on the security side.

DanielRD92 commented 4 years ago

I tried it a few minutes ago and the crawling part works. You will find the answer in #3. Spoiler: you must set the timeout to 20 seconds, although I read that some people get results with 4 seconds. I'm having some issues with the text encoding now, though.

David-Carrasco commented 4 years ago

@DanielRD92, @staskbieto pull the repo and try again.

The text encoding is fixed too.

I've tested the crawler with a timeout of 10 seconds and it works. If a lower value works for you, please let me know: I haven't been able to download a complete URL with just 4 seconds, because the proxies fail from time to time.

Enjoy!
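
For anyone tuning the timeout discussed above, here is a minimal sketch of the settings involved. It assumes the timeout is controlled through Scrapy's standard `DOWNLOAD_TIMEOUT` setting and that retries go through Scrapy's built-in retry middleware. The thread does not show how this repo actually wires them up, so the dict name `SCRAPY_SETTINGS` and all values other than the 10-second timeout are illustrative only.

```python
# Hypothetical settings fragment; the same keys could go in a Scrapy
# project's settings.py or in a spider's custom_settings dict.
SCRAPY_SETTINGS = {
    # Seconds the downloader waits before giving up on a request.
    # 10 is the value reported as working in this thread; 4 was reported
    # as too low for some users, 20 as a safe upper choice.
    "DOWNLOAD_TIMEOUT": 10,
    # Retry failed requests, since proxies fail from time to time.
    "RETRY_ENABLED": True,
    # Assumption: a few extra attempts per request; tune to taste.
    "RETRY_TIMES": 5,
    # Assumption: also retry on 403. This only helps if a different
    # proxy is picked for the retried request.
    "RETRY_HTTP_CODES": [403, 408, 429, 500, 502, 503, 504],
}
```

A lower `DOWNLOAD_TIMEOUT` makes a dead proxy fail faster, but it also cuts off slow yet working ones, so it is probably worth raising `RETRY_TIMES` whenever you lower the timeout.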

DanielRD92 commented 4 years ago

Now it works perfectly! Thank you.

David-Carrasco commented 4 years ago

Fixed