flathunters / flathunter

A bot to help people with their rental real-estate search. 🏠🤖
GNU Affero General Public License v3.0

Index error for immobilienscout #53

Closed JonasHamHam closed 4 years ago

JonasHamHam commented 4 years ago

Hello, the verbose mode tells me that I get an index error for immobilienscout. I am not very experienced with coding and can't fix this myself. Can anyone help me?

[2020/08/19 12:44:23|flathunt.py |DEBUG ]: Settings from config: <flathunter.config.Config object at 0x10a500e10>
[2020/08/19 12:44:23|crawl_immobilienscout.py|DEBUG ]: Got search URL https://www.immobilienscout24.de/Suche/shape/wohnung-mieten?shape=aWB3ZEhpZWVlQWZ4QGlhQGR7QHllQXp2QGtoQ3pBc2VAeWJAd3ZAbUBzZUB0b0BlZkNvaUF7YEltdUB9eEBxfEBkUGNxQGx7QmdaYF1xfEB6S298QGJoQnt0QWRQbUpkbEF_Um5gR2JGeGdAZlp4Y0JqdUBoakJobEFfTg..&numberofrooms=2.0-&price=-1300.0&livingspace=40.0-&sorting=2&pagenumber={0}
[2020/08/19 12:44:24|crawl_immobilienscout.py|DEBUG ]: Index Error occurred
[2020/08/19 12:44:24|crawl_immobilienscout.py|DEBUG ]: extracted: 0

I think the issue is the provided search URL. If I open it in a browser, it does not lead anywhere. The '&pagenumber={0}' part at the end comes from the crawl_immobilienscout.py file; the URL in my config does not have this part. I tried a few things, but all I got were errors.

Thanks guys!

sourcloud commented 4 years ago

I think the curly braces around the 0 in '&pagenumber={0}' are causing the problem. If I remove them, I can open your link without any problem.

My guess is that those are regular expression or copy/paste leftovers. I cannot change and test this myself right now, but maybe somebody can further investigate lines 25 to 29 of 'crawl_immobilienscout.py'.

codders commented 4 years ago

There are a few people having trouble with Immoscout right now because of new blocking / filtering they have added (see issue #45). The index error probably occurs when you hit the Captcha on that page. The curly braces and zero are interpolated by the call to format on line 62 of crawl_immobilienscout.py, so that shouldn't be the problem.
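To illustrate both points, here is a minimal Python sketch. It is not the actual flathunter code: the function names, the shortened URL, and the `data-result-count` marker are all made up for the example. It shows how `str.format` replaces `{0}` with the page number before any request is sent, and how indexing into an empty list of matches (as could happen on a Captcha page) would raise an IndexError unless it is guarded.

```python
import re

# Shortened, illustrative search URL; {0} is a positional placeholder.
URL_TEMPLATE = (
    "https://www.immobilienscout24.de/Suche/shape/wohnung-mieten"
    "?sorting=2&pagenumber={0}"
)


def build_page_url(template: str, page: int) -> str:
    """Fill the {0} placeholder with the page number via str.format."""
    return template.format(page)


def extract_result_count(html: str) -> int:
    """Return the result count found in the page, or 0 if it is missing.

    The data-result-count attribute is an invented marker for this sketch.
    """
    matches = re.findall(r'data-result-count="(\d+)"', html)
    if not matches:
        # matches[0] would raise IndexError here, e.g. on a Captcha page.
        return 0
    return int(matches[0])


# The braces are gone once the template has been formatted.
print(build_page_url(URL_TEMPLATE, 1))

# A normal results page versus a page without the expected markup.
print(extract_result_count('<div data-result-count="12"></div>'))
print(extract_result_count("<html>Please solve the captcha</html>"))
```

So the `{0}` only appears in the debug log because the template is logged before formatting; the URL that is actually requested would contain a plain page number.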

Are you using the latest code from the main branch? That has some improvements for the immoscout filtering. We don't have a 100% solution so far though - lots of us are still getting errors on ImmoScout.

codders commented 4 years ago

We need more information to say what's going on here. I believe the other changes to the ImmoScout crawler may have fixed this.

vitormalencar commented 3 years ago

I have the same problem. I just started using the bot, but I get no results, since I'm only searching on immobilienscout.