flathunters / flathunter

A bot to help people with their rental real-estate search. 🏠🤖
GNU Affero General Public License v3.0

KeyError: 'monthlyRate' on ImmoScout #85

Closed herbertgroff closed 4 years ago

herbertgroff commented 4 years ago

Hi, first of all: thanks for writing this awesome bot!!

It was running yesterday, but now it seems to crash with the following error on this URL:

https://www.immobilienscout24.de/Suche/de/berlin/berlin/wohnung-mieten?numberofrooms=4.0-&price=-1500.0&livingspace=100.0-&geocodes=110000000911,110000000801,110000000703,110000000605,110000000704,110000000906,110000000907,110000001102,110000000201,110000000202,110000000301,110000000302,110000000601,110000000910,110000000701&pagenumber={0}

Error log:

[2020/10/19 16:34:48|config.py         |INFO    ]: Using config /Users/user/flathunter/config.yaml
[2020/10/19 16:34:50|flathunt.py       |DEBUG   ]: Settings from config: <flathunter.config.Config object at 0x108c51f10>
[2020/10/19 16:34:50|crawl_immobilienscout.py|DEBUG   ]: Got search URL https://www.immobilienscout24.de/Suche/de/berlin/berlin/wohnung-mieten?numberofrooms=4.0-&price=-1500.0&livingspace=100.0-&geocodes=110000000911,110000000801,110000000703,110000000605,110000000704,110000000906,110000000907,110000001102,110000000201,110000000202,110000000301,110000000302,110000000601,110000000910,110000000701&pagenumber={0}
[2020/10/19 16:34:51|abstract_crawler.py|DEBUG   ]: Google site key: <re.Match object; span=(49, 93), match='&k=6LeaILIZAAAAALTgLZV1AQXPc2dAsLItNYJ8jVvB&'>
[2020/10/19 16:34:56|abstract_crawler.py|DEBUG   ]: Captcha status: CAPCHA_NOT_READY
[2020/10/19 16:35:01|abstract_crawler.py|DEBUG   ]: Captcha status: CAPCHA_NOT_READY
[2020/10/19 16:35:06|abstract_crawler.py|DEBUG   ]: Captcha status: CAPCHA_NOT_READY
[2020/10/19 16:35:12|abstract_crawler.py|DEBUG   ]: Captcha status: CAPCHA_NOT_READY
[2020/10/19 16:35:17|abstract_crawler.py|DEBUG   ]: Captcha status: CAPCHA_NOT_READY
[2020/10/19 16:35:17|abstract_crawler.py|DEBUG   ]: Captcha promise: OK|03AGdBq26K-M4biKiyM1LweSOVKS1UuwZouTmow2O7P0f4P7yslyST6Fr7D1qwuHOWd63NU6GG_oQND0Vd1X0Z7MYlH8LO29WaHhgfxyPcoGo19TyERKtQZBxh0ktiSzWuuFs07dHyOw6sNKFKZQt3X1cDv5xJnqnEugqgIY26ZqVSg5zJvAQdEr1wIvaPTehCOQh-4Uh910LK7EnFzrIdc5qRnWVFdQ5RHuMw1sCCjUNTB_jhgCHax-oxG_ec33AMiXm_cMW-HtVnAcQ01ESpBMJe3Cjjhwd77BpbuWUmP3TQTIEObBTe_C3DMnIn_xVeH1B_yw8F1SCsLm_Eh43-4SQjsQZeJhem_odU1RdvVm8E3Os2YAkrEl1c7jOMY9NRcMr-i_kElZmppQjE5Ps1FhHMgd-NzaTwV5bTNmMhKh0W9I2XTP00eHd8FFbJBvusqSkKQmf-OXqMVI6YYS0dR1SOILgiYNsx8u-ppzAzLl2klKo_9DLdPRZY3ctk-MfmpODwEouRLIAaol28pxnRp8trERm2komELw
Traceback (most recent call last):
  File "flathunt.py", line 89, in <module>
    main()
  File "flathunt.py", line 86, in main
    launch_flat_hunt(config)
  File "flathunt.py", line 46, in launch_flat_hunt
    hunter.hunt_flats()
  File "/Users/user/flathunter/flathunter/hunter.py", line 42, in hunt_flats
    for expose in processor_chain.process(self.crawl_for_exposes(max_pages)):
  File "/Users/user/flathunter/flathunter/hunter.py", line 22, in crawl_for_exposes
    for searcher in self.config.searchers()
  File "/Users/user/flathunter/flathunter/hunter.py", line 23, in <listcomp>
    for url in self.config.get('urls', list())])
  File "/Users/user/flathunter/flathunter/abstract_crawler.py", line 136, in crawl
    return self.get_results(url, max_pages)
  File "/Users/user/flathunter/flathunter/crawl_immobilienscout.py", line 64, in get_results
    return self.get_entries_from_javascript()
  File "/Users/user/flathunter/flathunter/crawl_immobilienscout.py", line 98, in get_entries_from_javascript
    return self.get_entries_from_json(result_json)
  File "/Users/user/flathunter/flathunter/crawl_immobilienscout.py", line 102, in get_entries_from_json
    return [ self.extract_entry_from_javascript(entry.value) for entry in jsonpath_expr.find(json) ]
  File "/Users/user/flathunter/flathunter/crawl_immobilienscout.py", line 102, in <listcomp>
    return [ self.extract_entry_from_javascript(entry.value) for entry in jsonpath_expr.find(json) ]
  File "/Users/user/flathunter/flathunter/crawl_immobilienscout.py", line 113, in extract_entry_from_javascript
    'price': str(entry["monthlyRate"]),
KeyError: 'monthlyRate'

Any idea? Thanks!

codders commented 4 years ago

Yeah. So this is coming from this line of code:

https://github.com/flathunters/flathunter/blob/be42c985582f6b8850eb5c5a4f32ff1eab6ecf82/flathunter/crawl_immobilienscout.py#L113

Looks like some exposes on the page you're looking at don't have a 'monthlyRate' value. What I see in the elements on that page is:

@id: "65109297"
@xsi.type: "search:ApartmentRent"
address: {street: "Klingenburger Str.", houseNumber: "2d", postcode: "12555", city: "Berlin", quarter: "Köpenick (Köpenick)", …}
balcony: "true"
builtInKitchen: "true"
calculatedPrice: {value: 1690, currency: "EUR", marketingType: "BUDGET_RENT", priceIntervalType: "MONTH", rentScope: "WARM_RENT"}
companyWideCustomerId: "001.3112839"
contactDetails: {salutation: "NO_SALUTATION", firstname: "Hausverwaltung", lastname: "Rösler", phoneNumber: "030 56293380", company: "Rösler Hausverwaltung"}
energyPerformanceCertificate: "true"
floorplan: "true"
galleryAttachments: {attachment: Array(8)}
garden: "true"
listingType: "L"
livingSpace: 139
numberOfRooms: 5
price: {value: 1390, currency: "EUR", marketingType: "RENT", priceIntervalType: "MONTH"}
privateOffer: "false"
realtorCompanyName: "Rösler Hausverwaltung"
spotlightListing: "false"
streamingVideo: "false"
title: "3 Zimmer +2 Hobbyräume Maisonette Wohnung in Köpenick"
titlePicture: {@xlink.href: "https://pictures.immobilienscout24.de/listings/4db…size/60x60%3E/extent/60x60/format/webp/quality/50", @id: "65109297", @modification: "2020-10-19T18:22:09.499+02:00", @creation: "2020-10-19T18:22:09.499+02:00", @publishDate: "2020-10-19T18:22:09.499+02:00", …}
virtualTour: {url: "https://virtualtours.immobilienscout24.de/portal/tour/1131082", previewUrls: {…}}
virtualTourAvailable: "true"

So the monthlyRate property is indeed missing. It might be that ["price"]["value"] is a better source for that data. How are your coding skills? Do you want to try changing that?
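
For anyone who wants to try it, here is a rough sketch of what that change could look like. It assumes the entries are plain dicts shaped like the element above. `extract_price` is a hypothetical helper name, not a function that exists in flathunter, and this is not necessarily how the actual fix was written.

```python
def extract_price(entry):
    """Return the monthly rent as a string, or '' if no known field is present.

    Hypothetical helper for illustration; not part of flathunter.
    """
    # Prefer the flat 'monthlyRate' field when the entry still has it.
    if "monthlyRate" in entry:
        return str(entry["monthlyRate"])
    # Otherwise fall back to the nested price.value seen in the element above.
    price = entry.get("price")
    if isinstance(price, dict) and "value" in price:
        return str(price["value"])
    # Neither field is present: return an empty string instead of raising KeyError.
    return ""


# Trimmed-down copy of the element shown above (no 'monthlyRate' key).
entry = {
    "@id": "65109297",
    "price": {"value": 1390, "currency": "EUR", "marketingType": "RENT"},
    "calculatedPrice": {"value": 1690, "currency": "EUR", "rentScope": "WARM_RENT"},
}
print(extract_price(entry))  # prints 1390
```

Note that this reads `price` (1390) rather than `calculatedPrice` (1690, marked WARM_RENT), so the result may differ from the figure the search filter matched on. Which of the two is the right one to report is a judgment call.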

codders commented 4 years ago

That should be fixed now in #86. Can you pull the latest code and retry?

jossiamsee commented 4 years ago

had the same issue this afternoon, your fix did the job. thanks!

herbertgroff commented 4 years ago

Works, wunderbar, thanks!