flathunters / flathunter

A bot to help people with their rental real-estate search. 🏠🤖
GNU Affero General Public License v3.0

data-imgsrc KeyError when crawling ebay-kleinanzeigen #111

Closed marvin-richter closed 3 years ago

marvin-richter commented 3 years ago

Hi! I'm getting an error when crawling ebay-kleinanzeigen, after several days without problems. Did they change something? Here is the error message:

Traceback (most recent call last):
  File "flathunt.py", line 89, in <module>
    main()
  File "flathunt.py", line 86, in main
    launch_flat_hunt(config)
  File "flathunt.py", line 46, in launch_flat_hunt
    hunter.hunt_flats()
  File "/home/m/flathunter/flathunter/hunter.py", line 42, in hunt_flats
    for expose in processor_chain.process(self.crawl_for_exposes(max_pages)):
  File "/home/m/flathunter/flathunter/hunter.py", line 22, in crawl_for_exposes
    for searcher in self.config.searchers()
  File "/home/m/flathunter/flathunter/hunter.py", line 23, in <listcomp>
    for url in self.config.get('urls', list())])
  File "/home/m/flathunter/flathunter/abstract_crawler.py", line 136, in crawl
    return self.get_results(url, max_pages)
  File "/home/m/flathunter/flathunter/abstract_crawler.py", line 127, in get_results
    entries = self.extract_data(soup)
  File "/home/m/flathunter/flathunter/crawl_ebaykleinanzeigen.py", line 72, in extract_data
    image = image_element["data-imgsrc"]
  File "/home/m/.pyenv/versions/venv_flathunter/lib/python3.6/site-packages/bs4/element.py", line 992, in __getitem__
    return self.attrs[key]
KeyError: 'data-imgsrc'

This is my search URL:

https://www.ebay-kleinanzeigen.de/s-wohnung-mieten/mitte/anzeige:angebote/preis::800/c203l3518r5+wohnung_mieten.zimmer_d:2

Thanks for looking into it! Marvin

sbaghdadi commented 3 years ago

I can confirm this issue. It seems that ebay-kleinanzeigen changed their HTML markup on Sunday, so crawling now fails.

sbaghdadi commented 3 years ago

As a workaround, you can comment out lines 71-74 and line 93 in flathunter/crawl_ebaykleinanzeigen.py. I don't need the image to be crawled, because Telegram already picks up the image when showing the link.
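Instead of commenting out the image lines, the lookup could be made tolerant of a missing attribute. The following is only a minimal sketch, not the project's code: `extract_image_url` is a hypothetical helper, and plain dicts stand in for parsed elements so that it runs without BeautifulSoup. It relies on the element offering a dict-style `.get()` (as BeautifulSoup tags do), which returns `None` where `element["data-imgsrc"]` would raise `KeyError`.

```python
from typing import Optional


def extract_image_url(image_element, attr: str = "data-imgsrc") -> Optional[str]:
    """Return the image URL, or None if the element or the attribute is missing.

    Hypothetical helper for illustration; not part of flathunter.
    """
    if image_element is None:
        # find() yields None when no element matches the selector.
        return None
    # .get() returns None for a missing key instead of raising KeyError.
    return image_element.get(attr)


# Stand-ins for parsed elements (assumed shapes, not real page data).
element_with_image = {"data-imgsrc": "https://example.invalid/flat.jpg"}
element_without_image = {"class": "some-other-class"}

print(extract_image_url(element_with_image))
print(extract_image_url(element_without_image))
print(extract_image_url(None))
```

With a helper like this, an expose without an image would simply carry `None` for the image instead of aborting the whole crawl.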

sbaghdadi commented 3 years ago

I took a look at the HTML markup. Could somebody confirm that changing line 66 of flathunter/crawl_ebaykleinanzeigen.py to

image element = expose_ids[idx].find("div", {"class": "galleryimage-element"})

works?

marvin-richter commented 3 years ago

> I took a look in the html-markup. Could somebody confirm that changing line 66 of flathunt/crawl_ebaykleinanzeigen.py to
>
> image element = expose ids[idx].find("div", {"class": "galleryimage-element"})
>
> works?

Yes, it works. Thank you!

There are underscores missing in your answer, so it should be:

image_element = expose_ids[idx].find("div", {"class": "galleryimage-element"})

sbaghdadi commented 3 years ago

Yes, you're right, I was missing the underscore. I'll open a merge request! Thanks for testing.
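To illustrate what the changed selector matches, here is a dependency-free sketch that uses the standard library's `html.parser` rather than BeautifulSoup, which the project itself uses. The sample markup is made up, and it assumes that the `galleryimage-element` div is the element carrying `data-imgsrc`. The real page structure may differ.

```python
from html.parser import HTMLParser


class GalleryImageFinder(HTMLParser):
    """Collect data-imgsrc values from <div class="galleryimage-element"> tags."""

    def __init__(self):
        super().__init__()
        self.image_urls = []

    def handle_starttag(self, tag, attrs):
        attributes = dict(attrs)
        classes = (attributes.get("class") or "").split()
        if tag == "div" and "galleryimage-element" in classes:
            # None is stored when the attribute is absent; nothing is raised.
            self.image_urls.append(attributes.get("data-imgsrc"))


# Made-up markup for illustration only (assumption, not the real page).
SAMPLE_HTML = """
<article>
  <div class="imagebox galleryimage-element"
       data-imgsrc="https://example.invalid/a.jpg"></div>
</article>
<article>
  <div class="galleryimage-element"></div>
</article>
"""

finder = GalleryImageFinder()
finder.feed(SAMPLE_HTML)
print(finder.image_urls)
```

The second entry in the sample has no `data-imgsrc`, so the list contains `None` for it. This is the case in which a direct `image_element["data-imgsrc"]` lookup fails with the `KeyError` from the traceback.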