flathunters / flathunter

A bot to help people with their rental real-estate search. 🏠🤖
GNU Affero General Public License v3.0

AttributeError: 'NoneType' object has no attribute 'get' #193

Closed HacisKenan closed 2 years ago

HacisKenan commented 2 years ago

When trying to run flathunt.py, I get the following error:

python flathunt.py
[2022/08/02 19:53:01|config.py |INFO ]: Using config /Users/kenanhacisalihoglu/Downloads/flathunter-main/config.yaml
Traceback (most recent call last):
  File "/Users/kenanhacisalihoglu/Downloads/flathunter-main/flathunt.py", line 105, in <module>
    main()
  File "/Users/kenanhacisalihoglu/Downloads/flathunter-main/flathunt.py", line 85, in main
    if not config.get('telegram', {}).get('bot_token'):
AttributeError: 'NoneType' object has no attribute 'get'

My config file:

# Enable verbose mode (print DEBUG log messages)
# verbose: true

# Should the bot endlessly loop through the URLs?
# Between each loop it waits for <sleeping_time> seconds.
# Note that Ebay will (temporarily) block your IP if you
# poll too often - don't lower this below 600 seconds if you
# are crawling Ebay.
loop:
    active: yes
    sleeping_time: 600

# Location of the Database to store already seen offerings
# Defaults to the current directory
database_location: /path/to/database

# List the URLs containing your filter properties below.
# Currently supported services: www.immobilienscout24.de,
# www.immowelt.de, www.wg-gesucht.de, and www.ebay-kleinanzeigen.de.
# List the URLs in the following format:
# urls:
#   - https://www.immobilienscout24.de/Suche/...
#   - https://www.wg-gesucht.de/...

urls:
- https://www.wg-gesucht.de/wg-zimmer-und-1-zimmer-wohnungen-in-Munchen.90.0+1.1.0.html?offer_filter=1&city_id=90&noDeact=1&categories%5B%5D=0&categories%5B%5D=1&rent_types%5B%5D=2&sMin=10&rMax=650
- https://www.immowelt.de/liste/muenchen-schwabing-freimann/wohnungen/mieten?ami=10&d=true&pma=650&r=10&sd=DESC&sf=RELEVANCE&sp=1
- https://www.ebay-kleinanzeigen.de/s-auf-zeit-wg/muenchen/preis::650/c199l6411+auf_zeit_wg.qm_d:10%2C
- https://www.ebay-kleinanzeigen.de/s-wohnung-mieten/schwabing-freimann/preis::650/c203l16376r5+wohnung_mieten.qm_d:10%2C

# Define filters to exclude flats that don't meet your criteria.
# Supported filters include 'max_rooms', 'min_rooms', 'max_size', 'min_size',
#   'max_price', 'min_price', and 'excluded_titles'.
#
# 'excluded_titles' takes a list of regex patterns that match against
# the title of the flat. Any matching titles will be excluded.
# More to Python regex here: https://docs.python.org/3/library/re.html
#
# Example:
# filters:
#   excluded_titles:
#     - "wg"
#     - "zwischenmiete"
#   min_price: 700
#   max_price: 1000
#   min_size: 50
#   max_size: 80
#   max_price_per_square: 1000
filters:
   excluded_titles:
     - "befristet"
     - "zwischenmiete"

# There are often city districts in the address which
# Google Maps does not like. Use this blacklist to remove
# districts from the search.
blacklist:

# If an expose includes an address, the bot is capable of
# displaying the distance and time to travel (duration) to
# some configured other addresses, for specific kinds of
# travel.
#  
# Available kinds of travel ('gm_id') can be found in the
# Google Maps API documentation, but basically there are:
#   - "bicycling"
#   - "transit" (public transport)
#   - "driving"
#   - "walking"
# 
# The example configuration below includes a place for
# "John", located at the main train station of munich.
# Two kinds of travel (bicycle and transit) are requested,
# each with a different label. Furthermore a place for
# "Jane" is included, located at the given destination and
# with the same kinds of travel.
durations:
    - name: xy
      destination: weg 35, München
      modes: 
          - gm_id: transit
            title: "Öff."
          - gm_id: bicycling
            title: "Rad"
    - name: Uni
      destination: straße 34, München
      modes: 
          - gm_id: transit
            title: "Öff."
          - gm_id: bicycling
            title: "Rad"

# Multiline message (yes, the | is supposed to be there), 
# to format the message received from the Telegram bot. 
# 
# Available placeholders:
#   - {title}: The title of the expose
#   - {rooms}: Number of rooms
#   - {price}: Price for the flat
#   - {durations}: Durations calculated by GMaps, see above
#   - {url}: URL to the expose
message: |
    {title}
    Zimmer: {rooms}
    Größe: {size}
    Preis: {price}
    Ort: {address}

    {url}

# Calculating durations requires access to the Google Maps API. 
# Below you can configure the URL to access the API, with placeholders.
# The URL should most probably just be kept like that.
# To use the Google Maps API, an API key is required. You can obtain one
# without costs from the Google App Console (just google for it).
# Additionally, to enable the API calls in the code, set the 'enable' key to True
google_maps_api:
    key: Axxxxxxxxxxxxxxxxxxxxxxxx
    url: https://maps.googleapis.com/maps/api/distancematrix/json?origins={origin}&destinations={dest}&mode={mode}&sensor=true&key={key}&arrival_time={arrival}
    enable: True

# If you are planning to scrape immoscout24.de, the bot will need 
# to circumvent the site's captcha protection by using a captcha
# solving service. Register at either imagetyperz or 2captcha
# (the former is preferred), deposit some funds, uncomment the
# corresponding lines below and replace your API key/token.
# Use driver_arguments to provide options for Chrome WebDriver.
# captcha:
#       imagetyperz:
#             token: alskdjaskldjfklj
#       2captcha:
#             api_key: alskdjaskldjfklj
#       driver_arguments:
#         - "--headless"

# You can select whether to be notified by telegram or via a mattermost
# webhook. For all notifiers selected here a configuration must be provided
# below.
# notifiers:
#   - telegram
notifiers:
    - telegram

# Sending messages using Telegram requires a Telegram Bot configured. 
# Telegram.org offers a good documentation about how to create a bot.
# Once you read it, will make sense. Still: bot_token should hold the
# access token of your bot and receiver_ids should list the client ids
# of receivers. Note that those receivers are required to already have
# started a conversation with your bot. 
#
telegram:
bot_token: 5xxxxxxxxxxxxxxxxxxxx
receiver_ids:
- 5xxxxxxxxxxx
- 5xxxxxxxxxxx

# Sending messages via mattermost requires a webhook url provided by a
# mattermost server. You can find a description how to set up a webhook with
# the official mattermost documentation:
# https://docs.mattermost.com/developer/webhooks-incoming.html
# mattermost:
#   webhook_url: https://mattermost.example.com/signup_user_complete/?id=abcdef12356

# If you are running the web interface, you can configure Login with Telegram support
# Follow the instructions here to register your domain with the Telegram bot:
# https://core.telegram.org/widgets/login
#
# website:
#    bot_name: bot_name_xxx
#    domain: flathunter.example.com
#    session_key: SomeSecretValue
#    listen:
#      host: 127.0.0.1
#      port: 8080

# If you are deploying to google cloud,
# uncomment this and set it to your project id. More info in the readme.
# google_cloud_project_id: 

# For websites like idealista.it, there are anti-crawler measures that can be
# circumvented using proxies.
use_proxy_list: True

Running pytest reports no errors except in test_web_interface.py and test_crawl_immobilienscout.py (because of the missing captcha setup and web link, I think). Where could the problem be?

alexanderroidl commented 2 years ago

The telegram section of your config is not indented, and YAML relies on indentation to determine nesting:

❌ No indentation: ❌

telegram:
bot_token: 5xxxxxxxxxxxxxxxxxxxx
receiver_ids:
- 5xxxxxxxxxxx
- 5xxxxxxxxxxx

✅ Correctly indented: ✅

telegram:
  bot_token: 5xxxxxxxxxxxxxxxxxxxx
  receiver_ids:
    - 5xxxxxxxxxxx
    - 5xxxxxxxxxxx
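
To see why the missing indentation leads to this exact AttributeError, here is a minimal sketch in plain Python. The two dictionaries are written out by hand as an assumption about what a YAML parser produces for the two variants above (an empty `telegram:` key becomes `None`, and the unindented keys land at the top level); they are not loaded from the real file. `bot_token_tolerant` is a hypothetical defensive variant for illustration, not code from the project.

```python
# Assumed parse result of the unindented block: "telegram" has no nested
# value (YAML null -> None) and the other keys end up at the top level.
broken_config = {
    "telegram": None,
    "bot_token": "5xxxxxxxxxxxxxxxxxxxx",
    "receiver_ids": ["5xxxxxxxxxxx", "5xxxxxxxxxxx"],
}

# Assumed parse result of the correctly indented block.
fixed_config = {
    "telegram": {
        "bot_token": "5xxxxxxxxxxxxxxxxxxxx",
        "receiver_ids": ["5xxxxxxxxxxx", "5xxxxxxxxxxx"],
    },
}


def bot_token_as_in_traceback(config):
    # Same expression as flathunt.py line 85 in the traceback. The {} default
    # only applies when the key is absent, not when its value is None.
    return config.get("telegram", {}).get("bot_token")


def bot_token_tolerant(config):
    # Hypothetical variant: treat a None section like an empty one.
    return (config.get("telegram") or {}).get("bot_token")
```

Calling `bot_token_as_in_traceback(broken_config)` raises the AttributeError from the report, while the same call on `fixed_config` returns the token. The tolerant variant returns `None` for the broken config instead of crashing, so the script could report a missing token rather than fail with a traceback.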

HacisKenan commented 2 years ago

Thanks for answering. That resolved the issue, but a new one occurred:

(flathunter-main) kenanhacisalihoglu@MacBook-Pro-von-Kenan flathunter-main % python flathunt.py
[2022/08/05 21:18:23|config.py |INFO ]: Using config /Users/kenanhacisalihoglu/Downloads/flathunter-main/config.yaml
[2022/08/05 21:18:24|crawl_wggesucht.py |WARNING ]: No size found - skipping
[2022/08/05 21:18:25|idmaintainer.py |ERROR ]: Error unable to open database file:
Traceback (most recent call last):
  File "/Users/kenanhacisalihoglu/Downloads/flathunter-main/flathunt.py", line 105, in <module>
    main()
  File "/Users/kenanhacisalihoglu/Downloads/flathunter-main/flathunt.py", line 101, in main
    launch_flat_hunt(config, heartbeat)
  File "/Users/kenanhacisalihoglu/Downloads/flathunter-main/flathunt.py", line 31, in launch_flat_hunt
    hunter.hunt_flats()
  File "/Users/kenanhacisalihoglu/Downloads/flathunter-main/flathunter/hunter.py", line 54, in hunt_flats
    for expose in processor_chain.process(self.crawl_for_exposes(max_pages)):
  File "/Users/kenanhacisalihoglu/Downloads/flathunter-main/flathunter/idmaintainer.py", line 25, in process_expose
    self.id_watch.save_expose(expose)
  File "/Users/kenanhacisalihoglu/Downloads/flathunter-main/flathunter/idmaintainer.py", line 85, in save_expose
    cur = self.get_connection().cursor()
  File "/Users/kenanhacisalihoglu/Downloads/flathunter-main/flathunter/idmaintainer.py", line 65, in get_connection
    raise error
  File "/Users/kenanhacisalihoglu/Downloads/flathunter-main/flathunter/idmaintainer.py", line 53, in get_connection
    self.threadlocal.connection = lite.connect(self.db_name)
sqlite3.OperationalError: unable to open database file
(flathunter-main) kenanhacisalihoglu@MacBook-Pro-von-Kenan flathunter-main %

alexanderroidl commented 2 years ago

You have uncommented line 16 of the config file but have not yet replaced the placeholder /path/to/database. This is why Flathunter cannot find, and therefore cannot open, your database.

Comment the line out again or change the value to an existing directory to make it work! 😊
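
For anyone hitting the same error: SQLite creates a missing database file on connect, but it does not create a missing directory, which is why a placeholder path fails. Below is a minimal sketch of a check that turns this into a clearer message. The helper `open_database` and the file name `processed_ids.db` are assumptions made for illustration, not necessarily what Flathunter uses internally.

```python
import os
import sqlite3


def open_database(database_location, filename="processed_ids.db"):
    """Open (or create) an SQLite file inside database_location.

    Hypothetical helper; the default filename is an assumption.
    """
    # sqlite3.connect() would fail with "unable to open database file" if the
    # directory is missing, so check first and report the actual cause.
    if not os.path.isdir(database_location):
        raise FileNotFoundError(
            f"database_location {database_location!r} is not an existing directory"
        )
    return sqlite3.connect(os.path.join(database_location, filename))
```

With the placeholder `/path/to/database` this raises a FileNotFoundError that names the offending setting, as long as that directory does not exist on the machine. With an existing directory it returns a normal `sqlite3.Connection`.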