flathunters / flathunter

A bot to help people with their rental real-estate search. 🏠🤖
GNU Affero General Public License v3.0

Cloud run deployment doesn't start crawling #506

Open infctr opened 10 months ago

infctr commented 10 months ago

Hey! First of all, many thanks for keeping this project up to date and alive!

I'm having trouble running a Google Cloud Run job on the latest main. The job starts with the following config:

Settings from config: {"captcha_enabled": false, "captcha_driver_arguments": 
["--no-sandbox", "--headless", "--disable-gpu", "--remote-debugging-port=9222", 
"--disable-dev-shm-usage", "--window-size=1024,768"], "captcha_solver": "NoneType", 
"imagetyperz_token": null, "twocaptcha_key": null, "mattermost_webhook_url": null, 
"notifiers": ["telegram"], "slack_webhook_url": "", "telegram_receiver_ids": [****], 
"telegram_bot_token": "580xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxzwU", 
"target_urls": ["https://kleinanzeigen.de/s-wohnung-mieten/****"], "use_proxy": false}

and then immediately exits with Container called exit(0).

I have added FLATHUNTER_VERBOSE_LOG=1 to the environment variables, but there are no additional log messages. What am I missing in my setup?

codders commented 10 months ago

Hi @infctr,

No worries - happy to keep it ticking along :) I don't know exactly how your Docker image is configured, but the line after configure_logging (which prints Settings from config:) is init_searchers, which initialises the crawlers. My guess is that initialising the Immobilienscout crawler triggers Chrome-related code (it downloads undetected-chromedriver and tries to connect to the browser), and that this causes a crash.
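If the guess about a failure during crawler initialisation is right, one way to make it visible is to wrap each startup step so that any exception is logged with its traceback. The following is only a minimal sketch: run_step and broken_init are hypothetical names made up for illustration, and broken_init merely stands in for whatever init_searchers does; none of this is flathunter's actual code.

```python
import logging
import traceback

logging.basicConfig(level=logging.INFO)
logger = logging.getLogger("startup")


def run_step(name, step):
    """Run one startup step; log the traceback and return False if it raises."""
    logger.info("starting %s", name)
    try:
        step()
    except Exception:
        # Log the full traceback so the failing step shows up in the job logs.
        logger.error("%s failed:\n%s", name, traceback.format_exc())
        return False
    logger.info("finished %s", name)
    return True


def broken_init():
    # Hypothetical stand-in for an initialisation step that fails.
    raise RuntimeError("could not connect to browser")


ok = run_step("init_searchers (stand-in)", broken_init)
```

With something like this in place, a step that dies silently should at least leave a "failed" entry in the logs, which narrows down where to look next.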

In your config, you have "captcha_enabled" set to false, but you supply "captcha_driver_arguments" anyway. If you're not crawling Immoscout, maybe drop the "captcha_driver_arguments" entirely. And if you comment out the Immobilienscout initialisation (https://github.com/flathunters/flathunter/blob/d3a2002c9684aade9d4e8e5d6bec599d75957315/flathunter/config.py#L121), you might find that it just starts normally. That would be a good hint for further debugging.
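The two checks suggested above can also be run against the printed settings before redeploying. This is a rough sketch under stated assumptions: find_config_hints is a hypothetical helper, not part of flathunter, and the chrome_hosts default assumes that only immobilienscout24.de URLs need the Chrome-based crawler.

```python
from urllib.parse import urlparse


def find_config_hints(settings, chrome_hosts=("immobilienscout24.de",)):
    """Return human-readable hints about settings that look inconsistent.

    chrome_hosts is an assumption: hosts whose crawler needs Chrome.
    """
    hints = []
    # Driver arguments are pointless if captcha solving is switched off.
    if not settings.get("captcha_enabled") and settings.get("captcha_driver_arguments"):
        hints.append("captcha_enabled is false but captcha_driver_arguments is set")
    hosts = [urlparse(url).hostname or "" for url in settings.get("target_urls", [])]
    needs_chrome = any(
        host == chrome_host or host.endswith("." + chrome_host)
        for host in hosts
        for chrome_host in chrome_hosts
    )
    if not needs_chrome:
        hints.append("no target URL needs a Chrome-based crawler")
    return hints


# Reduced version of the settings from the report; the URL path is a placeholder.
settings = {
    "captcha_enabled": False,
    "captcha_driver_arguments": ["--no-sandbox", "--headless"],
    "target_urls": ["https://kleinanzeigen.de/s-wohnung-mieten/example"],
}
for hint in find_config_hints(settings):
    print(hint)
```

For a config like the one in the report, both hints are returned, which matches the suggestion to drop the driver arguments and skip the Immobilienscout initialisation while debugging.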