flathunters / flathunter

A bot to help people with their rental real-estate search. 🏠🤖
GNU Affero General Public License v3.0

gcloud deployment issue #30

Closed jacobsletter closed 4 years ago

jacobsletter commented 4 years ago

Hello,

I have problems with the final step

$ gcloud app deploy cron.yaml

I get the error message:

ERROR: (gcloud.app.deploy) An error occurred while parsing file: [/Users/myname/Desktop/flathunter-main/cron.yaml] Unexpected attribute 'loop' for object of type CronInfoExternal. in "/Users/myname/Desktop/flathunter-main/cron.yaml", line 9, column 5

What could be the reason?

Many thanks in advance!

codders commented 4 years ago

I would have to see your cron.yaml, but it sounds like you might have accidentally copied your config.yaml over your cron.yaml. Can you compare it to the version on Github?
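For the comparison, here is a minimal sketch using Python's standard difflib module. The two strings stand in for your local cron.yaml and the version on GitHub; their contents are made-up placeholders, not the real files, and diff_texts is just an illustrative helper name.

```python
import difflib


def diff_texts(local_text, upstream_text):
    """Return a unified diff between two versions of a file as a list of lines."""
    return list(
        difflib.unified_diff(
            upstream_text.splitlines(),
            local_text.splitlines(),
            fromfile="cron.yaml (GitHub)",
            tofile="cron.yaml (local)",
            lineterm="",
        )
    )


# Placeholder contents for illustration only; read the real files instead.
upstream = "cron:\n- description: placeholder job\n"
local = "loop:\n  placeholder: value\n"

for line in diff_texts(local, upstream):
    print(line)
```

An empty result means the two files are identical; lines starting with + exist only in the local copy, which is where a stray 'loop' section would show up.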

jacobsletter commented 4 years ago

Yeah, I accidentally copied config.yaml.dist to cron! I don't quite get the step

Before running the project for the first time, copy config.yaml.dist to config.yaml.

Does it mean I just create a config.yaml and paste the contents from config.yaml.dist into this file? Could you maybe explain it in a little more detail?

codders commented 4 years ago

Cool. Happy that's working for you.

Yes - you need to copy config.yaml.dist to config.yaml, and then make changes to suit your setup. So for example, you'll need to put some entries in the 'urls' section of the config file, and fill in the details of your Telegram bot. If you want to use the login and registration functions of the website, you need to fill in the 'website' section of the config file too. Does that make sense?

jacobsletter commented 4 years ago

I don't know how to create/copy config.yaml - how do I do this? Do I just create a new file? Do I have to type something in the Terminal? Because I don't see an existing file. I have my URLs and Telegram details, I just don't know about config.yaml. Thanks for your help!

codders commented 4 years ago

You just need to open config.yaml.dist in a text editor, and then 'Save As...' config.yaml in the same folder. You can use the text editor to make the changes you need.
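If you would rather do the copy from Python than from a text editor, a small sketch could look like this. The helper name copy_config_template is made up for illustration, and it assumes config.yaml.dist sits in the folder you pass in.

```python
import shutil
from pathlib import Path


def copy_config_template(folder="."):
    """Copy config.yaml.dist to config.yaml unless config.yaml already exists."""
    source = Path(folder) / "config.yaml.dist"
    target = Path(folder) / "config.yaml"
    if target.exists():
        return False  # keep an existing config untouched
    shutil.copyfile(source, target)
    return True
```

Calling copy_config_template() from the project folder creates config.yaml once and refuses to overwrite it afterwards, so edits you have already made are not lost.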

jacobsletter commented 4 years ago

Thank you! Does it matter where exactly I put the google_cloud_project_id?

codders commented 4 years ago

No - it just needs to be in a line on its own somewhere:

google_cloud_project_id: myproject
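To double-check the placement, a rough sketch like the following can help. It assumes the key should start an unindented line (an indented key would be nested under another section in YAML), and has_top_level_key is a hypothetical helper, not part of flathunter. The sample text is a placeholder.

```python
import re


def has_top_level_key(config_text, key):
    """Return True if `key:` starts an unindented, uncommented line."""
    pattern = r"^" + re.escape(key) + r"\s*:"
    return re.search(pattern, config_text, flags=re.MULTILINE) is not None


# Placeholder config text for illustration only.
sample = "urls:\n  - https://example.com/search\ngoogle_cloud_project_id: myproject\n"
print(has_top_level_key(sample, "google_cloud_project_id"))
```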

jacobsletter commented 4 years ago

Thank you! Now I get that error after trying to deploy:

One or more errors occurred: MaxRetrialsException: last_result=(None, (<class 'httplib2.python3.httplib2.ServerNotFoundError'>, ServerNotFoundError('Unable to find the server at storage.googleapis.com'), <traceback object at 0x1114c1640>)), last_retrial=3, time_passed_ms=113126,time_to_wait=0 OperationalError: unable to open database file

What might be causing this? Might it be connected to this in config.yaml?

# Location of the Database to store already seen offerings
# Defaults to the current directory
#database_location: /path/to/database
database_location: /path/to/database

I changed /path/to/database to the folder where the config.yaml is located.

codders commented 4 years ago

If you're deploying to the cloud, you won't need the 'database_location' setting - that's just for when you're running locally. It says it can't find the database server - is Google Cloud Firestore enabled in the Google Admin console for your project?

jacobsletter commented 4 years ago

Yes, when I go to the Datastore, it says 'You’re using Cloud Firestore in Native mode'.

jacobsletter commented 4 years ago

Do I need credentials?

codders commented 4 years ago

Nope. They're automatically there. When I search Google for that error message, I see other people on other projects with the same issue. It seems like it might be a problem with the libraries or with Google's servers. But it's hard for me to debug from here.

https://github.com/GoogleCloudPlatform/gsutil/issues/498

I suggest you try again later - maybe the Google servers will be responding better. Also, make sure the project is configured to deploy to Europe: https://cloud.google.com/compute/docs/regions-zones/changing-default-zone-region https://cloud.google.com/sdk/gcloud/reference/config/set
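Since the message says 'Unable to find the server at storage.googleapis.com', it may also be worth checking whether your machine can resolve that hostname at all. A minimal sketch using only the standard library; can_resolve is an illustrative helper name, and the default port 443 is an assumption (HTTPS):

```python
import socket


def can_resolve(hostname, port=443):
    """Return True if the local resolver can look up `hostname`."""
    try:
        socket.getaddrinfo(hostname, port)
    except socket.gaierror:
        return False
    return True
```

Running can_resolve("storage.googleapis.com") on the machine you deploy from and getting False would point at a local DNS or network problem rather than at the project configuration.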

jacobsletter commented 4 years ago

Thanks! No default zone or region was set, but after setting one it still didn't work. I then did an OS update, started from scratch and changed my DNS. Surprise: no error while deploying! How can I make sure that it worked? I haven't received a Telegram message so far, but as it's Friday evening, I don't expect new listings. The status of the cron job on the Google console says it failed, so I'm a bit pessimistic.

EDIT When I checked the Immoscout link, I saw that new listings are available. The cron job status is successful.

codders commented 4 years ago

Hi again,

I guess you're the first person besides me to try and run the webserver, so maybe the instructions are not so good. One thing is to check the logs in the Google Logs console to see if there are any errors there. Another thing is to look into the database - you should see records created there for exposes that the system is finding.

If you see exposes but you don't get messages, there is something wrong with the Telegram setup. If you see no exposes, there is something wrong with the crawling setup. Let me know which is the case.

The other thing you can try to do is run the system locally (python flathunter.py) with the same config to see if it works without Google Cloud - that might show up if there are other configuration problems.

Hope that helps!

jacobsletter commented 4 years ago

Hi Arthur,

By database you mean processed_ids.db, right? No, there weren't any listings. After I ran python flathunt.py, they got created. I also see the exposes, executions and processed in Google > Firestore > Data. python flathunt.py worked flawlessly - I got notified by my Telegram bot. So outside of Google Cloud it seems to work.

Those are the two errors I have:

Traceback (most recent call last):
  File "/env/lib/python3.7/site-packages/flask/app.py", line 2447, in wsgi_app
    response = self.full_dispatch_request()
  File "/env/lib/python3.7/site-packages/flask/app.py", line 1952, in full_dispatch_request
    rv = self.handle_user_exception(e)
  File "/env/lib/python3.7/site-packages/flask/app.py", line 1821, in handle_user_exception
    reraise(exc_type, exc_value, tb)
  File "/env/lib/python3.7/site-packages/flask/_compat.py", line 39, in reraise
    raise value
  File "/env/lib/python3.7/site-packages/flask/app.py", line 1950, in full_dispatch_request
    rv = self.dispatch_request()
  File "/env/lib/python3.7/site-packages/flask/app.py", line 1936, in dispatch_request
    return self.view_functions[rule.endpoint](**req.view_args)
  File "/srv/flathunter/web/views.py", line 122, in hunt
    hunter.hunt_flats()
  File "/srv/flathunter/web_hunter.py", line 30, in hunt_flats
    for expose in processor_chain.process(self.crawl_for_exposes(max_pages=max_pages)):
  File "/srv/flathunter/idmaintainer.py", line 25, in process_expose
    self.id_watch.save_expose(expose)
  File "/srv/flathunter/googlecloud_idmaintainer.py", line 40, in save_expose
    self.database.collection(u'exposes').document(str(expose[u'id'])).set(record)
  File "/env/lib/python3.7/site-packages/google/cloud/firestore_v1/document.py", line 234, in set
    write_results = batch.commit()
  File "/env/lib/python3.7/site-packages/google/cloud/firestore_v1/batch.py", line 147, in commit
    metadata=self._client._rpc_metadata,
  File "/env/lib/python3.7/site-packages/google/cloud/firestore_v1/gapic/firestore_client.py", line 1033, in commit
    request, retry=retry, timeout=timeout, metadata=metadata
  File "/env/lib/python3.7/site-packages/google/api_core/gapic_v1/method.py", line 143, in __call__
    return wrapped_func(*args, **kwargs)
  File "/env/lib/python3.7/site-packages/google/api_core/retry.py", line 286, in retry_wrapped_func
    on_error=on_error,
  File "/env/lib/python3.7/site-packages/google/api_core/retry.py", line 184, in retry_target
    return target()
  File "/env/lib/python3.7/site-packages/google/api_core/timeout.py", line 214, in func_with_timeout
    return func(*args, **kwargs)
  File "/env/lib/python3.7/site-packages/google/api_core/grpc_helpers.py", line 59, in error_remapped_callable
    six.raise_from(exceptions.from_grpc_error(exc), exc)
  File "<string>", line 3, in raise_from
google.api_core.exceptions.InternalServerError: 500 An internal error occurred.

and

Traceback (most recent call last):
  File "/env/lib/python3.7/site-packages/google/api_core/grpc_helpers.py", line 57, in error_remapped_callable
    return callable_(*args, **kwargs)
  File "/env/lib/python3.7/site-packages/grpc/_channel.py", line 826, in __call__
    return _end_unary_response_blocking(state, call, False, None)
  File "/env/lib/python3.7/site-packages/grpc/_channel.py", line 729, in _end_unary_response_blocking
    raise _InactiveRpcError(state)
grpc._channel._InactiveRpcError: <_InactiveRpcError of RPC that terminated with:

They seemed to have happened right after/while deploying.

Thanks for your help! I'm a (python) newbie so I guess that's why I'm having so many issues. Let me know if you need more information!

Edit: I did some research and that user was apparently able to solve that issue by setting an environment. https://github.com/googleapis/python-firestore/issues/794 In which flathunter file would I need to do that?

codders commented 4 years ago

You could set os.environ['GRPC_DNS_RESOLVER'] = 'native' in main.py and see what happens. You will need to import os at the top of the file, and it would make sense only to set this just above the line id_watch = GoogleCloudIdMaintainer(). Let me know if that works!
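Roughly, the change could look like the sketch below. The wrapper function use_native_grpc_dns is only there for illustration; the important part is that the variable is set before the Firestore client is created.

```python
import os


def use_native_grpc_dns():
    """Ask gRPC to use the operating system's DNS resolver."""
    os.environ["GRPC_DNS_RESOLVER"] = "native"


# Call this before the Firestore client is created, e.g. just above
# the line `id_watch = GoogleCloudIdMaintainer()` in main.py.
use_native_grpc_dns()
```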

jacobsletter commented 4 years ago

Unfortunately it doesn't. I see this error now in my log:

gunicorn.errors.HaltServer: <HaltServer 'Worker failed to boot.' 3> at reap_workers (/env/lib/python3.7/site-packages/gunicorn/arbiter.py:525) at handle_chld (/env/lib/python3.7/site-packages/gunicorn/arbiter.py:242) at kill_worker (/env/lib/python3.7/site-packages/gunicorn/arbiter.py:628) at kill_workers (/env/lib/python3.7/site-packages/gunicorn/arbiter.py:626) at stop (/env/lib/python3.7/site-packages/gunicorn/arbiter.py:390) at halt (/env/lib/python3.7/site-packages/gunicorn/arbiter.py:342) at run (/env/lib/python3.7/site-packages/gunicorn/arbiter.py:229) at run (/env/lib/python3.7/site-packages/gunicorn/app/base.py:72) at run (/env/lib/python3.7/site-packages/gunicorn/app/base.py:228) at run (/env/lib/python3.7/site-packages/gunicorn/app/wsgiapp.py:58) at (/env/bin/gunicorn:10)>

Edit: I got rid of the issue above by installing gunicorn.

jacobsletter commented 4 years ago

I finally managed to deploy it!

I had some other error messages while doing it.

error: argument --config/-c: can't open '/config/config.yaml': [Errno 2] No such file or directory: '/config/config.yaml' was an error I had while deploying. I solved it by changing the path in the Dockerfile from /config/config.yaml to config.yaml.
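In case it helps someone else, here is a small sketch for checking which of the two config locations actually exists in the environment where the app runs. The helper first_existing is hypothetical and not part of flathunter; the two candidate paths are the ones from the error message and my fix.

```python
from pathlib import Path


def first_existing(candidates):
    """Return the first path in `candidates` that exists as a file, else None."""
    for candidate in candidates:
        path = Path(candidate)
        if path.is_file():
            return path
    return None


config_path = first_existing(["/config/config.yaml", "config.yaml"])
print(config_path if config_path else "no config.yaml found")
```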

While I'm an absolute beginner in Python, I can happily help with improving any documentation. This makes flathunting a lot easier, I very much appreciate your help!

codders commented 4 years ago

Huh. That's interesting. I didn't think about deploying to Google Cloud using Docker. I was deploying directly to Google App Engine. It looks like the Dockerfile there was added by @jannickfahlbusch, and I guess he is doing a volume mount of /config to the current directory (e.g. --mount source=.,target=/config). We should add some docs for that.

It would be super if you wanted to document how you are using the Dockerfile in a Google Cloud deployment.

But I'm happy you got everything deployed. Thanks for the feedback, and happy flat hunting!