datahubio / datahub-v2-pm

Project management (issues only)

Get alerts when something unexpected happens #235

Closed zelima closed 5 years ago

zelima commented 5 years ago

Originally coming from https://github.com/datahq/datahub-qa/issues/158

As a project manager, I want to be notified when something goes wrong (e.g. a service is down, a service is not responding as expected, or processing data takes too long) so that I'm aware a problem exists and can quickly take action to fix it.

Acceptance Criteria

Tasks

Analysis

Services:

Other things to check: speed and success of the push
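To make "speed and success" concrete, here is a minimal sketch of how a single check could record both. The helper name `timed_check`, the `max_seconds` threshold and the shape of the returned dict are assumptions for illustration only; the real push would be whatever callable the test module wraps.

```python
import time
from typing import Callable, Dict


def timed_check(action: Callable[[], None], max_seconds: float) -> Dict[str, object]:
    """Run `action`, recording whether it succeeded and how long it took.

    `action` is a hypothetical stand-in for e.g. pushing a test dataset.
    """
    start = time.monotonic()
    try:
        action()
        succeeded = True
    except Exception:
        # Any exception counts as a failed check rather than crashing the run.
        succeeded = False
    elapsed = time.monotonic() - start
    return {
        "succeeded": succeeded,
        "elapsed_seconds": elapsed,
        # A slow success is still treated as a problem worth alerting on.
        "within_limit": succeeded and elapsed <= max_seconds,
    }
```

A check would then be reported as failing when `within_limit` is false, which covers both "push failed" and "push took too long".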

How should we monitor

We have two options:

  1. Create a module, run its scripts on Travis daily, and save the outcome as a dataset on GitHub. Travis will take care of sending emails if the build fails.
    • We could even skip saving anything and just look at the Travis build history, but I think saving data with error messages would be useful.
  2. Create the module as a service and serve an endpoint, api.datahub.io/health/check, with GET and POST methods. One would trigger and run the checks and the other would just display history.

While the 2nd option looks better and cleaner (and may be useful in the future), it will require an additional day or two of work to set up a server, continuous deployment, etc., while the other should not take more than 5-6 hours.

So I think the better and faster way is option 1. It will run once a day and send emails if the build fails. This way we can react to a problem within 24 hours at most.
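A rough sketch of what the option 1 module could look like. Everything here is an assumption for illustration: the function names, the CSV columns, and the idea that each check is a plain callable returning `(ok, message)`. The only thing relied on from CI is that a non-zero exit status fails the build, which is what makes Travis send the email.

```python
import csv
from datetime import datetime, timezone
from typing import Callable, Dict, List, TextIO, Tuple

# A check returns (ok, message). The actual checks per service are not defined here.
Check = Callable[[], Tuple[bool, str]]

# Hypothetical column layout for the saved dataset.
FIELDS = ["timestamp", "service", "ok", "message"]


def run_checks(checks: Dict[str, Check]) -> List[dict]:
    """Run every check; an exception is recorded as a failure, not a crash."""
    results = []
    for service, check in checks.items():
        try:
            ok, message = check()
        except Exception as exc:
            ok, message = False, f"{type(exc).__name__}: {exc}"
        results.append({
            "timestamp": datetime.now(timezone.utc).isoformat(),
            "service": service,
            "ok": ok,
            "message": message,
        })
    return results


def write_report(results: List[dict], out: TextIO) -> None:
    """Write the results as CSV so the history can be kept as a dataset."""
    writer = csv.DictWriter(out, fieldnames=FIELDS)
    writer.writeheader()
    writer.writerows(results)


def exit_code(results: List[dict]) -> int:
    """Return 0 if all checks passed, 1 otherwise (non-zero fails the build)."""
    return 0 if all(r["ok"] for r in results) else 1
```

The daily job would build the dict of checks, call `run_checks`, optionally `write_report` to a file, and finish with `sys.exit(exit_code(results))`. Saving the CSV is independent of the exit status, so the "emails only, no saved data" variant is just the same script without the `write_report` call.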

What should we check

We should create the test users (if they do not already exist) with several test datasets pushed to DataHub, and give limitations.

Specstore

Auth

Filemanager

Bitstore

Resolver

Metastore

Frontend:

plans

zelima commented 5 years ago

FIXED. For now we are just getting emails, no CSV.