hotosm / oam-status

A simple status dashboard for oam-catalog
BSD 3-Clause "New" or "Revised" License

This repo/service is overkill and redundant #13

Open tombh opened 7 years ago

tombh commented 7 years ago

Please correct me if I'm wrong, but this repo serves only 2 conflicting purposes, both of which are problematically implemented.

  1. The status server here provides an undocumented /healthcheck JSON endpoint that is only used by oam-browser. Health is defined by pings to the oam-catalog. If the Catalog API is down then the browser frontend will not work anyway!

  2. The status website here queries the /analytics endpoint on the Catalog API directly, via client-side AJAX. This uses the web client's internet connection, which makes the status definition largely worthless: it is far more likely that the client's connection is to blame for a 'poor' healthcheck.

Ultimately the core function of a status page is to provide independent verification of a service's status. There already exist free services that do this, I would recommend: https://uptimerobot.com/ which also provides an API so we can report statuses from other sites. What's more, such services already provide redundancy at several levels, running on different platforms in many regions across the world.
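To illustrate why a single vantage point (such as one browser's connection) is weak evidence, here is a minimal sketch of how up/down reports from several independent probes might be combined. The function name, the region labels and the strict-majority rule are all assumptions made for illustration; this is not how uptimerobot.com or this repo decides status.

```python
from typing import Mapping


def aggregate_status(probe_results: Mapping[str, bool]) -> str:
    """Combine up/down reports from independent probes into one verdict.

    probe_results maps a hypothetical probe location (e.g. "eu-west")
    to True if that probe reached the service, False otherwise.
    """
    if not probe_results:
        # No probe reported anything, so nothing can be concluded.
        return "unknown"

    up_count = sum(1 for reachable in probe_results.values() if reachable)
    total = len(probe_results)

    # Assumed rule: a strict majority of probes must reach the service.
    # A single failing vantage point therefore cannot mark it as down.
    return "up" if up_count * 2 > total else "down"


if __name__ == "__main__":
    reports = {"eu-west": True, "us-east": True, "ap-south": False}
    print(aggregate_status(reports))
```

With one probe out of three failing, the verdict stays "up", which is the property a client-side check cannot offer: there, the only probe is the visitor's own connection.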

smit1678 commented 7 years ago

> The status server here provides an undocumented /healthcheck JSON endpoint that is only used by oam-browser. Health is defined by pings to the oam-catalog.

I may be wrong but I think health is also defined by New Relic, right?

> Ultimately the core function of a status page is to provide independent verification of a service's status. There already exist free services that do this, I would recommend: https://uptimerobot.com/ which also provides an API so we can report statuses from other sites.

Can we still have a status page from these services that sits at a custom domain?

tombh commented 7 years ago

Oh yes, you're right: pings only provide binary up/down, whereas the newRelicGetHelath() call here does indeed offer something slightly more fine-grained. That means an orange status can still be loaded within the oam-browser, so my argument is not so strong. Still, I would argue that this repo is very much overkill for what it achieves. It provides neither hosting on alternative infrastructure nor region redundancy. And the curiosity of having an HTML healthcheck that is completely separate from the JSON endpoint still needs to be addressed.

And compared to the out-of-the-box functionality of https://uptimerobot.com/, including free custom domains, it's hard to justify the technical debt here.
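For reference, the difference between a binary ping and the finer-grained status discussed above can be sketched as follows. This is only an illustration under stated assumptions: the function name, the use of an error rate as the graded metric, and the 5% threshold are hypothetical and do not reproduce the newRelicGetHelath() call in this repo.

```python
from typing import Optional


def status_colour(
    ping_ok: bool,
    error_rate: Optional[float],
    warn_threshold: float = 0.05,  # assumed threshold, purely illustrative
) -> str:
    """Map a binary ping plus an optional graded metric to a colour."""
    if not ping_ok:
        # The service is unreachable: no graded metric can rescue this.
        return "red"
    if error_rate is None:
        # No graded metric available, so fall back to the binary result.
        return "green"
    # Reachable but unhealthy by the graded metric: the "orange" case.
    return "orange" if error_rate > warn_threshold else "green"


if __name__ == "__main__":
    print(status_colour(ping_ok=True, error_rate=0.20))
```

The orange case is the one a ping alone cannot express: the service answers, so a frontend can still load, but the graded metric reports a problem.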

tombh commented 7 years ago

Here's a working status page using uptimerobot.com: https://stats.uptimerobot.com/WPByvFZ4Y. It checks both the website and the Catalog API every 5 minutes.