dat-ecosystem-archive / datBase

Open data sharing powered by Dat [ DEPRECATED - More info on active projects and modules at https://dat-ecosystem.org/ ]
http://datbase.org

add devops monitoring #299

Closed juliangruber closed 7 years ago

juliangruber commented 8 years ago

I propose using https://apex.sh/ping/. It's built by TJ, has great features and a very focused interface, and will cost us $9-30 a month, which isn't much at all. It also supports free public status pages, for example http://status.apex.sh/.

The interface looks like this:

It has email and Slack integrations for alerts and will send out weekly alerts.

juliangruber commented 8 years ago

[image: screen shot 2016-09-29 at 17 23 47]

okdistribute commented 8 years ago

sounds good!

okdistribute commented 8 years ago

Whoever spends money on this kind of thing can get reimbursed via the grant, either one time for a yearly subscription or every month. It would be a 'supplies & materials' reimbursement. We have about $3,000 of this part of the budget left.

juliangruber commented 8 years ago

Another advantage of ping-style monitoring is that the stat it produces is very important for SEO, and it also gives us a good measure of the overall health of the system.

juliangruber commented 8 years ago

I set up Apex Ping for us: http://status.ping.apex.sh/174bde64-a407-4b2d-81e9-4fd8b04a3e0a

juliangruber commented 8 years ago

We can also add more services for checks, like the signalhubs! cc @mafintosh

okdistribute commented 8 years ago

It isn't properly reporting failure with a 504 timeout. It says 100% uptime even though dat.land has crashed with 504 a few times in the past week.
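For reference, here is a rough sketch of what a ping-style check would have to do to count a 504 as downtime. It is not how Apex Ping works internally; the function names (`probe`, `is_up`, `uptime_percent`) and the rule "2xx/3xx is up, everything else is down" are assumptions made for illustration.

```python
import urllib.error
import urllib.request
from typing import Optional, Sequence


def probe(url: str, timeout: float = 10.0) -> Optional[int]:
    """Return the HTTP status code for url, or None if no response arrived."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as response:
            return response.status
    except urllib.error.HTTPError as err:
        # urlopen raises for 4xx/5xx; the status code (e.g. 504) is on the error.
        return err.code
    except OSError:
        # Connection failures and timeouts (URLError is an OSError subclass).
        return None


def is_up(status: Optional[int]) -> bool:
    """Assumed rule: only 2xx and 3xx responses count as up."""
    return status is not None and 200 <= status < 400


def uptime_percent(statuses: Sequence[Optional[int]]) -> float:
    """Share of checks that counted as up, as a percentage."""
    if not statuses:
        raise ValueError("need at least one check result")
    up = sum(1 for status in statuses if is_up(status))
    return 100.0 * up / len(statuses)
```

With this rule, `uptime_percent([200, 200, 504, None])` gives `50.0`, so a week with a few 504s could not show up as 100% uptime.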

juliangruber commented 8 years ago

@karissa it has reported that, see

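[image: screen shot 2016-10-11 at 17 09 41] https://cloud.githubusercontent.com/assets/10247/19276011/892530b0-8fd5-11e6-91cf-3ce9943100cf.png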

that's why i disabled the check for now, until we expect uptime.

Does that make sense? Should I enable it again?

okdistribute commented 8 years ago

ohh okay, no that's cool if it's disabled for now
