cstate / cstate

🔥 Open source static (serverless) status page. Uses hyperfast Go & Hugo, minimal HTML/CSS/JS, customizable, outstanding browser support (IE8+), preloaded CMS, read-only API, badges & more.
https://cstate.mnts.lt
MIT License

[Share your opinion!] New severity levels #172

Closed: soinu closed this issue 3 years ago

soinu commented 3 years ago

Is your feature request related to a problem? Please describe.

The current severity levels are quite limited.

It is not possible to create an entry that affects a service without also creating a textbox at the top, for example for minor performance issues or a general announcement about a specific system. An unresolved notice severity issue is automatically tagged as maintenance. (Why?)

Describe the solution you'd like

Restructure severity levels:

Describe alternatives you've considered

Add an additional toggle for showing the issue at the top in the textbox and a new performance-issues severity. This would add support for some more cases and is probably easier to implement.
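A minimal sketch of how the toggle idea could behave, written in Python purely for illustration (cState itself is built on Hugo templates). The `Issue` class, the `show_at_top` field, and the `banner_issues` helper are hypothetical names, not existing cState options; the severity list is the three levels named in this thread plus the proposed `performance-issues`.

```python
from dataclasses import dataclass

# Hypothetical severity names: the existing levels plus the proposed one.
SEVERITIES = ("notice", "performance-issues", "disrupted", "down")


@dataclass
class Issue:
    title: str
    severity: str
    resolved: bool = False
    show_at_top: bool = True  # hypothetical toggle, not an existing cState option

    def __post_init__(self):
        # Reject severities outside the assumed list.
        if self.severity not in SEVERITIES:
            raise ValueError(f"unknown severity: {self.severity}")


def banner_issues(issues):
    """Return the unresolved issues that should appear in the textbox at the top."""
    return [i for i in issues if not i.resolved and i.show_at_top]


issues = [
    Issue("Slow search", "performance-issues", show_at_top=False),
    Issue("API outage", "down"),
    Issue("Old incident", "disrupted", resolved=True),
]
print([i.title for i in banner_issues(issues)])
```

With the toggle off, "Slow search" stays unresolved but is left out of the top textbox, which is the case the request describes.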

mistermantas commented 3 years ago

An unresolved notice severity issue is automatically tagged as maintenance. (Why?)

Notice = maintenance or similar cases that don't affect the system now.

Show textbox at the top before and during maintenance period

This is a static site.

performance issues vs disrupted

What is the difference?


Why the complexity? I'm personally against performance issues as a severity for example.

I guess not showing the issue at the top could be an option, but why? Why would you not want to announce something?

It just seems like something that wouldn't help your customer.

soinu commented 3 years ago

This is a static site.

Updating the maintenance with resolved: true would mean maintenance is finished and the textbox will be removed (like it happens with notice right now, just wanted to explain it)

performance issues vs disrupted

What is the difference?

A performance issue means there might be some short hiccups. A disruption is an actual malfunction, something is really broken but the system is not completely down.

I guess not showing the issue at the top could be an option, but why? Why would you not want to announce something?

The announcement should happen, but not at the top and not forever while the issue is unresolved. It is not possible to hide the textbox without resolving the issue. The issue / announcement is still visible in the RSS feed and on /affected/system123. Also some performance issue text (e.g. in blue), like other status page systems have, would be shown next to the affected system.
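One way to picture the per-system label described here is to take the most severe unresolved issue that affects a system. The sketch below is an illustration in Python, not cState's implementation; the ranking in `SEVERITY_RANK`, the `"ok"` fallback, and the dictionary keys (`severity`, `resolved`, `affected`) are all assumptions made for the example.

```python
# Hypothetical ordering from least to most severe;
# "performance-issues" is the proposed level, not an existing one.
SEVERITY_RANK = {"notice": 0, "performance-issues": 1, "disrupted": 2, "down": 3}


def system_status(system, issues):
    """Return the worst unresolved severity affecting `system`, or "ok" if none."""
    active = [
        issue["severity"]
        for issue in issues
        if not issue["resolved"] and system in issue["affected"]
    ]
    if not active:
        return "ok"
    return max(active, key=lambda severity: SEVERITY_RANK[severity])


issues = [
    {"severity": "performance-issues", "resolved": False, "affected": ["system123"]},
    {"severity": "down", "resolved": True, "affected": ["system123"]},
    {"severity": "disrupted", "resolved": False, "affected": ["API"]},
]
print(system_status("system123", issues))
print(system_status("API", issues))
print(system_status("Website", issues))
```

Under these assumptions, the label next to a system is independent of whether the issue is also shown in the textbox at the top.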

mistermantas commented 3 years ago

I think most of this is too complex.

I'm keeping this issue open for a little while, as I'm open to ideas, to see whether other people support it.

soinu commented 3 years ago

Can you explain what exactly would be too complex? The implementation? The "end user" issue creation? Do you just not like too many toggles (no hard feelings)? I think the explained examples are valid use cases, I have no idea why nobody else requested this yet.

mistermantas commented 3 years ago

Cachet and, I suppose, most other big status pages don't have such granular severities because they're supposed to be all-encompassing.

https://docs.cachethq.io/docs/incident-statuses

mistermantas commented 3 years ago

Sure

the implementation?

I'd love to write less code but if there's a need then you have to do that

The "end user" issue creation? Do you just not like too many toggles (no hard feelings)?

Picture a burning server. Do you want to switch toggles or get to writing in the content box what's happening? I don't think admins want to think about the difference between disrupted and performance issues. I literally don't know the difference and neither will a customer of a SaaS app. If it's not working, it's not working. The only difference between down and disrupted is that down means it's really bad, while disrupted means it's partially down.

I think the explained examples are valid use cases, I have no idea why nobody else requested this yet.

Of course they're valid and if you need them, code them in, it's always an option that I point out. It's why I love open source

However with the project I'm not interested in making something bulkier than it needs to be for most people

soinu commented 3 years ago

Cachet has even more options which fit even more use cases. It has a status option for the issue (incident) itself and for the affected system (component).

[Screenshot: Screenshot_20210220_220140]

mistermantas commented 3 years ago

Cachet has even more options which fit even more use cases. It has a status option for the issue (incident) itself and for the affected system.

[Screenshot: Screenshot_20210220_220140]

Not bad, but if you don't strictly need something, I am skeptical.

soinu commented 3 years ago

However with the project I'm not interested in making something bulkier than it needs to be for most people

Thanks for your detailed replies! Does this mean you wouldn't even accept a PR to implement a toggle and / or an additional severity level?

If it's not working, it's not working. The only difference between down and disrupted is that down means it's really bad, while disrupted means it's partially down.

Well, that's the point. Performance issue means most or all of the system is working most of the time, but there might be hiccups.

mistermantas commented 3 years ago

However with the project I'm not interested in making something bulkier than it needs to be for most people

Thanks for your detailed replies! Does this mean you wouldn't even accept a PR to implement a toggle and / or an additional severity level?

At the moment, no, because I've not released a new version in a while.

Personally, I want something described better than performance-issues. Too long and it sounds too specific. So to answer your question, in principle, I understand the need and would accept a PR if there was a discussion/demand for this.

mistermantas commented 3 years ago

Well, that's the point. Performance issue means most or all of the system is working most of the time, but there might be hiccups.

See, that just sounds like a disruption, even if you can't call it a partial outage.

soinu commented 3 years ago

Personally, I want something described better than performance-issues. Too long and it sounds too specific.

Shopify uses Degraded and Partial Outage https://www.shopifystatus.com/

statuspage.io uses degraded performance

status.io uses Degraded Performance, Partial Service Disruption and Service Disruption

freshstatus from freshworks uses Degraded Performance, Partial Outage, Major Outage and Maintenance

mistermantas commented 3 years ago

I'm closing this issue for now. After digging into the codebase where severities are used and finding tons of if statements, I think it would be irresponsible to complicate the code even more.

As I've mentioned before, I don't think having a "partially down" kind of status is important, and I have provided workarounds (changing the language file, changing how you look at the severities themselves).

If you really need that extra severity, consider going with a different status page. I see this one as a free alternative for hobbyists and open source fans which can be separate from other infrastructure they might be using so it's a bit more of a legitimate option. Sorry, but thanks for checking out the project and taking the time to submit the idea!