netdata / blog

Netdata's Blog!
https://blog.netdata.cloud
MIT License

Hidden Costs of Monitoring Blog #293

Closed sashwathn closed 1 year ago

sashwathn commented 1 year ago

Please review and approve.

netlify[bot] commented 1 year ago

Deploy Preview for netdata-blog ready!

Name Link
Latest commit cacfe22aa5cbddd46f7718c399d37a4d78db7286
Latest deploy log https://app.netlify.com/sites/netdata-blog/deploys/64a7e0ae8c2f900008eb8c8e
Deploy Preview https://deploy-preview-293--netdata-blog.netlify.app

andrewm4894 commented 1 year ago

The post is missing a truncate marker.

andrewm4894 commented 1 year ago

The image is stale and not matching

andrewm4894 commented 1 year ago

@ralphm I think it would be good to get your review on this too. It all seems pretty reasonable and well conveyed to me, but I wanted to ping you as one of our resident experts on Grafana and Prometheus.

andrewm4894 commented 1 year ago

I wonder if there could be some way to summarize the key points and put them as bullets somewhere near the top.

There are lots of good points in here, but they are a little buried, and readers who just want to do a quick scan will miss them.

So I'm thinking it could be worth adding a summary or TL;DR section near the top that lists the main points expanded below.

sashwathn commented 1 year ago

The image is stale and not matching

Fixed it

sashwathn commented 1 year ago

@andrewm4894: Yes, it would be good to have a summary, but I am not sure where it should go or how to present it. Any suggestions?

hugovalente-pm commented 1 year ago

maybe a paragraph right before the "Prometheus and Grafana..." section, the summary could be more like hinting to the user what will be covered on the blogpost

In this blogpost we will analyze two traditional monitoring domains, Open Source observability and Commercial Centralized observability solutions, focusing on the direct and indirect impacts of implementing these solutions. In summary:

  • IT teams face challenges with conventional monitoring tools due to their complexity, time-consuming setup procedures and steep learning curve.
  • Traditional monitoring systems often pose a balancing act between data quality and quantity vs system and cost overheads.
  • Many traditional monitoring systems may result in unforeseen expenses due to data transfer egress costs associated with their operations.
  • There is also often an issue with data retention, as keeping data for a longer period can result in elevated costs.
  • An effective monitoring tool should ideally take a simple approach, offer customizable features, and have the capability to scale as per requirements.
  • Real-time insights with minimal configuration and minimal system impact should be a key feature of an efficient monitoring system.
  • Detailed monitoring through granular, high-resolution metrics can improve the quality of insights and so reduce the time to troubleshoot.
  • Tools with enhanced usability decrease the need for training or hiring specialists; people with varying levels of expertise should be able to readily understand and use them.
  • The adoption of an optimal tool can lead to significant cost savings, increased transparency, and improved system reliability.
  • There is a need to adopt monitoring tools that are simple, customizable, and scalable to address these challenges and hidden costs.

p.s.: this was quickly put up with the help of ChatGPT

sashwathn commented 1 year ago

This is great, let me add this.

sashwathn commented 1 year ago

@andrewm4894 @shyamvalsan : I have made some changes to the summary.

andrewm4894 commented 1 year ago

Wondering: would we go as far as making a matrix of pros and cons, or of features or considerations, when comparing all the solutions?

E.g. something like below. This is just a dummy example, but the idea is to show a few different dimensions of things to consider.

Maybe with some sort of rating or rough scale to measure each cell.

Consideration | Prometheus & Grafana | Centralized Commercial Offerings | Netdata
flexibility | ⭐⭐⭐⭐⭐ | ⭐⭐⭐⭐⭐ | ⭐⭐⭐
powerful features | ⭐⭐⭐⭐⭐ | ⭐⭐⭐⭐⭐ | ⭐⭐⭐
scale | ⭐⭐⭐⭐⭐ | ⭐⭐⭐⭐ | ⭐⭐⭐⭐
cost | ⭐⭐⭐ | ⭐ | ⭐⭐⭐⭐⭐
maintenance overhead | ⭐⭐ | ⭐⭐⭐ | ⭐⭐⭐⭐⭐
configuration overhead | ⭐ | ⭐⭐ | ⭐⭐⭐⭐⭐
??? | ⭐⭐⭐ | ⭐ | ⭐⭐⭐⭐⭐

Might be a bit much, but just throwing the idea out there.

sashwathn commented 1 year ago

@andrewm4894: We are planning to create all of these as part of the competitive study, with some targeted posts, and we should include such a comparison there. We have this one for the Netdata Agent from the past and should extend it: https://docs.google.com/spreadsheets/d/1p6BrAQQj--tV_Ov2NPdX86X_6f78gkMrjKbneH588iA/edit#gid=1901557118