netdata / netdata-cloud

The public repository of Netdata Cloud. Contribute with bug reports and feature requests.
GNU General Public License v3.0

[Feat]: Add the ability for Netdata to suggest better configuration parameters (where possible) #549

Open shyamvalsan opened 2 years ago

shyamvalsan commented 2 years ago

Description

Give Netdata the ability to make recommendations or suggestions of what values certain system or application parameters should be configured to. Ideally this would be done per collector.

Of course this will only be applicable for a select list of metrics where the intelligence on how to calculate an ideal value or range of values can be encoded into Netdata.

Some recommendations can be made immediately, while others will require Netdata to monitor the system for a certain amount of time (for example, to learn load patterns or observe anomaly rates).

An example (from PostgreSQL):

Setting max_connections too low, or leaving it at the default, could cause the user to run out of connections. But setting it too high could lead to even more severe consequences, such as overloading the database or leaving insufficient resources for each connection.

Ideally you want to utilize the resources at your disposal without overloading the system. A rubric to use for an upper limit of max_connections is:

max_connections < max(num_cores, parallel_io_limit) / (session_busy_ratio * avg_parallelism)

Where:

Netdata should be able to measure all these variables and make a reasonable calculation of what the upper limit for max_connections should be.
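As a rough illustration, the rubric above can be sketched in Python. The function names are hypothetical and not part of Netdata. The sketch assumes that session_busy_ratio is the fraction of time a session spends actively executing statements (between 0 and 1) and that avg_parallelism is the average number of backend processes working on one query; these interpretations are assumptions, not something stated in this issue.

```python
import math


def max_connections_upper_limit(num_cores: int,
                                parallel_io_limit: int,
                                session_busy_ratio: float,
                                avg_parallelism: float) -> float:
    """Evaluate the right-hand side of the rubric from the issue.

    Assumption: session_busy_ratio is in (0, 1] and avg_parallelism > 0.
    """
    if not 0 < session_busy_ratio <= 1:
        raise ValueError("session_busy_ratio must be in (0, 1]")
    if avg_parallelism <= 0:
        raise ValueError("avg_parallelism must be positive")
    return max(num_cores, parallel_io_limit) / (session_busy_ratio * avg_parallelism)


def largest_allowed_max_connections(limit: float) -> int:
    """Largest integer strictly below the limit (the rubric uses '<')."""
    return math.ceil(limit) - 1


# Hypothetical inputs: 8 cores, storage handling 16 concurrent I/O requests,
# sessions busy 25% of the time, no parallel query.
limit = max_connections_upper_limit(8, 16, 0.25, 1.0)
print(limit, largest_allowed_max_connections(limit))
```

A collector could feed measured values into such a function and surface the result as a suggestion rather than changing any setting itself.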

Of course, we have not mentioned connection pooling thus far, which is another recommendation Netdata should give the user if their use case demands it. For example, if the calculated upper limit for max_connections is still too high, it is better to recommend a connection pooling solution such as PgBouncer.
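One possible reading of the pooling recommendation, sketched in Python: if the number of client connections the application actually needs exceeds the calculated upper limit, suggest a pooler instead of raising max_connections. The function name and the comparison rule are assumptions made for illustration; the issue does not specify the exact trigger.

```python
def should_recommend_pooling(peak_client_connections: int,
                             upper_limit: float) -> bool:
    """Return True when observed demand does not fit under the upper limit.

    Assumption: peak_client_connections is the highest number of concurrent
    client connections observed over the monitoring window.
    """
    # The rubric requires max_connections to stay strictly below the limit,
    # so demand at or above the limit cannot be served directly.
    return peak_client_connections >= upper_limit


# Hypothetical values: 500 clients observed, calculated limit of 64.
if should_recommend_pooling(500, 64.0):
    print("Consider a connection pooler such as PgBouncer.")
```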

Importance

nice to have

Value proposition

  1. Users get automated recommendations on how to better configure their systems and applications, which saves time and resources.
  2. Context and extra information, along with the provided rationale, help users gain expertise in the systems they monitor, which increases user value.

shyamvalsan commented 2 years ago

cc: @ktsaou @cakrit @amalkov @ralphm @sashwathn @hugovalente-pm an idea that came up during the use-case work. Might have already been discussed in the past.

This could be something that's taken up at a per metric level to start with and scaled up gradually, doesn't need to be a big giant undertaking.

sashwathn commented 2 years ago

@shyamvalsan: This is definitely a good-to-have feature, but going through all the collectors and identifying the various configuration options may be an almost impossible task. Still, this is definitely the kind of information we should share in our alert info / reference documentation, at least for the key parameters.

shyamvalsan commented 2 years ago

@sashwathn We don't need to do it for all collectors and all config parameters, just for the ones we identify while adding new collectors (e.g. Postgres). Gradually it will build up to something significant over time.

Btw, good call on including this as part of alert info.

My intent with this was to ask how we can increase the "value" users get out of Netdata. Beyond just monitoring metric values, can they also get guidance from Netdata on better configuring their systems?

hugovalente-pm commented 10 months ago

do you think something like this could/should be somehow linked to future work on the Netdata Assistant?

shyamvalsan commented 10 months ago

@hugovalente-pm yes, that's a good idea. The long-term idea for the Assistant is to be a home for insights, and recommendations could be part of this.