yugabyte / yugabyte-db

YugabyteDB - the cloud native distributed SQL database for mission-critical applications.
https://www.yugabyte.com

[DocDB] Simplify Flags and provide SQL-interface to report them. #23929

Open pdvmoto opened 1 month ago

pdvmoto commented 1 month ago

Jira Link: DB-12830

Description

To simplify YB deployments, I suggest that YB reduce the number of “settable” parameters, provide more information about the remaining parameters, and add some “management” around them (e.g. prevent and warn about illegal combinations). Also provide a list of non-default settings in a view or logfile after startup.

The current lists of parameters (as documented on 14 Sept 2024) for master and tserver can be found at:

https://docs.yugabyte.com/preview/reference/configuration/yb-master/

https://docs.yugabyte.com/preview/reference/configuration/all-flags-yb-tserver/

Note: I tend to use a yugabyted --config file to cover the flags I think are relevant (such as enabling ASH, beta features, etc.).

The number of parameters that can be seen at the /varz?raw endpoints is large, about 1500, and I think most of them need not be modified in average deployments. Moreover, tweaking them can cause erratic behaviour and confusion when bugs, anomalies, or errors are reported to support.

I would request several things in this area (my list grew to 5+2 items, numbered..):

  1. Remove as many parameters as possible from the public list, and from the reporting at endpoints. Optionally, disable “setting” of less-relevant parameters and just keep them at their defaults (and choose whether or not to report the defaults of the un-settable ones). The doc pages for Master and TServer list roughly 200 settings, while /varz?raw lists about 1500.

And for the remaining parameters, in 1 long web-page, list or view:

  2. Clarify which parameters apply to Master, TServer, or both.
  3. State which parameters must be identical on Master and TServer.
  4. Create a way to document (or enforce!) dependencies, e.g. when param X must be GT/LT/EQ param Y.

Additionally, as suggested by Dorian on Slack: group some of the parameters in “namespace”, e.g. wal.interval, or even master.flagname and tserver.flagname to distinguish.

I would appreciate (documenting of) properties like:

  • APPLIES to Master/Server/Both
  • Must be IDENTICAL on all nodes. (y/n/recommended)
  • DEPENDS… like when param X must be greater/lower than param Y
  5. Reporting from SQL: I would request a view (ybsettings?) to query the parameters and see their values, probably per local node or per (local) component, and to see both the default and the set values (e.g. was it modified from the default?). If possible, include in this view the properties Applies / Identical / Depends. Access to this data from the SQL prompt (rather than from an endpoint) would, in my opinion, enhance manageability, similar to how pg_settings and pg_file_settings can be queried.
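To make the requested properties concrete, here is a minimal sketch of what one row of such a view could carry and how a dependency check might work. Everything in it is hypothetical: the `FlagSpec` record, the `param_x`/`param_y` names, and the GT/LT/EQ encoding are illustrations of the request, not anything that exists in YugabyteDB.

```python
import operator
from dataclasses import dataclass, field
from typing import Dict, List, Tuple

# Hypothetical comparison codes, mirroring the GT/LT/EQ wording of the request.
OPS = {"GT": operator.gt, "LT": operator.lt, "EQ": operator.eq}


@dataclass
class FlagSpec:
    """Hypothetical metadata for one flag (not a YugabyteDB structure)."""
    name: str
    default: int
    applies_to: str                      # "master", "tserver", or "both"
    identical: str = "n"                 # "y", "n", or "recommended"
    depends: List[Tuple[str, str]] = field(default_factory=list)  # (op, other flag)


def settings_view(specs: List[FlagSpec], current: Dict[str, int]) -> List[dict]:
    """Build rows resembling the requested view: value, default, modified, properties."""
    rows = []
    for spec in specs:
        value = current.get(spec.name, spec.default)
        rows.append({
            "name": spec.name,
            "value": value,
            "default": spec.default,
            "modified": value != spec.default,
            "applies_to": spec.applies_to,
            "identical": spec.identical,
        })
    return rows


def check_dependencies(specs: List[FlagSpec], current: Dict[str, int]) -> List[str]:
    """Return one message per violated dependency; an empty list means all hold."""
    by_name = {spec.name: spec for spec in specs}
    problems = []
    for spec in specs:
        value = current.get(spec.name, spec.default)
        for op, other in spec.depends:
            other_value = current.get(other, by_name[other].default)
            if not OPS[op](value, other_value):
                problems.append(
                    f"{spec.name}={value} must be {op} {other}={other_value}"
                )
    return problems


if __name__ == "__main__":
    specs = [
        FlagSpec("param_x", default=100, applies_to="both", identical="y",
                 depends=[("GT", "param_y")]),
        FlagSpec("param_y", default=50, applies_to="tserver"),
    ]
    overrides = {"param_y": 200}  # pretend this came from a flag file
    for row in settings_view(specs, overrides):
        print(row)
    print(check_dependencies(specs, overrides))
```

The point of the sketch is only that "modified from default" and "depends on" are cheap to derive once the metadata exists per flag.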

As an aside, possibly a separate request: I find the use of --ysql_pg_conf_csv with a CSV list rather inelegant.

  6. Can ysql_pg_conf_csv be done differently? Possibly via a “namespace” postgres.= or via a separate config file?
  7. The flags follower_unavailable_considered_failed_sec and log_min_sec_to_retain have the same name on master and TServer, but seem to have different values on master (7200 sec) and tserver (900 sec). Even if set in the flag file, the master chooses 7200? Is there any more info on this? Possibly separate the settings by using different names for master and tserver?

Feel free to let me know (here or on Slack) what you think.

ddorian commented 1 month ago

PostgreSQL has SHOW ALL; and the pg_settings view, as well as the current_setting() function.
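For the PostgreSQL-layer parameters, pg_settings already exposes a `source` and a `boot_val` column, so non-default values can be listed from SQL. The sketch below assumes a Python DB-API cursor (for example from psycopg2, connected to YSQL); the helper name is made up for illustration. Note that `source <> 'default'` is a coarse filter, and that this covers only PostgreSQL parameters, not the master or tserver flags.

```python
# Standard pg_settings columns; the filter is a coarse "not default" heuristic.
NON_DEFAULT_SQL = """
SELECT name, setting, boot_val, source
FROM pg_settings
WHERE source <> 'default'
ORDER BY name
"""


def non_default_pg_settings(cursor):
    """Run the query on any DB-API cursor and return one dict per row."""
    cursor.execute(NON_DEFAULT_SQL)
    columns = [col[0] for col in cursor.description]
    return [dict(zip(columns, row)) for row in cursor.fetchall()]
```

With psycopg2 this would be called as `non_default_pg_settings(conn.cursor())` on an open connection.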

group some of the parameters in “namespace”, e.g. wal.interval, or even master.flagname and tserver.flagname to distinguish.

Maybe adding a tags array for all configs to be able to filter them by component / process while still keeping the same config names.
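A minimal sketch of the tags idea, assuming each flag keeps its current name and simply carries a list of tags. The flag names and tag values here are invented for illustration.

```python
from typing import Dict, List

# Hypothetical flag-to-tags mapping; names and tags are made up.
FLAG_TAGS: Dict[str, List[str]] = {
    "example_wal_interval": ["wal", "tserver"],
    "example_shared_timeout": ["rpc", "master", "tserver"],
    "example_catalog_setting": ["catalog", "master"],
}


def flags_with_tag(flag_tags: Dict[str, List[str]], tag: str) -> List[str]:
    """Return the names of all flags carrying the given tag, sorted."""
    return sorted(name for name, tags in flag_tags.items() if tag in tags)


if __name__ == "__main__":
    print(flags_with_tag(FLAG_TAGS, "master"))
```

Compared with a `master.flagname` prefix, tags let one flag belong to several components without renaming it.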

pdvmoto commented 1 month ago

Yes, thanks for reminding me. But pg_settings and SHOW ALL cover only some of my questions, not all. Notably: the list at the /varz?raw endpoint is much longer, and it is still mostly unclear to which component a setting belongs (master, tserver, both, or just postgres). And it doesn't specify which settings are non-default or "modified". I realize I may be asking a lot, but.. "we can ask".

hari90 commented 1 month ago

Ya we have a lot of flags 😀 The docs pages yb-master and yb-tserver contain information about the most common flags (200). And the pages all-flags-yb-master and all-flags-yb-tserver contain (almost) ALL settable flags (1000ish). There are a few hundred other hidden and test flags that we do not show in the docs, since no one should use these. Also included in the yb build you can find master.xml and tserver.xml with the same info as the all-flags doc pages. These also tell you which flags belong to each process. A lot of flags belong to both since yb-master also hosts one tablet.

/varz?raw, as the name suggests, gives you the raw view of everything. This exists purely for backwards compatibility with older scripts. If you use /varz you will see a more user-friendly version with separate sections for flags with overrides, flags with default values, and even AutoFlag states. You can also use /api/v1/varz to get a JSON version.

If you are using a flags config file then you definitely need to have separate ones for master and tserver. Also you should set the flag on all nodes. Only a certain subset of flags like ip address and ports will differ, and yugabyted should take care of these for you.

The flag description should tell you about allowed values and dependencies with other flags. If you feel any flag is missing them please file a GH issue and we will address them. We also have validation functions for flags that are run before the flag is set. If you want to test a new flag value before setting it you can use the validate_flag_value option in yb-ts-cli since 2.23.

The request to view the flag configuration from SQL is interesting. Since we have node level flags like ip addresses and node uuid the output will depend on the node you connect to which you might not have control over. Please feel free to file a GH issue if you feel this will still be helpful.

ysql_pg_conf_csv does indeed get messy and is very easy to get wrong. One alternative option is to update the postgresql.conf file directly. This is used as the base template to dynamically generate ysql_pg.conf, which is passed to pg.
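Much of the messiness comes from CSV quoting: an entry whose value contains a comma has to be wrapped in double quotes. A small sketch of generating the flag value with Python's csv module instead of by hand follows. The helper name is hypothetical, and it is an assumption that the server's CSV parsing agrees with Python's default dialect in every corner case.

```python
import csv
import io
from typing import Dict


def build_pg_conf_csv(settings: Dict[str, str]) -> str:
    """Join name=value entries into one CSV record, quoting only where needed."""
    entries = [f"{name}={value}" for name, value in settings.items()]
    buffer = io.StringIO()
    # lineterminator="" keeps the result on a single line with no trailing newline.
    csv.writer(buffer, lineterminator="").writerow(entries)
    return buffer.getvalue()


if __name__ == "__main__":
    value = build_pg_conf_csv({
        "log_connections": "on",
        "pgaudit.log": "'all, -misc'",  # contains a comma, so the entry gets quoted
    })
    print(f"--ysql_pg_conf_csv={value}")
```

The result still has to survive shell or flag-file quoting on its way to the process, which this sketch does not handle.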

Hope this helps answer most of your questions.

hari90 commented 1 month ago

Here is a sample screenshot of /varz: [4 images]

pdvmoto commented 1 month ago

Thank you for a very good reply. I had not found the "all flags" pages yet. Good to have those. Although I would recommend not tweaking settings unless Absolutely Necessary. Defaults should be good enough, and customizing will generally just create confusion, not least by causing untested behaviour.

I like /varz?raw because I can curl it and search more easily, e.g. to find anomalies or divergent settings. Having a JSON version is also good. Thanks.
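The "curl and search for divergent settings" workflow could be scripted roughly as follows. This is a sketch under assumptions: it presumes /varz?raw returns one `--name=value` line per flag (other lines are ignored, and multi-line values would be truncated), and it uses the default web UI ports 7000 (master) and 9000 (tserver). The function names are made up.

```python
import re
from typing import Dict, Optional
from urllib.request import urlopen

# Assumed line shape in the raw output: --flag_name=value
FLAG_LINE = re.compile(r"^--([A-Za-z0-9_]+)=(.*)$")


def fetch_raw_varz(host: str, port: int) -> str:
    """Download the raw flag listing from one process (7000 master, 9000 tserver)."""
    with urlopen(f"http://{host}:{port}/varz?raw", timeout=5) as response:
        return response.read().decode("utf-8", errors="replace")


def parse_raw_varz(text: str) -> Dict[str, str]:
    """Turn --name=value lines into a dict, skipping anything else."""
    flags = {}
    for line in text.splitlines():
        match = FLAG_LINE.match(line.strip())
        if match:
            flags[match.group(1)] = match.group(2)
    return flags


def divergent_flags(
    flags_by_node: Dict[str, Dict[str, str]]
) -> Dict[str, Dict[str, Optional[str]]]:
    """Return flags whose value differs between nodes (None means flag absent)."""
    names = set().union(*(flags.keys() for flags in flags_by_node.values()))
    result = {}
    for name in sorted(names):
        values = {node: flags.get(name) for node, flags in flags_by_node.items()}
        if len(set(values.values())) > 1:
            result[name] = values
    return result
```

Usage would be along the lines of `divergent_flags({h: parse_raw_varz(fetch_raw_varz(h, 9000)) for h in hosts})`. Node-level flags such as IP addresses will always show up as divergent and need to be filtered out.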

I distribute my flag-files to all nodes on startup or on request, and my conf-files do Not contain node-specific settings such as IP:port. Those are kept by yugabyted (in yugabyted.conf, I think...).

Viewing settings via SQL would, I think, be in line with the use of an RDBMS: everything is SQL (12 rules, Date/Codd..), and it would also allow for easier monitoring (at least for a DBA used to SQL). I concur that such views may be difficult to implement due to the distributed nature of YB, but a good example is the use of yb_local_tablets, and the ASH and pg_stat views, which currently contain local-only data. @FranckPachot has already built a concept example of how to "view" data from separate nodes using GV (Global Views).

pdvmoto commented 1 month ago

On the use of postgresql.conf and ysql_pg.conf: is that an official, supported way to alter the postgres flags? I would probably recommend doing it in exactly That Way. It removes the funny CSV list, and it makes it clear that those settings apply to the PG layer. E.g. I would append my own settings (from a file I maintain..) to the bottom of postgresql.conf, which would then become part of ysql_pg.conf.