yugabyte / yugabyte-db

YugabyteDB - the cloud native distributed SQL database for mission-critical applications.
https://www.yugabyte.com

[DocDB] The master `/api/v1/tablet-servers` endpoint might display incorrect/incomplete data #15655

Open fritshoogland-yugabyte opened 1 year ago

fritshoogland-yugabyte commented 1 year ago

Jira Link: DB-5024

Description

YugabyteDB 2.17.0.0b24 Linux Alma8.7

When multiple tablet servers are stopped, their HTTP endpoint names appear to be removed from the metadata the master keeps:

[screenshot: the master tablet-servers page showing DEAD tablet servers without their HTTP endpoint names]

The tablet server UUID is shown instead.

However, the endpoint /api/v1/tablet-servers uses a different construction than /api/v1/masters: it shows a map of objects (the tablet servers) inside an unnamed outer object (?).

The tablet servers in this map are keyed by their HTTP endpoint name. If the HTTP endpoint name is removed (as can be seen in the screenshot above), the key becomes empty. If multiple names become empty, the identical keys collide and only one entry is shown. This leads to incomplete data being shown.

Please list the tablet servers by their UUID, as is done for the masters, so /api/v1/tablet-servers can show the complete picture.
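
The key collision can be sketched in a few lines (the hostnames, UUIDs, and records below are made up for illustration, not taken from a real cluster):

```python
# Hypothetical tablet server records; two DEAD servers have lost their
# HTTP endpoint name, so their map key would be the empty string.
servers = [
    {"uuid": "aaaa1111", "hostname": "", "status": "DEAD"},
    {"uuid": "bbbb2222", "hostname": "", "status": "DEAD"},
    {"uuid": "cccc3333", "hostname": "yb-3.local:9000", "status": "ALIVE"},
]

# Keying the response by hostname: the two empty keys collide, and one
# DEAD server silently disappears from the output.
by_hostname = {s["hostname"]: s for s in servers}

# Keying by UUID (as requested, like the masters): nothing is lost.
by_uuid = {s["uuid"]: s for s in servers}

print(len(by_hostname))  # 2 -- one DEAD server is gone
print(len(by_uuid))      # 3 -- every server is represented
```

In a Python dict, as in a JSON object, a later duplicate key simply overwrites the earlier one, which is exactly how the second unnamed DEAD server vanishes.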

Data in /api/v1/tablet-servers for the situation in the screenshot above:

❯ curl -s http://192.168.66.80:7000/api/v1/tablet-servers | jq
{
  "": {
    "": {
      "time_since_hb": "647.3s",
      "time_since_hb_sec": 647.276403855,
      "status": "DEAD",
      "uptime_seconds": 0,
      "ram_used": "0 B",
      "ram_used_bytes": 0,
      "num_sst_files": 0,
      "total_sst_file_size": "0 B",
      "total_sst_file_size_bytes": 0,
      "uncompressed_sst_file_size": "0 B",
      "uncompressed_sst_file_size_bytes": 0,
      "path_metrics": [],
      "read_ops_per_sec": 0,
      "write_ops_per_sec": 0,
      "user_tablets_total": 0,
      "user_tablets_leaders": 0,
      "system_tablets_total": 4,
      "system_tablets_leaders": 0,
      "active_tablets": 0,
      "cloud": "local",
      "region": "local",
      "zone": "local5"
    },
    "yb-8.local:9000": {
      "time_since_hb": "164.8s",
      "time_since_hb_sec": 164.814628727,
      "status": "DEAD",
      "uptime_seconds": 0,
      "ram_used": "0 B",
      "ram_used_bytes": 0,
      "num_sst_files": 0,
      "total_sst_file_size": "0 B",
      "total_sst_file_size_bytes": 0,
      "uncompressed_sst_file_size": "0 B",
      "uncompressed_sst_file_size_bytes": 0,
      "path_metrics": [],
      "read_ops_per_sec": 0,
      "write_ops_per_sec": 0,
      "user_tablets_total": 0,
      "user_tablets_leaders": 0,
      "system_tablets_total": 4,
      "system_tablets_leaders": 0,
      "active_tablets": 4,
      "cloud": "local",
      "region": "local2",
      "zone": "local2"
    },
    "yb-3.local:9000": {
      "time_since_hb": "0.3s",
      "time_since_hb_sec": 0.260432171,
      "status": "ALIVE",
      "uptime_seconds": 579,
      "ram_used": "25.13 MB",
      "ram_used_bytes": 25133056,
      "num_sst_files": 0,
      "total_sst_file_size": "0 B",
      "total_sst_file_size_bytes": 0,
      "uncompressed_sst_file_size": "0 B",
      "uncompressed_sst_file_size_bytes": 0,
      "path_metrics": [
        {
          "path": "/mnt/d0",
          "space_used": 168222720,
          "total_space_size": 10724835328
        }
      ],
      "read_ops_per_sec": 0,
      "write_ops_per_sec": 0,
      "user_tablets_total": 0,
      "user_tablets_leaders": 0,
      "system_tablets_total": 4,
      "system_tablets_leaders": 3,
      "active_tablets": 4,
      "cloud": "local",
      "region": "local3",
      "zone": "local3"
    },
    "yb-1.local:9000": {
      "time_since_hb": "0.3s",
      "time_since_hb_sec": 0.270004779,
      "status": "ALIVE",
      "uptime_seconds": 636,
      "ram_used": "25.31 MB",
      "ram_used_bytes": 25313280,
      "num_sst_files": 0,
      "total_sst_file_size": "0 B",
      "total_sst_file_size_bytes": 0,
      "uncompressed_sst_file_size": "0 B",
      "uncompressed_sst_file_size_bytes": 0,
      "path_metrics": [
        {
          "path": "/mnt/d0",
          "space_used": 167034880,
          "total_space_size": 10724835328
        }
      ],
      "read_ops_per_sec": 0,
      "write_ops_per_sec": 0,
      "user_tablets_total": 0,
      "user_tablets_leaders": 0,
      "system_tablets_total": 4,
      "system_tablets_leaders": 2,
      "active_tablets": 4,
      "cloud": "local",
      "region": "local1",
      "zone": "local1"
    },
    "yb-4.local:9000": {
      "time_since_hb": "0.4s",
      "time_since_hb_sec": 0.395968008,
      "status": "ALIVE",
      "uptime_seconds": 549,
      "ram_used": "32.93 MB",
      "ram_used_bytes": 32931840,
      "num_sst_files": 0,
      "total_sst_file_size": "0 B",
      "total_sst_file_size_bytes": 0,
      "uncompressed_sst_file_size": "0 B",
      "uncompressed_sst_file_size_bytes": 0,
      "path_metrics": [
        {
          "path": "/mnt/d0",
          "space_used": 147513344,
          "total_space_size": 10724835328
        }
      ],
      "read_ops_per_sec": 0,
      "write_ops_per_sec": 0,
      "user_tablets_total": 0,
      "user_tablets_leaders": 0,
      "system_tablets_total": 4,
      "system_tablets_leaders": 2,
      "active_tablets": 4,
      "cloud": "local",
      "region": "local1",
      "zone": "local1"
    },
    "yb-9.local:9000": {
      "time_since_hb": "0.1s",
      "time_since_hb_sec": 0.074897845,
      "status": "ALIVE",
      "uptime_seconds": 336,
      "ram_used": "35.65 MB",
      "ram_used_bytes": 35651584,
      "num_sst_files": 0,
      "total_sst_file_size": "0 B",
      "total_sst_file_size_bytes": 0,
      "uncompressed_sst_file_size": "0 B",
      "uncompressed_sst_file_size_bytes": 0,
      "path_metrics": [
        {
          "path": "/mnt/d0",
          "space_used": 142802944,
          "total_space_size": 10724835328
        }
      ],
      "read_ops_per_sec": 0,
      "write_ops_per_sec": 0,
      "user_tablets_total": 0,
      "user_tablets_leaders": 0,
      "system_tablets_total": 4,
      "system_tablets_leaders": 1,
      "active_tablets": 4,
      "cloud": "local",
      "region": "local3",
      "zone": "local3"
    },
    "yb-7.local:9000": {
      "time_since_hb": "1.0s",
      "time_since_hb_sec": 0.954664577,
      "status": "ALIVE",
      "uptime_seconds": 415,
      "ram_used": "24.89 MB",
      "ram_used_bytes": 24887296,
      "num_sst_files": 0,
      "total_sst_file_size": "0 B",
      "total_sst_file_size_bytes": 0,
      "uncompressed_sst_file_size": "0 B",
      "uncompressed_sst_file_size_bytes": 0,
      "path_metrics": [
        {
          "path": "/mnt/d0",
          "space_used": 142950400,
          "total_space_size": 10724835328
        }
      ],
      "read_ops_per_sec": 0,
      "write_ops_per_sec": 0,
      "user_tablets_total": 0,
      "user_tablets_leaders": 0,
      "system_tablets_total": 4,
      "system_tablets_leaders": 2,
      "active_tablets": 4,
      "cloud": "local",
      "region": "local1",
      "zone": "local1"
    },
    "yb-6.local:9000": {
      "time_since_hb": "0.9s",
      "time_since_hb_sec": 0.949073219,
      "status": "ALIVE",
      "uptime_seconds": 459,
      "ram_used": "32.16 MB",
      "ram_used_bytes": 32161792,
      "num_sst_files": 0,
      "total_sst_file_size": "0 B",
      "total_sst_file_size_bytes": 0,
      "uncompressed_sst_file_size": "0 B",
      "uncompressed_sst_file_size_bytes": 0,
      "path_metrics": [
        {
          "path": "/mnt/d0",
          "space_used": 134942720,
          "total_space_size": 10724835328
        }
      ],
      "read_ops_per_sec": 0,
      "write_ops_per_sec": 0,
      "user_tablets_total": 0,
      "user_tablets_leaders": 0,
      "system_tablets_total": 4,
      "system_tablets_leaders": 2,
      "active_tablets": 4,
      "cloud": "local",
      "region": "local3",
      "zone": "local3"
    }
  }
}
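
A monitoring script can at least detect this situation from the JSON above by flattening the unnamed outer object and flagging empty keys. The function below is a hypothetical sketch, not part of any YugabyteDB tooling:

```python
def unnamed_tservers(api_response: dict) -> int:
    """Count tablet server entries whose map key (the HTTP endpoint name)
    is empty. Anything >= 1 means at least one server lost its name, and
    further unnamed DEAD servers may be hidden behind key collisions."""
    # The response nests the tserver map inside a single outer object
    # (keyed by an empty string in the output above), so flatten one level.
    flattened = {name: ts for outer in api_response.values()
                 for name, ts in outer.items()}
    return sum(1 for name in flattened if name == "")

# Trimmed version of the response shown above:
sample = {"": {"": {"status": "DEAD"}, "yb-8.local:9000": {"status": "DEAD"}}}
print(unnamed_tservers(sample))  # 1
```

Fed from `curl -s http://<master>:7000/api/v1/tablet-servers`, this can only reveal that one unnamed entry exists; it cannot recover how many tablet servers were collapsed into it, which is the core of this issue.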
bmatican commented 1 year ago

@fritshoogland-yugabyte can you share a bit more about how you got into this state? It's really strange that the first 2 tservers have no hostname, but the last one, which is also dead, does show the hostname.

The main difference seems to be that it still exists in some quorums (has 4 peers on it). Did it also stop showing the hostname, after it was finally kicked out of its quorums?

fritshoogland-yugabyte commented 1 year ago

It's been a while since I found this state, but I found a scenario that reproduces it: the master/tablet-servers page shows a number of DEAD tablet servers without their HTTP address names, which causes master/api/v1/tablet-servers to fail to show all the DEAD tablet servers, because they are keyed by HTTP address name and only one entry with an empty name can be shown.

  1. Create a cluster with a number of tablet servers, for example 6 tablet servers.
    ➜ yb_stats --print-tablet-servers
    yb-1.local:9000      ALIVE Placement: local.local.local1
                     HB time: 0.4s, Uptime: 170, Ram 57.90 MB
                     SST files: nr: 0, size: 0 B, uncompressed: 0 B
                     ops read: 0, write: 0
                     tablets: active: 12, user (leader/total): 0/0, system (leader/total): 4/12
                     Path: /mnt/d0, total: 10724835328, used: 191152128 (1.78%)
    yb-2.local:9000      ALIVE Placement: local.local.local2
                     HB time: 0.5s, Uptime: 146, Ram 56.62 MB
                     SST files: nr: 0, size: 0 B, uncompressed: 0 B
                     ops read: 0, write: 0
                     tablets: active: 12, user (leader/total): 0/0, system (leader/total): 4/12
                     Path: /mnt/d0, total: 10724835328, used: 201392128 (1.88%)
    yb-3.local:9000      ALIVE Placement: local.local.local3
                     HB time: 0.4s, Uptime: 122, Ram 40.83 MB
                     SST files: nr: 0, size: 0 B, uncompressed: 0 B
                     ops read: 0, write: 0
                     tablets: active: 12, user (leader/total): 0/0, system (leader/total): 4/12
                     Path: /mnt/d0, total: 10724835328, used: 197783552 (1.84%)
    yb-4.local:9000      ALIVE Placement: local.local.local4
                     HB time: 0.5s, Uptime: 97, Ram 51.38 MB
                     SST files: nr: 0, size: 0 B, uncompressed: 0 B
                     ops read: 0, write: 0
                     tablets: active: 12, user (leader/total): 0/0, system (leader/total): 4/12
                     Path: /mnt/d0, total: 10724835328, used: 170307584 (1.59%)
    yb-5.local:9000      ALIVE Placement: local.local.local5
                     HB time: 0.7s, Uptime: 73, Ram 51.38 MB
                     SST files: nr: 0, size: 0 B, uncompressed: 0 B
                     ops read: 0, write: 0
                     tablets: active: 12, user (leader/total): 0/0, system (leader/total): 4/12
                     Path: /mnt/d0, total: 10724835328, used: 169508864 (1.58%)
    yb-6.local:9000      ALIVE Placement: local.local.local6
                     HB time: 0.5s, Uptime: 47, Ram 58.72 MB
                     SST files: nr: 0, size: 0 B, uncompressed: 0 B
                     ops read: 0, write: 0
                     tablets: active: 12, user (leader/total): 0/0, system (leader/total): 4/12
                     Path: /mnt/d0, total: 10724835328, used: 158294016 (1.48%)
  2. Stop more than one tablet server. This works as expected: each stopped tablet server is still shown with its HTTP name, and after 60 seconds it moves to the state DEAD.
    ➜ yb_stats --print-tablet-servers
    yb-1.local:9000      DEAD Placement: local.local.local1
                     HB time: 86.7s, Uptime: 0, Ram 0 B
                     SST files: nr: 0, size: 0 B, uncompressed: 0 B
                     ops read: 0, write: 0
                     tablets: active: 12, user (leader/total): 0/0, system (leader/total): 0/12
    yb-2.local:9000      DEAD Placement: local.local.local2
                     HB time: 74.7s, Uptime: 0, Ram 0 B
                     SST files: nr: 0, size: 0 B, uncompressed: 0 B
                     ops read: 0, write: 0
                     tablets: active: 12, user (leader/total): 0/0, system (leader/total): 5/12
    yb-3.local:9000      DEAD Placement: local.local.local3
                     HB time: 72.7s, Uptime: 0, Ram 0 B
                     SST files: nr: 0, size: 0 B, uncompressed: 0 B
                     ops read: 0, write: 0
                     tablets: active: 12, user (leader/total): 0/0, system (leader/total): 5/12
    yb-4.local:9000      ALIVE Placement: local.local.local4
                     HB time: 0.5s, Uptime: 242, Ram 35.01 MB
                     SST files: nr: 0, size: 0 B, uncompressed: 0 B
                     ops read: 0, write: 0
                     tablets: active: 12, user (leader/total): 0/0, system (leader/total): 5/12
                     Path: /mnt/d0, total: 10724835328, used: 170307584 (1.59%)
    yb-5.local:9000      ALIVE Placement: local.local.local5
                     HB time: 0.4s, Uptime: 218, Ram 34.04 MB
                     SST files: nr: 0, size: 0 B, uncompressed: 0 B
                     ops read: 0, write: 0
                     tablets: active: 12, user (leader/total): 0/0, system (leader/total): 4/12
                     Path: /mnt/d0, total: 10724835328, used: 169508864 (1.58%)
    yb-6.local:9000      ALIVE Placement: local.local.local6
                     HB time: 0.5s, Uptime: 193, Ram 36.84 MB
                     SST files: nr: 0, size: 0 B, uncompressed: 0 B
                     ops read: 0, write: 0
                     tablets: active: 12, user (leader/total): 0/0, system (leader/total): 5/12
                     Path: /mnt/d0, total: 10724835328, used: 158089216 (1.47%)
  3. Now all that is needed is for the master leader to change:
    ➜ yb_stats --print-masters
    52cfc19881954ca0b866ea3571abe726 LEADER Placement: local.local.local
                                 Seqno: 1674563401730224 Start time: 1674563401730224
                                 RPC addresses: ( yb-1.local:7100 )
                                 HTTP addresses: ( yb-1.local:7000 )
    6e82dcf19d3a4435b9fe27b8c5a97b1a FOLLOWER Placement: local.local.local
                                 Seqno: 1674563426056547 Start time: 1674563426056547
                                 RPC addresses: ( yb-2.local:7100 )
                                 HTTP addresses: ( yb-2.local:7000 )
    22e36d9b728841a485d395dedcbaa0ec FOLLOWER Placement: local.local.local
                                 Seqno: 1674563451490385 Start time: 1674563451490385
                                 RPC addresses: ( yb-3.local:7100 )
                                 HTTP addresses: ( yb-3.local:7000 )

    Current leader is yb-1.local / 52cfc19881954ca0b866ea3571abe726

Let's first put a watch on the cluster to see what changes, using yb_stats --adhoc-nonmetrics-diff (show a diff, but not of the metrics: we don't care about those here, this is not about performance).

➜ yb_stats --adhoc-nonmetrics-diff
Begin ad-hoc in-memory snapshot created, press enter to create end snapshot for difference calculation.

Then make the master leadership move to a follower:

yb-admin -init_master_addrs localhost:7100 master_leader_stepdown 6e82dcf19d3a4435b9fe27b8c5a97b1a

Then press enter with yb_stats to see what has happened:

Time between snapshots:  124.432 seconds
= Masters:  52cfc19881954ca0b866ea3571abe726 Role: LEADER->FOLLOWER Placement: local.local.local
                                             Seq#: 1674563401730224 Start time: 1674563401730224
                                             RPC: yb-1.local:7100,
                                             HTTP: yb-1.local:7000,
= Masters:  6e82dcf19d3a4435b9fe27b8c5a97b1a Role: FOLLOWER->LEADER Placement: local.local.local
                                             Seq#: 1674563426056547 Start time: 1674563426056547
                                             RPC: yb-2.local:7100,
                                             HTTP: yb-2.local:7000,
+ Tserver:  , status: DEAD, uptime: 0 s
- Tserver:  yb-1.local:9000, status: DEAD, uptime: 0 s
- Tserver:  yb-2.local:9000, status: DEAD, uptime: 0 s
- Tserver:  yb-3.local:9000, status: DEAD, uptime: 0 s

Master yb-1 changed from LEADER to FOLLOWER, and master yb-2 changed from FOLLOWER to LEADER, as expected. For the tablet servers, three entries are gone ('-'): tablet server nodes yb-1, yb-2, and yb-3; and one unnamed tserver is "added" ('+').

We can guess what the tablet servers view now looks like:

➜ yb_stats --print-tablet-servers
                     DEAD Placement: local.local.local3
                     HB time: 837.8s, Uptime: 0, Ram 0 B
                     SST files: nr: 0, size: 0 B, uncompressed: 0 B
                     ops read: 0, write: 0
                     tablets: active: 0, user (leader/total): 0/0, system (leader/total): 0/4
yb-4.local:9000      ALIVE Placement: local.local.local4
                     HB time: 0.2s, Uptime: 779, Ram 35.84 MB
                     SST files: nr: 0, size: 0 B, uncompressed: 0 B
                     ops read: 0, write: 0
                     tablets: active: 12, user (leader/total): 0/0, system (leader/total): 5/12
                     Path: /mnt/d0, total: 10724835328, used: 161931264 (1.51%)
yb-5.local:9000      ALIVE Placement: local.local.local5
                     HB time: 0.3s, Uptime: 750, Ram 36.20 MB
                     SST files: nr: 0, size: 0 B, uncompressed: 0 B
                     ops read: 0, write: 0
                     tablets: active: 12, user (leader/total): 0/0, system (leader/total): 4/12
                     Path: /mnt/d0, total: 10724835328, used: 161136640 (1.50%)
yb-6.local:9000      ALIVE Placement: local.local.local6
                     HB time: 0.2s, Uptime: 724, Ram 38.85 MB
                     SST files: nr: 0, size: 0 B, uncompressed: 0 B
                     ops read: 0, write: 0
                     tablets: active: 12, user (leader/total): 0/0, system (leader/total): 5/12
                     Path: /mnt/d0, total: 10724835328, used: 149716992 (1.40%)

The previously unavailable/DEAD tablet servers are removed from the view, and a single entry with an empty name is added. Based on the placement, which happens to be unique for each tablet server in my (specific!) configuration, we can see it is yb-3; without that, there is no way to tell which node the unnamed entry is, because the only way to truly identify a tablet server is by its UUID.

The big issue is that if you rely on this view, you get incomplete information: the now unnamed nodes might actually still host system or user tablets. These tablet servers must all be shown to provide accurate information, and they are all shown in master/tablet-servers.

Tablet servers are used internally by their UUID. The masters are simply listed without a name in /api/v1/masters, but each listed master does contain its unique UUID. It's confusing that the tablet servers' equivalent endpoint is formatted radically differently, and does not contain the vital data of the unique UUID and sequence number.

The view master/api/v1/health-check lists dead nodes, which are actually dead TABLET SERVERS, by their UUID; but there is no view that tells what the properties of these dead tablet servers are.
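
Assuming the health-check response carries the dead nodes as a list of UUIDs (as observed in this version), a join against /api/v1/tablet-servers is impossible, because no entry in the latter carries a UUID field. A sketch with hypothetical identifiers:

```python
# Hypothetical dead-node UUIDs as reported by /api/v1/health-check.
health_check = {"dead_nodes": ["8a53a2c1", "b117c4d2"]}

# Trimmed /api/v1/tablet-servers response: keyed by HTTP name, no UUID field.
tservers = {"": {"": {"status": "DEAD"},
                 "yb-3.local:9000": {"status": "ALIVE"}}}

for uuid in health_check["dead_nodes"]:
    # Look for any tserver entry whose "uuid" field matches -- but that
    # field simply does not exist in this endpoint's output.
    matches = [name for outer in tservers.values()
               for name, ts in outer.items() if ts.get("uuid") == uuid]
    print(uuid, matches)  # always [] -- the dead UUID cannot be resolved
```

Adding the UUID to each /api/v1/tablet-servers entry (or keying the map by it) would make this join trivial.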

(master/dump-entities actually does show the server UUID, together with "addr", which is the HTTP name. That is an odd place to show it: if the details behind a UUID can be obtained elsewhere, all that is needed there is the UUID.)

Besides the UUID missing from the /api/v1/tablet-servers view, several other fields that would be very helpful for determining the state of the tablet servers are missing:

fritshoogland-yugabyte commented 1 year ago

I think a more appropriate name for this endpoint would be /api/v1/tablet-server-performance, because it combines status and performance data per tablet server name.