risingwavelabs / risingwave

feat(observability): a command to check cluster information & status #12826

Open fuyufjh opened 9 months ago

fuyufjh commented 9 months ago

As requested by users, we need a command to quickly tell the status of the cluster, including:

zwang28 commented 7 months ago

- Nodes and their heartbeats, e.g. the number/IP/resource/etc. of all registered nodes
- Brief runtime info, e.g. current epoch, latency, etc.
- Brief catalog info, e.g. how many MVs or actors are running
- Misc.: RW version, uptime, etc.

The information listed above is now available through several system tables. Are we supposed to add another new command to summarize it?
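For illustration, a summary command could be little more than a few counts over those system tables. The sketch below is a minimal example, not an existing command: the table names (`rw_catalog.rw_worker_nodes`, `rw_catalog.rw_materialized_views`, `rw_catalog.rw_actors`) are assumptions to verify against your RisingWave version, and `summarize_cluster` is a hypothetical helper that accepts any DB-API cursor.

```python
# Hypothetical summary queries. The table names are assumptions and should be
# checked against the system catalog of the RisingWave version in use.
SUMMARY_QUERIES = {
    "worker_nodes": "SELECT count(*) FROM rw_catalog.rw_worker_nodes",
    "materialized_views": "SELECT count(*) FROM rw_catalog.rw_materialized_views",
    "actors": "SELECT count(*) FROM rw_catalog.rw_actors",
}


def summarize_cluster(cursor, queries=SUMMARY_QUERIES):
    """Run each count query on a DB-API cursor and return {label: count}."""
    summary = {}
    for label, query in queries.items():
        cursor.execute(query)
        summary[label] = cursor.fetchone()[0]
    return summary
```

Since RisingWave speaks the PostgreSQL wire protocol, a cursor from a PostgreSQL-compatible driver should be enough to call `summarize_cluster(cursor)` and print the result.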

zwang28 commented 7 months ago

I'm working on a Python script to gather information from the RisingWave kernel and Prometheus, and to generate warnings if any abnormal metric is found.
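The warning step of such a script could look roughly like the sketch below. This is not the actual script: the metric names and limits are made up for illustration, and collecting the values from RisingWave or Prometheus is left out.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class Threshold:
    metric: str       # metric name (hypothetical, not a real RisingWave metric)
    max_value: float  # values above this limit trigger a warning
    message: str      # human-readable explanation


# Illustrative thresholds only; real names and limits would come from the
# metrics the cluster actually exports.
THRESHOLDS = [
    Threshold("barrier_latency_seconds", 10.0, "barrier latency is high"),
    Threshold("heartbeat_age_seconds", 30.0, "a node heartbeat is overdue"),
]


def check_metrics(metrics: Dict[str, float],
                  thresholds: List[Threshold] = THRESHOLDS) -> List[str]:
    """Return one warning per metric that is missing or above its limit."""
    warnings = []
    for t in thresholds:
        value = metrics.get(t.metric)
        if value is None:
            warnings.append(f"{t.metric}: no data")
        elif value > t.max_value:
            warnings.append(f"{t.metric}: {t.message} ({value} > {t.max_value})")
    return warnings
```

Treating a missing metric as a warning is a deliberate choice here, since a node that stops reporting is itself a sign of trouble.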

fuyufjh commented 7 months ago

Related: https://github.com/risingwavelabs/risingwave/pull/13764

shanicky commented 7 months ago

After #10886, #10905, and #13764:

We can directly connect to etcd to dump the current table fragments, workers, users, election members, and various types of catalogs, similar to the kubectl get command. You can specify the output format (yaml, json) using -o. Please note that the output could be very large.

This provides a simple method to observe the data stored in etcd before the SQL backend officially goes live.

risectl debug dump table,worker,table-catalog -o yaml

- kind: worker
  item:
    id: 1
    type: WORKER_TYPE_COMPUTE_NODE
    host:
      host: 127.0.0.1
      port: 5688
    state: RUNNING
    parallelUnits:
    - workerNodeId: 1
    - id: 1
      workerNodeId: 1
    - id: 2
      workerNodeId: 1
    - id: 3
      workerNodeId: 1
    property:
      isStreaming: true
      isServing: true
    transactionalId: 0
- kind: table
  item:
    tableId: 2001
    state: CREATED
    fragments:
      1001:
        fragmentId: 1001
        fragmentTypeMask: 2
        distributionType: SINGLE
        actors:
        - actorId: 1001
          fragmentId: 1001
          upstreamActorId:
          - 1002
          - 1003
          - 1004
          - 1005
          mviewDefinition: CREATE MATERIALIZED VIEW m2 AS SELECT max(v) FROM t
......
- kind: table_catalog
  item:
    id: 1002
    name: m
    columns:
    - columnDesc:
        columnType:
          typeName: INT32
          isNullable: true
        columnId: 1
        name: v
    - columnDesc:
        columnType:
......
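
Because the dump can be very large, a short post-processing step helps to get an overview. The sketch below assumes that `-o json` emits a top-level array of objects with the same `kind` and `item` keys as the YAML above; that shape is an assumption, not something confirmed in this thread, and `summarize_dump` is a hypothetical helper.

```python
import json
from collections import Counter

# Tiny sample shaped like the YAML above (assumed JSON layout).
SAMPLE_DUMP = """
[
  {"kind": "worker",
   "item": {"id": 1, "type": "WORKER_TYPE_COMPUTE_NODE", "state": "RUNNING"}},
  {"kind": "table", "item": {"tableId": 2001, "state": "CREATED"}},
  {"kind": "table_catalog", "item": {"id": 1002, "name": "m"}}
]
"""


def summarize_dump(dump_text):
    """Count entries per kind and list (id, type, state) for each worker."""
    entries = json.loads(dump_text)
    kinds = Counter(entry["kind"] for entry in entries)
    workers = [
        (e["item"].get("id"), e["item"].get("type"), e["item"].get("state"))
        for e in entries
        if e["kind"] == "worker"
    ]
    return dict(kinds), workers


if __name__ == "__main__":
    kinds, workers = summarize_dump(SAMPLE_DUMP)
    print(kinds)
    print(workers)
```

In practice the text would come from the output of the dump command rather than from `SAMPLE_DUMP`.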

zwang28 commented 4 months ago

TODO: add table catalog after redaction is supported.
