apache / pulsar

Apache Pulsar - distributed pub-sub messaging system
https://pulsar.apache.org/
Apache License 2.0

PIP-123: Introduce Pulsar metadata CLI tool #13346

Open merlimat opened 2 years ago

merlimat commented 2 years ago

Motivation

For a very long time, we have included a CLI command to start the ZooKeeper shell utility: pulsar zookeeper-shell, which is essentially a repackaging of the ZooKeeper tool zkCli.sh.

This is useful in some cases to either verify the content of metadata or to perform cleanup and modification tasks for which there is not an available option through the Pulsar REST APIs.

While it's very useful, there are some drawbacks with the zookeeper-shell as it is today:

 1. It is a ZooKeeper-only tool. Since we are adding more metadata backends, we should have a tool that works across all the implementations and presents a single consistent interface.

 2. The ZooKeeper shell is designed to be interactive, and it is poorly suited to non-interactive, scriptable operations.

 3. The ZooKeeper shell is clunky to use and requires the user to retype paths many times. The commands are neither intuitive nor well documented, and it is not possible to update a z-node with multi-line content.

 4. We cannot easily add functionality or improvements to the ZooKeeper shell, since it belongs to a different project and the tool has been stagnating for many years.

 5. When the z-node content is binary (Protobuf) or compressed, there is no easy way to inspect it from the ZooKeeper shell. A dedicated tool could additionally format and colorize JSON content to make it easier to read.

 6. The paths used for metadata resources often use encodings that make them difficult to construct by hand in the shell tool.

Part of what is described here already exists in the pulsar-managed-ledger-admin CLI tool. However, that tool is a Python script that requires additional dependencies that are not typically installed, it only works with ZooKeeper, and it only targets the managed ledger metadata.

Goal

Introduce a new Java CLI tool to access, inspect and modify metadata that solves all the issues described above.

We would leave the zookeeper-shell command for now. In the future, once the new tool is proven, we can consider removing the zookeeper-shell command.

Proposed changes

Add a new command:

bin/pulsar metadata

with several subcommands:

Get

Examples:

# General path get
$ pulsar metadata get /my-path

# Topic metadata
$ pulsar metadata get topic my-tenant/my-namespace/my-topic
{
  # Managed ledger metadata
}

# Namespace get
$ pulsar metadata get namespace my-tenant/my-namespace
{
  # Namespace metadata
}

# Ledger get
$ pulsar metadata get ledger 12345
{
  # BK ledger metadata
}
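
To illustrate why a `get topic` shortcut helps, here is a minimal Java sketch of how a short topic name might be resolved to the underlying metadata key. The class and method names are hypothetical, and the path layout (`/managed-ledgers/<tenant>/<namespace>/persistent/<url-encoded-topic>`) is an assumption based on the layout used in ZooKeeper today; the proposal itself does not specify it.

```java
import java.net.URLEncoder;
import java.nio.charset.StandardCharsets;

// Hypothetical helper, not part of the proposal.
final class TopicPathSketch {
    // Assumed root under which managed-ledger metadata is stored.
    static final String MANAGED_LEDGER_ROOT = "/managed-ledgers";

    /** Maps "tenant/namespace/topic" to the key a generic `get /path` would need. */
    static String managedLedgerPath(String shortTopicName) {
        // Limit of 3 keeps any further '/' inside the topic's local name.
        String[] parts = shortTopicName.split("/", 3);
        if (parts.length != 3 || parts[0].isEmpty() || parts[1].isEmpty() || parts[2].isEmpty()) {
            throw new IllegalArgumentException("expected tenant/namespace/topic, got: " + shortTopicName);
        }
        // Assumption: the local name is URL-encoded so that it forms a single path segment.
        String encodedTopic = URLEncoder.encode(parts[2], StandardCharsets.UTF_8);
        return MANAGED_LEDGER_ROOT + "/" + parts[0] + "/" + parts[1] + "/persistent/" + encodedTopic;
    }

    public static void main(String[] args) {
        System.out.println(managedLedgerPath("my-tenant/my-namespace/my-topic"));
        System.out.println(managedLedgerPath("my-tenant/my-namespace/orders/eu"));
    }
}
```

With this kind of mapping inside the tool, the user types only the topic name and never has to construct the encoded path by hand.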

Delete

Examples:

# General path delete
$ pulsar metadata delete /my-path

# Topic metadata
$ pulsar metadata delete topic my-tenant/my-namespace/my-topic

Scan

Examples:

$ pulsar metadata scan /my-path
/my-path
/my-path/1
/my-path/2
/my-path/3
/my-path/3/1

$ pulsar metadata scan --values /my-path
/my-path
{value}

/my-path/1
{value}

/my-path/2
{value}
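
The scan output above is consistent with a depth-first walk that prints each key before its children. A minimal sketch of that traversal follows, assuming the backend can list the direct children of a key; the `ScanSketch` class and the in-memory stand-in for a metadata store are hypothetical and only serve to make the example runnable.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;
import java.util.function.Function;

// Hypothetical helper, not part of the proposal.
final class ScanSketch {
    /** Depth-first, parent-before-children listing of every key under root. */
    static List<String> scan(Function<String, List<String>> listChildren, String root) {
        List<String> out = new ArrayList<>();
        visit(listChildren, root, out);
        return out;
    }

    private static void visit(Function<String, List<String>> listChildren, String path, List<String> out) {
        out.add(path);
        for (String child : listChildren.apply(path)) {
            visit(listChildren, path.endsWith("/") ? path + child : path + "/" + child, out);
        }
    }

    /** In-memory stand-in for a metadata backend, keyed by full path. */
    static Function<String, List<String>> childrenOf(Map<String, String> entries) {
        TreeMap<String, String> sorted = new TreeMap<>(entries);
        return path -> {
            String prefix = path.endsWith("/") ? path : path + "/";
            List<String> names = new ArrayList<>();
            for (String key : sorted.keySet()) {
                // A direct child extends the prefix by exactly one segment.
                boolean directChild = key.startsWith(prefix)
                        && key.length() > prefix.length()
                        && key.indexOf('/', prefix.length()) < 0;
                if (directChild) {
                    names.add(key.substring(prefix.length()));
                }
            }
            return names;
        };
    }

    public static void main(String[] args) {
        Map<String, String> entries = Map.of(
                "/my-path", "{value}",
                "/my-path/1", "{value}",
                "/my-path/2", "{value}",
                "/my-path/3", "{value}",
                "/my-path/3/1", "{value}");
        for (String path : scan(childrenOf(entries), "/my-path")) {
            System.out.println(path);
        }
    }
}
```

A `--values` mode could reuse the same walk and print each key's value after its path.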

Shell

$ pulsar metadata shell
> get topic my-tenant/my-namespace/my-topic
{
  # Managed ledger metadata
}

> delete topic my-tenant/my-namespace/my-topic

> cd /my-path
> ls
1
2
3
> delete 1 # Delete keys with relative paths
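
The relative `delete 1` in the session above implies that the shell keeps a current path and resolves arguments against it. One way this could work is sketched below. The `ShellPathSketch` class is hypothetical, and the handling of `.` and `..` is an assumption borrowed from file-system shells rather than something the proposal specifies.

```java
import java.util.ArrayDeque;
import java.util.Deque;

// Hypothetical helper, not part of the proposal.
final class ShellPathSketch {
    /** Resolves a shell argument against the current path set by `cd`. */
    static String resolve(String currentPath, String arg) {
        // Absolute arguments ignore the current path.
        String joined = arg.startsWith("/") ? arg : currentPath + "/" + arg;
        Deque<String> segments = new ArrayDeque<>();
        for (String segment : joined.split("/")) {
            if (segment.isEmpty() || segment.equals(".")) {
                continue; // skip duplicate slashes and "."
            }
            if (segment.equals("..")) {
                if (!segments.isEmpty()) {
                    segments.removeLast(); // go up one level, never above the root
                }
                continue;
            }
            segments.addLast(segment);
        }
        return "/" + String.join("/", segments);
    }

    public static void main(String[] args) {
        System.out.println(resolve("/my-path", "1"));      // key targeted by `delete 1`
        System.out.println(resolve("/my-path", "/other")); // absolute argument
    }
}
```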

mattisonchao commented 2 years ago

I think it will be very useful.

mattisonchao commented 2 years ago

@merlimat

After this PIP is passed, can I help?

github-actions[bot] commented 2 years ago

The issue had no activity for 30 days, mark with Stale label.

shibd commented 2 years ago

I will try it.

github-actions[bot] commented 2 years ago

The issue had no activity for 30 days, mark with Stale label.

kaori-seasons commented 1 year ago

This looks interesting. I'll be looking into it for a while.