redpanda-data / redpanda

Redpanda is a streaming data platform for developers. Kafka API compatible. 10x faster. No ZooKeeper. No JVM!
https://redpanda.com

rpk: debug bundle allow admins/customers to decode controller logs #16575

Open jason-da-redpanda opened 9 months ago

jason-da-redpanda commented 9 months ago

Who is this for and what problem do they have today?

1) Controller logs are stored in serde format and must be decoded with the offline log viewer (https://github.com/redpanda-data/redpanda/tree/dev/tools/offline_log_viewer) before they can be read as text.

When working a support ticket, this means downloading the bundle files and then manually running the Python script on the controller logs,

or asking the customer to run the steps manually.

rpk debug bundle should be able to do this at the source via a flag.

2) Customers need to know what they are uploading to Redpanda support. For example, their security teams might be concerned about sensitive content in these files and want to review and edit them before uploading.

Once the controller logs are decoded (with sensitive information automatically redacted), customers can also double-check that the files contain nothing they do not want to upload to Redpanda.
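
To make the redaction idea concrete, here is a minimal sketch of masking sensitive fields in already-decoded records. It assumes the decoded output is JSON-like (nested dicts and lists); the key fragments in `SENSITIVE_KEY_PARTS` and the sample record are hypothetical and do not reflect the actual controller record schema or any existing rpk behavior.

```python
import json

# Hypothetical key fragments; a real list would come from the decoded record schema.
SENSITIVE_KEY_PARTS = ("password", "secret", "credential", "token")
REDACTED = "[REDACTED]"


def is_sensitive(key):
    """Return True if a field name looks like it holds sensitive data."""
    lowered = str(key).lower()
    return any(part in lowered for part in SENSITIVE_KEY_PARTS)


def redact(value):
    """Return a copy of a decoded JSON-like value with sensitive fields masked."""
    if isinstance(value, dict):
        return {
            key: REDACTED if is_sensitive(key) else redact(item)
            for key, item in value.items()
        }
    if isinstance(value, list):
        return [redact(item) for item in value]
    return value


if __name__ == "__main__":
    # Made-up record for illustration only.
    record = {
        "type": "create_user",
        "data": {"user": "alice", "password": "hunter2"},
    }
    print(json.dumps(redact(record), indent=2))
```

Matching on key names is only a first pass. Sensitive values can also appear inside free-form strings, so customers would likely still want to review the decoded output before uploading.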

What are the success criteria?

rpk debug bundle has an option to decode controller log files, with any sensitive data redacted.

Why is solving this problem impactful?

Quicker turnaround in support cases. Today, for example, the customer uploads controller logs via the bundle, and we then extract them and run the offline viewer. This could be done at the source instead.

Additionally, some customers want to see what they are uploading to Redpanda, and controller logs are not in plain text.

Additional notes

JIRA Link: CORE-1770

daisukebe commented 5 months ago

The challenge with decoding controller logs in a customer environment is that the controller format has been evolving and our offline viewer script has not kept up. That means we may end up collecting incompletely decoded logs from the customer and then have to ask them to send the raw logs so we can decode them on our end, which is more painful. For example, the script doesn't support controller snapshots yet; that support is only available in Michal's personal branch, https://github.com/mmaslankaprv/redpanda/tree/offline-viewer-ctrl-snapshot.

If we want to decode it during collecting a bundle, the viewer script has to align with the controller format of the version completely, which is going to be another effort.
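
One way to soften the incomplete-decode problem is sketched below, purely as an illustration: decode what the script understands, record what it does not, and flag the bundle to keep the raw logs whenever anything fails. `decode_with_fallback`, `toy_decoder`, and the `v1:` record prefix are all hypothetical stand-ins and are not part of the offline log viewer or rpk.

```python
def decode_with_fallback(raw_records, decode_record):
    """Decode each record; collect failures instead of aborting the whole run."""
    decoded = []
    failures = []
    for index, raw in enumerate(raw_records):
        try:
            decoded.append(decode_record(raw))
        except Exception as error:  # e.g. a record format the script predates
            failures.append({"index": index, "error": str(error)})
    return {
        "decoded": decoded,
        "failures": failures,
        # If anything failed, the bundle should also carry the raw logs.
        "include_raw": bool(failures),
    }


def toy_decoder(raw):
    """Stand-in decoder: accepts only records with a made-up 'v1:' prefix."""
    kind, _, payload = raw.partition(b":")
    if kind != b"v1":
        raise ValueError(f"unsupported record format: {kind.decode()}")
    return {"payload": payload.decode()}


if __name__ == "__main__":
    result = decode_with_fallback([b"v1:add_topic", b"v2:snapshot"], toy_decoder)
    print(result)
```

This does not remove the need to keep the viewer aligned with the controller format, but it would avoid a second round trip to the customer when the decode turns out to be partial.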

dotnwat commented 4 months ago

> If we want to decode it during collecting a bundle, the viewer script has to align with the controller format of the version completely, which is going to be another effort.

@daisukebe @r-vasquez perhaps the solution here is to package some of these tools like the offline log viewer with the redpanda release, or as a companion package so that they are all versioned together?