Open jason-da-redpanda opened 9 months ago
The challenge with decoding controller logs in a customer environment is that the controller format has been evolving and our offline viewer script hasn't kept up. That means we may end up collecting incompletely decoded logs from the customer and then have to ask them to send the raw logs so we can decode them on our end, which is more painful. For example, the script doesn't support controller snapshots yet; that support is only available in Michal's personal branch, https://github.com/mmaslankaprv/redpanda/tree/offline-viewer-ctrl-snapshot.
If we want to decode the logs while collecting a bundle, the viewer script has to match the controller format of that Redpanda version exactly, which is going to be additional effort.
@daisukebe @r-vasquez perhaps the solution here is to package some of these tools like the offline log viewer with the redpanda release, or as a companion package so that they are all versioned together?
Who is this for and what problem do they have today?
1) Controller logs are in serde format and need to be decoded with https://github.com/redpanda-data/redpanda/tree/dev/tools/offline_log_viewer before they can be viewed as text.
When working a support ticket, this means downloading the bundle files and then manually running the Python script on the controller logs, or asking the customer to run those steps manually.
rpk debug bundle should be able to do this via a flag, at the source.
2) Customers need to know what they are uploading to Redpanda support. For example, their security teams might be concerned and want to check whether anything sensitive is in these files and edit them accordingly before uploading to Redpanda.
Once the controller logs are decoded (and we should automatically redact any sensitive info from them), customers can also double-check that the files contain nothing they do not want to upload to Redpanda.
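To make the redaction idea above a bit more concrete, here is a rough Python sketch of how decoded records could be scrubbed before they go into the bundle. It assumes the decoded output is JSON-like (nested dicts and lists). The key pattern, the `redact` helper, and the sample record are all hypothetical; the real decoded controller log schema and the set of fields that count as sensitive would need to be confirmed against the offline viewer's actual output.

```python
import json
import re

# Hypothetical pattern: key names that we assume indicate sensitive values.
# The real decoded controller log schema may use different names.
SENSITIVE_KEYS = re.compile(r"(password|secret|credential|token)", re.IGNORECASE)
REDACTED = "[REDACTED]"


def redact(value):
    """Return a copy of a JSON-like value with sensitive-looking keys masked."""
    if isinstance(value, dict):
        return {
            key: REDACTED if SENSITIVE_KEYS.search(str(key)) else redact(item)
            for key, item in value.items()
        }
    if isinstance(value, list):
        return [redact(item) for item in value]
    # Scalars (str, int, None, ...) are returned unchanged.
    return value


if __name__ == "__main__":
    # Made-up record for illustration only, not real viewer output.
    record = {
        "type": "example_user_command",
        "data": {"user": "alice", "password": "hunter2"},
    }
    print(json.dumps(redact(record), indent=2))
```

Matching on key names is only a starting point: it would miss sensitive values stored under neutral keys, so customers would still want to review the decoded output themselves before uploading.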
What are the success criteria?
rpk debug bundle has an option to decode controller files (with any sensitive data redacted).
Why is solving this problem impactful?
Quicker turnaround in support cases. Today the customer uploads controller logs via the bundle, then we extract them and run the offline viewer; this can all be done at the source.
Additionally, some customers want to see what they are uploading to Redpanda, and controller logs are not in plain text.
Additional notes
JIRA Link: CORE-1770