vmware-tanzu-labs / educates-training-platform

A platform for hosting interactive workshop environments in Kubernetes, or on top of a local container runtime.
https://docs.educates.dev
Apache License 2.0

Add CLI command to collect diagnostics information #298

Closed jorgemoralespou closed 4 months ago

jorgemoralespou commented 8 months ago

Is your feature request related to a problem? Please describe.

Sometimes there's a problem and something gets stuck, and the training portal or operators don't install, update, or delete properly.

Describe the solution you'd like

It would be ideal to collect information from the related operators and Educates components in the cluster so that the problem can be analyzed and reported clearly. It would be good if the CLI could collect this information into a single file (zip or tar.gz) that can be sent over for diagnosis.
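As a rough illustration of the packaging step, the collected files could be bundled into a single tar.gz along these lines. This is only a sketch: `bundle_diagnostics`, the working directory, and the archive name are hypothetical and not part of the Educates CLI.

```shell
#!/bin/bash
# Sketch only: bundle_diagnostics and the file names below are hypothetical.
set -eu

bundle_diagnostics() {
    source_dir=$1
    output_file=$2

    # -C makes the paths inside the archive relative to the source directory
    tar -czf "$output_file" -C "$source_dir" .
}

# Example: collect files into a temporary directory, then archive them.
work_dir=$(mktemp -d)
out_dir=$(mktemp -d)
archive="$out_dir/educates-diagnostics.tar.gz"

echo "sample log line" > "$work_dir/session-manager.log"
bundle_diagnostics "$work_dir" "$archive"
rm -rf "$work_dir"
```

Writing everything into one temporary directory first keeps the archive step independent of which commands produced the files.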

Describe alternatives you've considered

No response

Additional information

No response

jorgemoralespou commented 8 months ago

@billkable Can you add here the list of commands we were talking about, so that we can use that brainstorm for implementing this?

billkable commented 8 months ago


#!/bin/bash

# TODO: Handle API token and ingress domain properly
EDUCATES_API_TOKEN=$1
EDUCATES_LAB_DOMAIN=$2
KUBECONFIG=$3

function get_workshop_list() {
    local training_portal_id=$1
    local lab_domain=$2

    # query the training portal for running and stopping workshop sessions
    curl --location -o list-workshops.tmp.json "https://${training_portal_id}.${lab_domain}/workshops/catalog/environments/?sessions=true&state=RUNNING&state=STOPPING" \
        --header "Authorization: Bearer $EDUCATES_API_TOKEN"

    # pretty-print the response with jq
    jq . list-workshops.tmp.json > "list-workshops-${training_portal_id}.json"
    rm list-workshops.tmp.json
}

# fetch educates namespaces
kubectl --kubeconfig "$KUBECONFIG" get namespaces -o yaml > namespaces.yaml

# fetch educates training portal and workshop resources (commented out: redundant with the aggregate educates resources fetched below)
# kubectl get workshops -o yaml > workshops.yaml
# kubectl get workshopenvironments -o yaml > workshopenvironments.yaml
# kubectl get workshopallocations -o yaml > workshopallocations.yaml
# kubectl get workshopsessions -o yaml > workshopsessions.yaml
# kubectl get trainingportals -o yaml > trainingportals.yaml

# fetch aggregate educates resources, including training portal, workshop, and related resources
kubectl --kubeconfig "$KUBECONFIG" get educates -o yaml -A > educates-resources.yaml

# fetch educates secrets
kubectl --kubeconfig "$KUBECONFIG" get educates-secrets -o yaml -A > educates-secrets.yaml

# dump logs for all training-portal deployments, along with the list of workshops
for ns in $(kubectl --kubeconfig "$KUBECONFIG" get namespaces -l training.educates.dev/component=portal -o name | sed "s/namespace\///"); do
    kubectl --kubeconfig "$KUBECONFIG" logs deployment/training-portal -n "$ns" --timestamps=true > "trainingportal-$ns.log"
    get_workshop_list "$ns" "$EDUCATES_LAB_DOMAIN"
done

# fetch logs for the manager deployments
kubectl --kubeconfig "$KUBECONFIG" logs deployment/session-manager -n educates --timestamps=true > session-manager.log
kubectl --kubeconfig "$KUBECONFIG" logs deployment/secrets-manager -n educates --timestamps=true > secrets-manager.log
kubectl --kubeconfig "$KUBECONFIG" logs deployment/tunnel-manager -n educates --timestamps=true > tunnel-manager.log

# fetch events
kubectl --kubeconfig "$KUBECONFIG" events -A > events.log # consider filtering namespaces
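
The note about filtering namespaces could be handled roughly as follows. This is a sketch, not part of the script above: `collect_portal_events` and the `KUBECTL` override are hypothetical names, and it assumes the same `training.educates.dev/component=portal` namespace label used in the log loop.

```shell
# Sketch only: collect_portal_events and KUBECTL are hypothetical names.
# KUBECTL lets the kubectl binary be swapped out, e.g. for a wrapper script.
KUBECTL=${KUBECTL:-kubectl}

collect_portal_events() {
    kubeconfig=$1

    # write one events file per training portal namespace instead of one cluster-wide file
    for ns in $("$KUBECTL" --kubeconfig "$kubeconfig" get namespaces \
            -l training.educates.dev/component=portal -o name | sed 's/^namespace\///'); do
        "$KUBECTL" --kubeconfig "$kubeconfig" get events -n "$ns" \
            --sort-by=.lastTimestamp > "events-$ns.log"
    done
}
```

It would be called as `collect_portal_events "$KUBECONFIG"`. Events from the `educates` namespace would still need to be fetched separately.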

jorgemoralespou commented 4 months ago

This is solved in the new installer. However, some of the information we recently decided should be collected might not be gathered yet, so let's revisit this on the new installer and make sure everything that needs to be collected is there. The experience itself is already there.