groundcover-com / murre

Murre is an on-demand, scalable source of container resource metrics for K8s.
https://www.groundcover.com/blog/murre
Apache License 2.0

empty screen #5

Closed dyipon closed 1 year ago

dyipon commented 1 year ago

hello,

I just installed and started murre, but I only get an empty screen. What's the best way to debug this issue?

larssb commented 1 year ago

Provide your environment details and we can go from there.

I also had an empty screen initially. Then I waited a bit, and the output appeared.

Suggestion: try resizing your shell/terminal window and see whether that triggers a refresh and shows the content that murre outputs.

dyipon commented 1 year ago

I'm using the latest murre, go version go1.18.3 linux/amd64, and Debian with bash.

LarsBingBong commented 1 year ago

Did you try my suggestion of resizing the terminal window?

dyipon commented 1 year ago

tried, did not help :/

ulrichSchreiner commented 1 year ago

Hi, I have the same bug. It looks like murre uses ~/.kube/config as the default even though I have the $KUBECONFIG environment variable set. When I start murre with murre --kubeconfig $KUBECONFIG, it works.

larssb commented 1 year ago

Uuuh that's interesting @ulrichSchreiner - That's the same scenario I'm in ( using --kubeconfig ) - So that's likely why I haven't bumped into it.

maxlevinps commented 1 year ago

Hey @ulrichSchreiner - We've created a fix for the env support, which will be added in our next release.

@dyipon - Can you please check whether @ulrichSchreiner's solution works for you?

dyipon commented 1 year ago

@maxlevinps I just updated to 0.0.3, and tried again:

$ murre --kubeconfig ~/.kube/config  

but still nothing, I just get an empty screen.

ulrichSchreiner commented 1 year ago

Do you have a correct (working) ~/.kube/config? You do not need to specify this config explicitly, as it is the default.

I use kubie to manage different cluster configs. kubie creates temporary config files and sets KUBECONFIG to point at them. The maintainers fixed the code, so with the latest version of murre you can now rely on the environment variable without passing the --kubeconfig parameter.

In my case I do have a ~/.kube/config file, but it points to a minikube that is not running, so neither kubectl nor murre can connect and both hang silently. Can you run a simple kubectl get nodes to check that you have a running cluster and that your default config works?

dyipon commented 1 year ago

Don't know why, but it has started working.

giuliohome commented 1 year ago

Same issue here: maybe this should be better documented in the readme. In my case I have a local minikube and I don't understand what I am supposed to do: kubectl shows my pods, murre shows nothing.

giuliohome commented 1 year ago

Could we reopen the issue and double-check what happens in my case too?

[giulio@fedora ~]$ cat /etc/fedora-release
Fedora release 37 (Thirty Seven)
[giulio@fedora ~]$ cat /proc/version
Linux version 6.0.11-300.fc37.x86_64 (mockbuild@bkernel01.iad2.fedoraproject.org) (gcc (GCC) 12.2.1 20221121 (Red Hat 12.2.1-4), GNU ld version 2.38-25.fc37) #1 SMP PREEMPT_DYNAMIC Fri Dec 2 20:47:45 UTC 2022

and

[giulio@fedora ~]$ minikube start
😄  minikube v1.28.0 on Fedora 37
✨  Using the docker driver based on existing profile
👍  Starting control plane node minikube in cluster minikube
🚜  Pulling base image ...
🔄  Restarting existing docker container for "minikube" ...
🐳  Preparing Kubernetes v1.25.3 on Docker 20.10.20 ...
🔎  Verifying Kubernetes components...
    ▪ Using image gcr.io/k8s-minikube/storage-provisioner:v5
    ▪ Using image docker.io/kubernetesui/dashboard:v2.7.0
    ▪ Using image docker.io/kubernetesui/metrics-scraper:v1.0.8
💡  Some dashboard features require the metrics-server addon. To enable all features please run:

    minikube addons enable metrics-server   

🌟  Enabled addons: default-storageclass, storage-provisioner, dashboard
🏄  Done! kubectl is now configured to use "minikube" cluster and "default" namespace by default
[giulio@fedora ~]$ 

and my pods

[giulio@fedora ~]$ kubectl get pod
NAME                              READY   STATUS    RESTARTS        AGE
gke-golang-web-68d985fb4b-g62j5   1/1     Running   1 (4m25s ago)   112m
gke-golang-web-68d985fb4b-mwrvl   1/1     Running   1 (4m24s ago)   112m
postgres-statefulset-0            1/1     Running   1 (4m24s ago)   3h51m

Finally

[giulio@fedora ~]$ murre --sortby-cpu-utilization

giuliohome commented 1 year ago

Tried with Lens, which is very smart and suggested that I install the Prometheus operator and the metrics server.

[giulio@fedora ~]$ helm repo add prometheus-community https://prometheus-community.github.io/helm-charts
helm repo update
"prometheus-community" has been added to your repositories
Hang tight while we grab the latest from your chart repositories...
...Successfully got an update from the "prometheus-community" chart repository
...Successfully got an update from the "bitnami" chart repository
Update Complete. ⎈Happy Helming!⎈
[giulio@fedora ~]$ helm install myprometheus prometheus-community/kube-prometheus-stack
NAME: myprometheus
LAST DEPLOYED: Tue Dec  6 20:48:20 2022
NAMESPACE: default
STATUS: deployed
REVISION: 1
NOTES:
kube-prometheus-stack has been installed. Check its status by running:
  kubectl --namespace default get pods -l "release=myprometheus"

Visit https://github.com/prometheus-operator/kube-prometheus for instructions on how to create & configure Alertmanager and Prometheus instances using the Operator.
[giulio@fedora ~]$ helm repo add metrics-server https://kubernetes-sigs.github.io/metrics-server/
"metrics-server" has been added to your repositories
[giulio@fedora ~]$ helm upgrade --install metrics-server metrics-server/metrics-server
Release "metrics-server" does not exist. Installing it now.
NAME: metrics-server
LAST DEPLOYED: Tue Dec  6 20:53:03 2022
NAMESPACE: default
STATUS: deployed
REVISION: 1
TEST SUITE: None
NOTES:
***********************************************************************
* Metrics Server                                                      *
***********************************************************************
  Chart version: 3.8.2
  App version:   0.6.1
  Image tag:     k8s.gcr.io/metrics-server/metrics-server:v0.6.1
***********************************************************************
[giulio@fedora ~]$ 

giuliohome commented 1 year ago

I've also enabled the metrics-server addon in minikube

[giulio@fedora ~]$ minikube addons enable metrics-server
💡  metrics-server is an addon maintained by Kubernetes. For any concerns contact minikube on GitHub.
You can view the list of minikube maintainers at: https://github.com/kubernetes/minikube/blob/master/OWNERS
    ▪ Using image k8s.gcr.io/metrics-server/metrics-server:v0.6.1
🌟  The 'metrics-server' addon is enabled

but murre's table is still empty. Is there a way to debug it, or a verbose flag? Could this issue at least be reopened to track the problem? I don't think the issue is with my minikube cluster and, as written above, I can even see my CPU metrics/graphs in Lens.

giuliohome commented 1 year ago

I've cloned the repo and built it after adding some fmt.Println calls to understand the flow. The following line never shows up in the output, so the problem is that getCell isn't called:

func (t *Table) getCell(stats *k8s.Stats, column int) *tview.TableCell {
        fmt.Println("stats",stats.Namespace, stats.PodName, stats.ContainerName)

even though I see the tick and the output printed before updateMetrics:

        fmt.Println("update Metrics!")
        err = m.updateMetrics()

I guess something goes wrong in between.

Thank you for your help!

Edit

getting closer, it looks like the cpu array is empty

output

metrics:  [0xc0000983c0]
node CPU:  [] 2022-12-07 11:41:55.492875508 +0100 CET m=+5.176791128

from

func (m *Murre) updateMetrics() error {
        fmt.Println("here inside the func update Metrics")
        metrics, err := m.fetcher.GetMetrics()
        if err != nil {
                return err
        }
        fmt.Println("metrics: ", metrics)
        for _, node := range metrics {
                fmt.Println("node CPU: ", node.Cpu, node.Timestamp)
                m.updateCpu(node.Cpu, node.Timestamp)
                m.updateMemory(node.Memory, node.Timestamp)
        }
        return nil
}  

But the Fetcher is getting the correct minikube node at least! So it's not a kubeconfig connection issue...

output

Fether is going to get the nodes!!!
nodes [minikube] found

from

func (f *Fetcher) GetMetrics() ([]*NodeMetrics, error) {
        fmt.Println("Fether is going to get the nodes!!!")
        nodes, err := f.getNodes()
        fmt.Println("nodes", nodes, "found")
        if err != nil {
                return nil, err
        }

Furthermore, it prints

fetchMetricsFromNode /api/v1/nodes/%s/proxy/metrics/cadvisor minikube
clientset.RESTClient /api/v1/nodes/minikube/proxy/metrics/cadvisor b [35 32 72 69 76 80 32 99 97 100 118 105 115 111 114 95 118 101 114 115 105 111 110 95 105 110 102 111 32 65 32 109 101 116 114 105 99 32 119 105 116 104 32 97 32 99 111 110 115 116 97 110 116 32 39 49 39 32 118 97 108 117 101 32 108 97 98 101 108 101 100 32 98 121 32 107 101 114 110 101 108 32 118 101 114 11

from

func (f *Fetcher) fetchMetricsFromNode(node string) (*NodeMetrics, error) {
        fetchTime := time.Now()
        path := fmt.Sprintf(CADVISOR_PATH_TEMPLATE, node)
        fmt.Println("fetchMetricsFromNode",CADVISOR_PATH_TEMPLATE, node)
        b, err := f.clientset.RESTClient().Get().AbsPath(path).Do(context.Background()).Raw()
        fmt.Println("clientset.RESTClient",path,"b",b,"to be parsed")

Now I have also double checked that

[giulio@fedora murre]$ kubectl top node
NAME       CPU(cores)   CPU%   MEMORY(bytes)   MEMORY%   
minikube   1213m        30%    1697Mi          21%       
[giulio@fedora murre]$ kubectl top pod
NAME                                                     CPU(cores)   MEMORY(bytes)   
alertmanager-myprometheus-kube-promethe-alertmanager-0   2m           38Mi            
gke-golang-web-68d985fb4b-g62j5                          1m           5Mi             
gke-golang-web-68d985fb4b-mwrvl                          1m           3Mi             
myprometheus-grafana-7dd5858545-9rl85                    17m          221Mi           
myprometheus-kube-promethe-operator-74b5bd4b97-9csg2     1m           41Mi            
myprometheus-kube-state-metrics-68795b6566-2vt6k         2m           35Mi            
myprometheus-prometheus-node-exporter-d598k              6m           19Mi            
postgres-statefulset-0                                   1m           39Mi            
prometheus-myprometheus-kube-promethe-prometheus-0       74m          333Mi  

giuliohome commented 1 year ago

This is the issue! Would you be so kind as to reopen it and treat it as a TODO item?

                cpuMetric.CpuUsageSecondsTotal = metric.GetCounter().GetValue()
                if cpuMetric.Name == "" || cpuMetric.PodName == "" || cpuMetric.Namespace == "" {
                        //todo - dont know why this happens
                        fmt.Println("***TODO*** - dont know why this happens")
                        continue
                }
                cpuMetrics = append(cpuMetrics, cpuMetric)

That happens because of multi-container pods, so we need to traverse the tree structure of the k8s metrics API, from the pod down to the container.