rancher / dashboard

The Rancher UI
https://rancher.com
Apache License 2.0
463 stars 261 forks source link

[2.6.x backport] Reports of browsers running out of memory when tailing log files in UI #8135

Closed gaktive closed 1 year ago

gaktive commented 1 year ago

2.7.x related work: https://github.com/rancher/dashboard/issues/7156 2.7.x PR: https://github.com/rancher/dashboard/pull/7511

Context

Internal reference: SURE-5383 Reported in Rancher 2.6.8.

When troubleshooting an issue involving websockets, I did get a report that was adjacent to it. When using the Vue UI to tail log files, the browser stops responding.

browser memory is consumed until the tab is killed. If you leave one open long enough, it will eventually die. If you pull up pod logs for a busy pod though you can kill it in a few minutes (rancher kubectl UI). We have tried, edge, chrome and brave and they all exhibit the symptom.

Browser tab started at 650mb before opening logs, high cpu usage while streaming them, and memory growing rapidly. Browser tab crashed at about 2.5gb memory footprint.

Workaround: Restart browser tab every few minutes, or sometimes before one minute.

brudnak commented 1 year ago

✅ PASSED

Reproduction Environment

Component Version / Type
Rancher version v2.6.10
Installation option Docker
Cert Details --acme-domain
Docker version 20.10.7, build f0df350
Helm version v2.16.8-rancher2
Downstream cluster type not applicable
Downstream K8s version not applicable
Authentication providers enabled OpenLDAP
Logged in user role administrator, standard user
Browser type google chrome
Browser version 110.0.5481.77 (x86_64)

Reproduction steps

  1. Setup Rancher
  2. Starting from the default Rancher homepage /dashboard/home
  3. Click hamburger menu >>> local >>> Kubectl Shell
  4. Copy the following deployment yaml:
apiVersion: apps/v1
kind: Deployment
metadata:
  labels:
    workload.user.cattle.io/workloadselector: apps.deployment-default-test-logs
  name: test-logs
  namespace: default
spec:
  progressDeadlineSeconds: 600
  replicas: 1
  revisionHistoryLimit: 10
  selector:
    matchLabels:
      workload.user.cattle.io/workloadselector: apps.deployment-default-test-logs
  strategy:
    rollingUpdate:
      maxSurge: 25%
      maxUnavailable: 25%
    type: RollingUpdate
  template:
    metadata:
      labels:
        workload.user.cattle.io/workloadselector: apps.deployment-default-test-logs
    spec:
      affinity: {}
      containers:
      - args:
        - 'i=0; while true; do echo "$i: $(date)"; i=$((i+1)); sleep .001; done'
        command:
        - /bin/sh
        - -c
        image: busybox
        imagePullPolicy: Always
        name: fast
        resources: {}
        terminationMessagePath: /dev/termination-log
        terminationMessagePolicy: File
      - args:
        - 'i=0; while true; do echo "$i: $(date)"; i=$((i+1)); sleep 1; done'
        command:
        - /bin/sh
        - -c
        image: busybox
        imagePullPolicy: Always
        name: slower
        resources: {}
        terminationMessagePath: /dev/termination-log
        terminationMessagePolicy: File
      - args:
        - 'i=0; while true; do echo "$i: $(date)"; i=$((i+1)); sleep 10; done'
        command:
        - /bin/sh
        - -c
        image: busybox
        imagePullPolicy: Always
        name: sloooow
        resources: {}
        terminationMessagePath: /dev/termination-log
        terminationMessagePolicy: File
      dnsPolicy: ClusterFirst
      restartPolicy: Always
      schedulerName: default-scheduler
      securityContext: {}
      terminationGracePeriodSeconds: 30
  1. Paste this into a file in the Kubectl Shell and run it:
vim deploy.yml

# paste above yaml into file and exit vim

kubectl apply -f deploy.yml
  1. Once deployed navigate to local >>> Workload >>> Deployments >>> test-logs
  2. For the pod running in the test-logs deployment, click the ellipsis (three dots) >>> click View Logs
  3. Once you see the logs populating
    • right click chrome/screen
    • click inspect >>> ellipsis (three dots) in chrome >>> More tools >>> Performance monitor
  4. Let this run for ~3 minutes

Additional Info

RESULTS

✅ Expected

For the Rancher UI to continue running without any issues

❌ Actual

The UI became unusable after ~3 minutes.

Metric value
JS heap size 1531 MB
DOM Nodes 620,915

Validation Environment

Component Version / Type
Rancher version v2.6-f4cadcbdb940aede35f038c3168805762a653d4c-head
Installation option Docker
Cert Details --acme-domain
Docker version 20.10.7, build f0df350
Helm version v2.16.8-rancher2
Downstream cluster type not applicable
Downstream K8s version not applicable
Authentication providers enabled OpenLDAP
Logged in user role administrator, standard user
Browser type google chrome
Browser version 110.0.5481.77 (x86_64)

Validation steps

  1. Setup Rancher
  2. Starting from the default Rancher homepage /dashboard/home
  3. Click hamburger menu >>> local >>> Kubectl Shell
  4. Copy the following deployment yaml:
apiVersion: apps/v1
kind: Deployment
metadata:
  labels:
    workload.user.cattle.io/workloadselector: apps.deployment-default-test-logs
  name: test-logs
  namespace: default
spec:
  progressDeadlineSeconds: 600
  replicas: 1
  revisionHistoryLimit: 10
  selector:
    matchLabels:
      workload.user.cattle.io/workloadselector: apps.deployment-default-test-logs
  strategy:
    rollingUpdate:
      maxSurge: 25%
      maxUnavailable: 25%
    type: RollingUpdate
  template:
    metadata:
      labels:
        workload.user.cattle.io/workloadselector: apps.deployment-default-test-logs
    spec:
      affinity: {}
      containers:
      - args:
        - 'i=0; while true; do echo "$i: $(date)"; i=$((i+1)); sleep .001; done'
        command:
        - /bin/sh
        - -c
        image: busybox
        imagePullPolicy: Always
        name: fast
        resources: {}
        terminationMessagePath: /dev/termination-log
        terminationMessagePolicy: File
      - args:
        - 'i=0; while true; do echo "$i: $(date)"; i=$((i+1)); sleep 1; done'
        command:
        - /bin/sh
        - -c
        image: busybox
        imagePullPolicy: Always
        name: slower
        resources: {}
        terminationMessagePath: /dev/termination-log
        terminationMessagePolicy: File
      - args:
        - 'i=0; while true; do echo "$i: $(date)"; i=$((i+1)); sleep 10; done'
        command:
        - /bin/sh
        - -c
        image: busybox
        imagePullPolicy: Always
        name: sloooow
        resources: {}
        terminationMessagePath: /dev/termination-log
        terminationMessagePolicy: File
      dnsPolicy: ClusterFirst
      restartPolicy: Always
      schedulerName: default-scheduler
      securityContext: {}
      terminationGracePeriodSeconds: 30
  1. Paste this into a file in the Kubectl Shell and run it:
vim deploy.yml

# paste above yaml into file and exit vim

kubectl apply -f deploy.yml
  1. Once deployed navigate to local >>> Workload >>> Deployments >>> test-logs
  2. For the pod running in the test-logs deployment, click the ellipsis (three dots) >>> click View Logs
  3. Once you see the logs populating
    • right click chrome/screen
    • click inspect >>> ellipsis (three dots) in chrome >>> More tools >>> Performance monitor
  4. Let this run for ~3 minutes

Additional Info

RESULTS

✅ Expected

For the Rancher UI to continue running without any issues

✅ Actual

No issues with Rancher after ~3 mins and drastically lower metrics

Metric value
JS heap size 167 MB
DOM Nodes 7.634