grafana / grafana

The open and composable observability and data visualization platform. Visualize metrics, logs, and traces from multiple sources like Prometheus, Loki, Elasticsearch, InfluxDB, Postgres and many more.
https://grafana.com
GNU Affero General Public License v3.0

Chrome tab OOM on kiosk dashboard after 8+ hours (only after v8.4.3+) #50820

Closed bob454522 closed 8 months ago

bob454522 commented 2 years ago

What happened: A dashboard we keep open on a separate machine, 24/7, has started crashing its Chrome tab with OOM, but only since upgrading grafana-server from v8.4.3 to 8.5.5 (it also occurs with v9.0 beta3).

(In other words, the dashboard mentioned here had no crash issues with v8.4.3 and prior, but after I upgraded grafana-server to 8.5.5 this Chrome crash started occurring. I then tried updating to v9.0 beta3 and the issue still occurs.) On v8.4.3 and prior, we kept this dashboard open for weeks (maybe months) without any issues or crashes. No changes have been made to the dashboard itself.

What you expected to happen: The live-stream dashboard to display without crashing Chrome, as it has for over a year.

Anything else we need to know?: The dedicated PC we use as our "dashboard display PC" is running Win10 and Chrome v87.0.4280.67. We use Nativefier (https://github.com/nativefier/nativefier) to make an "app" out of this dashboard's URL. Nativefier packages a static Chrome and a single URL to load on launch, so the Chrome version stays consistent (i.e. Chrome does not get auto-updated or changed). Nativefier is only relevant here in that the issue occurs with an OLDER version of Chrome (v87.0.4280.67). The issue also occurs when I don't use Nativefier and use Chrome directly (v102.0.5005.115), and to be clear, this only started with Grafana v8.5.5 or later (i.e. it did not occur with v8.4.3).

(I would have assumed this is a Chrome issue, since we run this older Chrome version.) HOWEVER, I have also tested the same dashboard URL on the latest version of Chrome and on Chrome Canary, on both this "dedicated display PC" and my own personal, unrelated Win11 PC, and the issue occurs exactly the same on both (but only after several hours of the dashboard being open/displayed, of course).

After 6-10+ hours of the dashboard being open, Chrome's page will either go entirely white OR give the "tab crashed, tab out of memory" message. (In either case you are able to hit reload, and the dashboard comes back up and repeats this 6-10 hours later.)

NB: this dashboard consists of (exclusively) 12x "Grafana - Live Measurements" live streaming charts of panel type "Time Series" (and 3x single value stat panels as well).

No one interacts with the dashboard in any way; it's a display / read-only type.

we use this url: http://grafIPaddress:3000/d/BmE0nSa7z/blah-real-time-monitor-map?orgId=1&theme=dark&kiosk&refresh=2m
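For readers unfamiliar with the query string, here is a minimal sketch that pulls the relevant parameters out of the URL above and works out how many refresh cycles pass before a crash at the 8-hour mark. The `refresh_seconds` helper is hypothetical and only handles simple `<number><unit>` intervals; it is not Grafana's own interval parser.

```python
from urllib.parse import urlsplit, parse_qs

# URL as given in the report (the host name is the reporter's placeholder).
URL = ("http://grafIPaddress:3000/d/BmE0nSa7z/blah-real-time-monitor-map"
       "?orgId=1&theme=dark&kiosk&refresh=2m")

UNIT_SECONDS = {"s": 1, "m": 60, "h": 3600, "d": 86400}


def refresh_seconds(value: str) -> int:
    """Convert a simple interval such as '2m' or '30s' to seconds (hypothetical helper)."""
    return int(value[:-1]) * UNIT_SECONDS[value[-1]]


def describe(url: str) -> dict:
    # keep_blank_values=True keeps the value-less 'kiosk' flag in the result.
    params = parse_qs(urlsplit(url).query, keep_blank_values=True)
    return {
        "kiosk": "kiosk" in params,
        "theme": params.get("theme", [None])[0],
        "refresh_s": refresh_seconds(params["refresh"][0]),
    }


info = describe(URL)
refreshes_in_8h = 8 * 3600 // info["refresh_s"]
print(info, refreshes_in_8h)
```

With `refresh=2m`, roughly 240 refresh cycles elapse in 8 hours, so even a small amount of memory retained per cycle has many chances to accumulate.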

Environment:

I'm not sure if this will be helpful: Snap1 was taken about 15 sec after this dashboard loaded, Snap2 was taken about 45 sec after Snap1, Snap3 was taken about 3 min after Snap2, and Snap4 was taken about 25 min after Snap3:

between snap1 and snap2: image

between snap3 and snap4: image

SNAP 4: image
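The snapshot sizes themselves are only visible in the images, so the following is a sketch of how timed snapshots like these could be turned into a growth rate, not a result from this report. The timestamps follow the intervals described above (reading the last one as measured from Snap3); the heap sizes and the 4000 MB limit are made-up placeholders.

```python
def growth_rate(samples):
    """Least-squares slope of (minutes, heap_mb) samples, in MB per minute."""
    n = len(samples)
    mean_t = sum(t for t, _ in samples) / n
    mean_h = sum(h for _, h in samples) / n
    num = sum((t - mean_t) * (h - mean_h) for t, h in samples)
    den = sum((t - mean_t) ** 2 for t, _ in samples)
    return num / den


def minutes_until(limit_mb, last_sample, rate):
    """Minutes until the heap reaches limit_mb at a constant rate; None if not growing."""
    if rate <= 0:
        return None
    _, heap = last_sample
    return (limit_mb - heap) / rate


# (minutes since load, heap MB). Times follow the report; heap values are HYPOTHETICAL.
snapshots = [(0.25, 100.5), (1.0, 102.0), (4.0, 108.0), (29.0, 158.0)]

rate = growth_rate(snapshots)
eta = minutes_until(4000.0, snapshots[-1], rate)  # 4000 MB is an assumed limit
print(f"{rate:.2f} MB/min, about {eta / 60:.1f} h to the assumed limit")
```

A roughly constant positive slope across snapshots taken minutes apart is what one would expect from a steady leak; a slope that flattens out suggests normal warm-up allocation instead.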

thanks, (LOVE GRAFANA!!)

hoerup commented 2 years ago

We also have issues with OOM when using Grafana newer than 8.4.3

bob454522 commented 2 years ago

> We also have issues with OOM when using Grafana newer than 8.4.3

thanks for confirming this (helps to confirm my own testing).

I can also report that I have now seen the same OOM issue I described above with "normal", non-live-streaming panel dashboards (it has happened twice), although it takes much longer for the OOM to occur: with my live-stream dashboard it usually takes ~8-12 hours to OOM, whereas dashboards that do NOT have live-stream panels may take 2 or 3 days.

To be clear, whenever I refer to "live-streaming panels", I'm referring to this type of panel: https://grafana.com/blog/2021/06/28/new-in-grafana-8.0-streaming-real-time-events-and-data-to-dashboards/ (btw, what an AMAZING feature and great capability these streaming panels are! I love them).

(LOVE GRAFANA!)

hoerup commented 2 years ago

@bob454522 have you tried whether it has improved with 9.1 ?

hoerup commented 2 years ago

@supilee why was this closed? Has the leak been identified and fixed?

leeoniya commented 2 years ago

we're pretty confident that the linked PR fixes the reported issue, please test with https://grafana.com/docs/grafana/latest/release-notes/release-notes-9-1-1/ and we can reopen if the problem persists.

hoerup commented 2 years ago

Just checked with a colleague - this is not solved with 9.1.1 - it is still consuming a steadily increasing amount of memory

leeoniya commented 2 years ago

@hoerup is this happening in a streaming or non-streaming dashboard? can it be reproduced by doing manual/rapid data refreshes using random walk data from the built-in TestData datasource?

does manually triggering the GC reclaim the memory back to the original size?

image

if it's streaming, is the number of series growing continuously without ever being removed?

memory leaks are unfortunately extremely hard to diagnose and fix without a live reproduction case for us to look at. is it possible to set up a public instance that can replicate the issue for us? us asking users to open up devtools and set up memory profiling, and try different settings, etc is just not a feasible way to work through this :(
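The two diagnostic questions above (does a manual GC bring the heap back to its original size, and does the series count only ever grow) can be phrased as simple checks on numbers a reporter writes down by hand. This is a minimal sketch; the function names, the 10% tolerance, and the sample values are all assumptions for illustration.

```python
def retained_after_gc(baseline_mb, after_gc_mb, tolerance=0.10):
    """True if the heap after a forced GC is still more than `tolerance` above baseline."""
    return after_gc_mb > baseline_mb * (1 + tolerance)


def series_only_grows(counts):
    """True if the series count never decreases and ends higher than it started."""
    never_drops = all(b >= a for a, b in zip(counts, counts[1:]))
    return never_drops and counts[-1] > counts[0]


# HYPOTHETICAL readings: heap right after load vs. after a manual GC hours later.
print(retained_after_gc(baseline_mb=150.0, after_gc_mb=155.0))  # within tolerance
print(retained_after_gc(baseline_mb=150.0, after_gc_mb=600.0))  # memory still held

# HYPOTHETICAL series counts noted at intervals on a streaming panel.
print(series_only_grows([12, 12, 14, 19]))
print(series_only_grows([12, 15, 12]))
```

If the heap drops back near baseline after a forced GC, the growth was collectable garbage rather than a leak; if it stays high, something is still holding references, and an ever-growing series count would be one candidate.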

github-actions[bot] commented 9 months ago

This issue has been automatically marked as stale because it has not had activity in the last year. It will be closed in 30 days if no further activity occurs. Please feel free to leave a comment if you believe the issue is still relevant. Thank you for your contributions!

github-actions[bot] commented 8 months ago

This issue has been automatically closed because it has not had any further activity in the last 30 days. Thank you for your contributions!