opensearch-project / OpenSearch-Dashboards

📊 Open source visualization dashboards for OpenSearch.
https://opensearch.org/docs/latest/dashboards/index/
Apache License 2.0

OpenSearch Dashboards failures after upgrade 2.9 to 2.12 #5939

Open rlevytskyi opened 4 months ago

rlevytskyi commented 4 months ago

Dashboards Suddenly Dies

Hello OpenSearch Team, we’ve just updated our OpenSearch cluster from version 2.9.0 to 2.12.0. Among other issues, we’ve noticed that the OpenSearch Dashboards container sometimes stops unexpectedly. There is no error message in its log, but the system log contains several entries like these (I’ve reduced them slightly):

vm85 dockerd[1011]: msg="ignoring event" container=e490 module=libcontainerd namespace=moby topic=/tasks/delete type="*events.TaskDelete"
vm85 containerd[902]: msg="shim disconnected" id=e490
vm85 containerd[902]: msg="cleaning up after shim disconnected" id=e490 namespace=moby
vm85 containerd[902]: msg="cleaning up dead shim"
vm85 containerd[902]: msg="cleanup warnings time=\"2024-02-23T14:27:19Z\" level=info msg=\"starting signal loop\" namespace=moby pid=11722 runtime=io.containerd.runc.v2\n"
vm85 dockerd[1011]: msg="ShouldRestart failed, container will not be restarted" container=e490 daemonShuttingDown=false error="restart canceled" execDuration=10m7.524639324s exitStatus="{0 2024-02-23 14:27:18.998984252 +0000 UTC}" hasBeenManuallyStopped=true restartCount=4
vm85 containerd[902]: msg="loading plugin \"io.containerd.event.v1.publisher\"..." runtime=io.containerd.runc.v2 type=io.containerd.event.v1
vm85 containerd[902]: msg="loading plugin \"io.containerd.internal.v1.shutdown\"..." runtime=io.containerd.runc.v2 type=io.containerd.internal.v1
vm85 containerd[902]: msg="loading plugin \"io.containerd.ttrpc.v1.task\"..." runtime=io.containerd.runc.v2 type=io.containerd.ttrpc.v1
vm85 containerd[902]: msg="starting signal loop" namespace=moby path=/run/containerd/io.containerd.runtime.v2.task/moby/e490 pid=11753 runtime=io.containerd.runc.v2
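For anyone triaging similar entries, here is a minimal Python sketch that pulls the key=value fields out of the dockerd "ShouldRestart failed" entry. The helper names (parse_fields, exit_code) are hypothetical, and the sample string is a cleaned copy of the entry quoted above, with stray spaces removed; it is not new log output.

```python
import shlex


def parse_fields(line):
    """Split a key=value log line into a dict, honoring double-quoted values."""
    fields = {}
    for token in shlex.split(line):
        key, sep, value = token.partition("=")
        if sep:  # tokens without '=' (host name, process prefix) are skipped
            fields[key] = value
    return fields


def exit_code(fields):
    """Return the integer at the start of an exitStatus value such as '{0 2024-... UTC}'."""
    status = fields.get("exitStatus")
    if not status:
        return None
    return int(status.strip("{}").split()[0])


# Cleaned copy of the entry from the system log quoted in this issue.
SAMPLE = (
    'vm85 dockerd[1011]: msg="ShouldRestart failed, container will not be restarted" '
    'container=e490 daemonShuttingDown=false error="restart canceled" '
    'execDuration=10m7.524639324s '
    'exitStatus="{0 2024-02-23 14:27:18.998984252 +0000 UTC}" '
    'hasBeenManuallyStopped=true restartCount=4'
)

fields = parse_fields(SAMPLE)
print(exit_code(fields), fields["hasBeenManuallyStopped"], fields["restartCount"])
```

For the quoted entry this reports an exit code of 0, hasBeenManuallyStopped set to true, and a restart count of 4.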

I managed to fix this by uncommenting and changing the following line in the node.options configuration file: --max-old-space-size=6100
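The workaround can be sketched as a small Python helper that uncomments the option (or appends it) and sets a new value. This is only an illustration: the function name set_max_old_space is hypothetical, and the sample file content is an assumption about what a stock node.options looks like, not a copy of the actual file. The value is interpreted by Node.js in megabytes.

```python
import re

OPTION = "--max-old-space-size"
# Matches the option line whether or not it is commented out with leading '#'.
_PATTERN = re.compile(r"^\s*#*\s*--max-old-space-size=\d+\s*$")


def set_max_old_space(text, megabytes):
    """Return node.options text with the first max-old-space-size line enabled
    and set to `megabytes`; the option is appended if no such line exists."""
    lines = text.splitlines()
    new_line = f"{OPTION}={megabytes}"
    for i, line in enumerate(lines):
        if _PATTERN.match(line):
            lines[i] = new_line  # only the first matching line is rewritten
            break
    else:
        lines.append(new_line)
    return "\n".join(lines) + "\n"


# Assumed sample content, for illustration only.
SAMPLE = "## max size of old space in megabytes\n#--max-old-space-size=4096\n"

print(set_max_old_space(SAMPLE, 6100), end="")
```

The container has to be restarted for a changed node.options to take effect, and the chosen value should stay below the memory actually available to the container.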

My questions are:

To Reproduce Steps to reproduce the behavior:

  1. Open any complex dashboard consisting of multiple items.

Expected behavior In 2.9, our dashboards were rendering properly.

OpenSearch Version 2.12 using Docker image opensearchproject/opensearch:2.12.0

Dashboards Version 2.12 using Docker image opensearchproject/opensearch-dashboards:2.12.0

Plugins Default list that came with distribution.

Screenshots Not applicable.

Host/Environment (please complete the following information):

Additional context No additional context yet.

abbyhu2000 commented 4 months ago

Spike task: look into the performance issue from 2.9 to 2.11. @kavilla @manasvinibs

wbeckler commented 4 months ago

@rlevytskyi would you be willing to share any more details about your settings/plugins/indexes to help us reproduce and diagnose?

rlevytskyi commented 4 months ago

Thank you @wbeckler for your reply! We are running a cluster without dedicated cluster-manager nodes: four nodes act as both data and master-eligible nodes, and there are two coordinating nodes.

wbeckler commented 4 months ago

Is it possible that the memory issue for your data nodes is starving resources from your dashboard containers?

rlevytskyi commented 4 months ago

@wbeckler absolutely not. The VM that runs this Dashboards instance and a coordinating node with a 12 GB heap has 32 GB of RAM. There are no OOM entries or anything similar in the system logs. Adding some memory to Kibana helped.