elastic / kibana

Your window into the Elastic Stack
https://www.elastic.co/products/kibana

Kibana status is Yellow - Plugins with degraded status #138680

Open andreluisdebiasi opened 2 years ago

andreluisdebiasi commented 2 years ago

Kibana version: 7.17.4

Elasticsearch version: 7.17.4

Server OS version: Oracle Linux Server release 7.9 - 4.14.35-2047.513.2.3.el7uek.x86_64

Browser version: 103.0.5060.134

Browser OS version: Google Chrome

Original install method (e.g. download page, yum, from source, etc.): tar.gz

Describe the bug: In Stack Monitoring, Kibana shows the following messages: "Some plugins may be experiencing issues. Please check the Kibana status page" and "Kibana status is Yellow", with many plugins in degraded status. The environment is working fine, but I would like to resolve these yellow alerts.

Expected behavior: Kibana status green

Screenshots (if relevant): Kibana-Yellow-Status

elasticmachine commented 2 years ago

Pinging @elastic/infra-monitoring-ui (Team:Infra Monitoring UI)

weltenwort commented 2 years ago

Hi @andreluisdebiasi :wave: The cause for the degradation of the plugin status should be written to the Kibana log. Could you check whether there's anything to be found there?

smith commented 2 years ago

Is the issue that these plugins are in a green status but the UI is showing them as yellow? Or is it that they are yellow? I'm having trouble understanding if the Stack Monitoring UI is behaving as it should be based on this report.

matthiasledergerber commented 1 year ago

Same issue on Elastic Stack 8.8.2. It has existed for several versions. We run three Kibana nodes, and all of them report the same issue. I'm not sure why this happens or whether there really is a problem.

image

  ID Summary
  security 1 service is degraded: taskManager  
  cloudLinks 1 service is degraded: security  
  data 2 services are degraded: taskManager, security  
  encryptedSavedObjects 1 service is degraded: security  
  files 1 service is degraded: security  
  lists 1 service is degraded: security  
  snapshotRestore 1 service is degraded: security  
  telemetry 1 service is degraded: security  
  actions 3 services are degraded: taskManager, encryptedSavedObjects, security  
  dataViewEditor 1 service is degraded: data  
  dataViewFieldEditor 1 service is degraded: data  
  ecsDataQualityDashboard 1 service is degraded: data  
  eventAnnotation 1 service is degraded: data  
  fileUpload 2 services are degraded: data, security  
  filesManagement 1 service is degraded: files  
  licenseManagement 1 service is degraded: telemetry  
  savedObjects 1 service is degraded: data  
  savedSearch 1 service is degraded: data  
  telemetryManagementSection 1 service is degraded: telemetry  
  unifiedFieldList 1 service is degraded: data  
  ingestPipelines 2 services are degraded: fileUpload, security  
  notifications 1 service is degraded: actions  
  savedObjectsTaggingOss 1 service is degraded: savedObjects  
  watcher 2 services are degraded: data, licenseManagement  
  savedObjectsManagement 2 services are degraded: data, savedObjectsTaggingOss  
  savedObjectsTagging 2 services are degraded: savedObjectsTaggingOss, security  
  embeddable 3 services are degraded: data, savedObjectsManagement, savedObjectsTaggingOss  
  globalSearchBar 1 service is degraded: savedObjectsTagging  
  unifiedSearch 2 services are degraded: data, savedObjectsManagement  
  dataViewManagement 5 services are degraded: data, dataViewFieldEditor, dataViewEditor and 2 other(s)  
  imageEmbeddable 3 services are degraded: embeddable, files, security  
  navigation 1 service is degraded: unifiedSearch  
  presentationUtil 2 services are degraded: savedObjects, embeddable  
  triggersActionsUi 5 services are degraded: data, savedObjects, unifiedSearch and 2 other(s)  
  uiActionsEnhanced 1 service is degraded: embeddable  
  controls 5 services are degraded: presentationUtil, savedObjects, embeddable and 2 other(s)  
  embeddableEnhanced 2 services are degraded: embeddable, uiActionsEnhanced  
  expressionError 1 service is degraded: presentationUtil  
  expressionImage 1 service is degraded: presentationUtil  
  expressionMetric 1 service is degraded: presentationUtil  
  expressionRepeatImage 1 service is degraded: presentationUtil  
  expressionRevealImage 1 service is degraded: presentationUtil  
  expressionShape 1 service is degraded: presentationUtil  
  graph 5 services are degraded: data, navigation, savedObjects and 2 other(s)  
  kibanaOverview 2 services are degraded: navigation, dataViewEditor  
  ruleRegistry 3 services are degraded: data, triggersActionsUi, security  
  stackAlerts 4 services are degraded: unifiedSearch, triggersActionsUi, savedObjects and 1 other(s)  
  stackConnectors 2 services are degraded: actions, triggersActionsUi  
  transform 5 services are degraded: data, triggersActionsUi, unifiedSearch and 2 other(s)  
  urlDrilldown 2 services are degraded: embeddable, uiActionsEnhanced  
  visualizations 9 services are degraded: data, navigation, embeddable and 6 other(s)  
  dashboard 12 services are degraded: data, dataViewEditor, embeddable and 9 other(s)  
  expressionGauge 3 services are degraded: visualizations, presentationUtil, data  
  expressionHeatmap 3 services are degraded: visualizations, presentationUtil, data  
  expressionLegacyMetricVis 2 services are degraded: visualizations, presentationUtil  
  expressionMetricVis 2 services are degraded: visualizations, presentationUtil  
  expressionPartitionVis 3 services are degraded: data, visualizations, presentationUtil  
  expressionTagcloud 2 services are degraded: visualizations, presentationUtil  
  expressionXY 3 services are degraded: data, eventAnnotation, visualizations  
  visDefaultEditor 1 service is degraded: visualizations  
  visTypeHeatmap 2 services are degraded: data, visualizations  
  visTypeMarkdown 1 service is degraded: visualizations  
  visTypeMetric 2 services are degraded: data, visualizations  
  visTypeTable 1 service is degraded: visualizations  
  visTypeTagcloud 2 services are degraded: data, visualizations  
  visTypeTimelion 2 services are degraded: visualizations, data  
  visTypeTimeseries 3 services are degraded: data, visualizations, unifiedSearch  
  visTypeVega 2 services are degraded: data, visualizations  
  visTypeVislib 2 services are degraded: data, visualizations  
  visTypeXy 2 services are degraded: visualizations, data  
  dashboardEnhanced 5 services are degraded: dashboard, data, embeddable and 2 other(s)  
  inputControlVis 4 services are degraded: data, visDefaultEditor, visualizations and 1 other(s)  
  lens 20 services are degraded: data, navigation, visualizations and 17 other(s)  
  visTypeGauge 3 services are degraded: data, visualizations, expressionGauge  
  visTypePie 3 services are degraded: data, visualizations, expressionPartitionVis  
  aiops 3 services are degraded: data, lens, unifiedFieldList  
  cases 11 services are degraded: actions, data, embeddable and 8 other(s)  
  discover 12 services are degraded: data, embeddable, navigation and 9 other(s)  
  maps 13 services are degraded: controls, unifiedSearch, lens and 10 other(s)  
  dataVisualizer 9 services are degraded: data, embeddable, discover and 6 other(s)  
  discoverEnhanced 2 services are degraded: embeddable, discover  
  observabilityShared 1 service is degraded: cases  
  reporting 5 services are degraded: data, discover, taskManager and 2 other(s)  
  threatIntelligence 4 services are degraded: cases, data, navigation and 1 other(s)  
  timelines 3 services are degraded: cases, data, security  
  canvas 13 services are degraded: data, embeddable, expressionError and 10 other(s)  
  cloudSecurityPosture 6 services are degraded: navigation, data, unifiedSearch and 3 other(s)  
  exploratoryView 10 services are degraded: cases, data, files and 7 other(s)  
  indexManagement 1 service is degraded: security  
  ml 15 services are degraded: aiops, data, dataVisualizer and 12 other(s)  
  osquery 11 services are degraded: actions, data, discover and 8 other(s)  
  sessionView 3 services are degraded: data, timelines, ruleRegistry  
  indexLifecycleManagement 1 service is degraded: indexManagement  
  kubernetesSecurity 4 services are degraded: data, timelines, ruleRegistry and 1 other(s)  
  observability 13 services are degraded: cases, data, embeddable and 10 other(s)  
  remoteClusters 1 service is degraded: indexManagement  
  rollup 2 services are degraded: indexManagement, visTypeTimeseries  
  cloudDefend 5 services are degraded: navigation, data, unifiedSearch and 2 other(s)  
  crossClusterReplication 2 services are degraded: remoteClusters, indexManagement  
  infra 14 services are degraded: cases, data, discover and 11 other(s)  
  observabilityOnboarding 2 services are degraded: data, observability  
  synthetics 16 services are degraded: actions, cases, embeddable and 13 other(s)  
  apm 17 services are degraded: data, embeddable, infra and 14 other(s)  
  enterpriseSearch 7 services are degraded: security, data, discover and 4 other(s)  
  monitoring 9 services are degraded: data, navigation, observability and 6 other(s)  
  securitySolution 30 services are degraded: actions, cases, cloudDefend and 27 other(s)  
  upgradeAssistant 3 services are degraded: data, security, infra  
  logstash 2 services are degraded: monitoring, security  
  ux 13 services are degraded: data, exploratoryView, triggersActionsUi and 10 other(s)  
  taskManager Task Manager is unhealthy - Reason: setting HealthStatus.Error because of expired cold timestamps  
  licensing License fetched  
  banners All dependencies are available  
  customBranding All dependencies are available  
  features All dependencies are available  
  globalSearch All dependencies are available  
  mapsEms All dependencies are available  
  globalSearchProviders All dependencies are available  
  guidedOnboarding All dependencies are available  
  home All dependencies are available  
  console All dependencies are available  
  grokdebugger All dependencies are available  
  management All dependencies are available  
  painlessLab All dependencies are available  
  searchprofiler All dependencies are available  
  advancedSettings All dependencies are available  
  cloudDataMigration All dependencies are available  
  spaces All dependencies are available  
  eventLog All dependencies are available  
  alerting Alerting is (probably) ready  
  fleet Fleet is available  
  assetManager All dependencies are available  
  bfetch All dependencies are available  
  contentManagement All dependencies are available  
  customIntegrations All dependencies are available  
  esUiShared All dependencies are available  
  expressions All dependencies are available  
  fieldFormats All dependencies are available  
  ftrApis All dependencies are available  
  kibanaReact All dependencies are available  
  kibanaUtils All dependencies are available  
  licenseApiGuard All dependencies are available  
  monitoringCollection All dependencies are available  
  runtimeFields All dependencies are available  
  savedObjectsFinder All dependencies are available  
  screenshotMode All dependencies are available  
  share All dependencies are available  
  translations All dependencies are available  
  unifiedHistogram All dependencies are available  
  urlForwarding All dependencies are available  
  usageCollection All dependencies are available  
  charts All dependencies are available  
  cloud All dependencies are available  
  dataViews All dependencies are available  
  devTools All dependencies are available  
  inspector All dependencies are available  
  kibanaUsageCollection All dependencies are available  
  newsfeed All dependencies are available  
  telemetryCollectionManager All dependencies are available  
  screenshotting All dependencies are available  
  telemetryCollectionXpack All dependencies are available  
  uiActions All dependencies are available

matthiasledergerber commented 1 year ago

In kibana.log we see the following error, which recurs about once a minute:

{"service":{"node":{"roles":["background_tasks","ui"]}},"ecs":{"version":"8.6.1"},"@timestamp":"2023-06-29T20:58:14.179+02:00","message":"[WorkloadAggregator]: ResponseError: search_phase_execution_exception\n\tRoot causes:\n\t\tparse_exception: operator not supported for date math [+12500ms]","log":{"level":"ERROR","logger":"plugins.taskManager"},"process":{"pid":972},"trace":{"id":"905669459182629193f7274c8ac06691"},"transaction":{"id":"cc7a2fdd9f4b73c8"}}
{"service":{"node":{"roles":["background_tasks","ui"]}},"ecs":{"version":"8.6.1"},"@timestamp":"2023-06-29T20:59:14.184+02:00","message":"[WorkloadAggregator]: ResponseError: search_phase_execution_exception\n\tRoot causes:\n\t\tparse_exception: operator not supported for date math [+12500ms]","log":{"level":"ERROR","logger":"plugins.taskManager"},"process":{"pid":972},"trace":{"id":"905669459182629193f7274c8ac06691"},"transaction":{"id":"cc7a2fdd9f4b73c8"}}
{"service":{"node":{"roles":["background_tasks","ui"]}},"ecs":{"version":"8.6.1"},"@timestamp":"2023-06-29T21:00:14.189+02:00","message":"[WorkloadAggregator]: ResponseError: search_phase_execution_exception\n\tRoot causes:\n\t\tparse_exception: operator not supported for date math [+12500ms]","log":{"level":"ERROR","logger":"plugins.taskManager"},"process":{"pid":972},"trace":{"id":"905669459182629193f7274c8ac06691"},"transaction":{"id":"cc7a2fdd9f4b73c8"}}
{"service":{"node":{"roles":["background_tasks","ui"]}},"ecs":{"version":"8.6.1"},"@timestamp":"2023-06-29T21:01:14.194+02:00","message":"[WorkloadAggregator]: ResponseError: search_phase_execution_exception\n\tRoot causes:\n\t\tparse_exception: operator not supported for date math [+12500ms]","log":{"level":"ERROR","logger":"plugins.taskManager"},"process":{"pid":972},"trace":{"id":"905669459182629193f7274c8ac06691"},"transaction":{"id":"cc7a2fdd9f4b73c8"}}
{"service":{"node":{"roles":["background_tasks","ui"]}},"ecs":{"version":"8.6.1"},"@timestamp":"2023-06-29T21:02:14.241+02:00","message":"[WorkloadAggregator]: ResponseError: search_phase_execution_exception\n\tRoot causes:\n\t\tparse_exception: operator not supported for date math [+12500ms]","log":{"level":"ERROR","logger":"plugins.taskManager"},"process":{"pid":972},"trace":{"id":"905669459182629193f7274c8ac06691"},"transaction":{"id":"cc7a2fdd9f4b73c8"}}
{"service":{"node":{"roles":["background_tasks","ui"]}},"ecs":{"version":"8.6.1"},"@timestamp":"2023-06-29T21:03:14.245+02:00","message":"[WorkloadAggregator]: ResponseError: search_phase_execution_exception\n\tRoot causes:\n\t\tparse_exception: operator not supported for date math [+12500ms]","log":{"level":"ERROR","logger":"plugins.taskManager"},"process":{"pid":972},"trace":{"id":"905669459182629193f7274c8ac06691"},"transaction":{"id":"cc7a2fdd9f4b73c8"}}
{"service":{"node":{"roles":["background_tasks","ui"]}},"ecs":{"version":"8.6.1"},"@timestamp":"2023-06-29T21:04:14.250+02:00","message":"[WorkloadAggregator]: ResponseError: search_phase_execution_exception\n\tRoot causes:\n\t\tparse_exception: operator not supported for date math [+12500ms]","log":{"level":"ERROR","logger":"plugins.taskManager"},"process":{"pid":972},"trace":{"id":"905669459182629193f7274c8ac06691"},"transaction":{"id":"cc7a2fdd9f4b73c8"}}
{"service":{"node":{"roles":["background_tasks","ui"]}},"ecs":{"version":"8.6.1"},"@timestamp":"2023-06-29T21:05:14.255+02:00","message":"[WorkloadAggregator]: ResponseError: search_phase_execution_exception\n\tRoot causes:\n\t\tparse_exception: operator not supported for date math [+12500ms]","log":{"level":"ERROR","logger":"plugins.taskManager"},"process":{"pid":972},"trace":{"id":"905669459182629193f7274c8ac06691"},"transaction":{"id":"cc7a2fdd9f4b73c8"}}
{"service":{"node":{"roles":["background_tasks","ui"]}},"ecs":{"version":"8.6.1"},"@timestamp":"2023-06-29T21:06:14.260+02:00","message":"[WorkloadAggregator]: ResponseError: search_phase_execution_exception\n\tRoot causes:\n\t\tparse_exception: operator not supported for date math [+12500ms]","log":{"level":"ERROR","logger":"plugins.taskManager"},"process":{"pid":972},"trace":{"id":"905669459182629193f7274c8ac06691"},"transaction":{"id":"cc7a2fdd9f4b73c8"}}
{"service":{"node":{"roles":["background_tasks","ui"]}},"ecs":{"version":"8.6.1"},"@timestamp":"2023-06-29T21:07:14.264+02:00","message":"[WorkloadAggregator]: ResponseError: search_phase_execution_exception\n\tRoot causes:\n\t\tparse_exception: operator not supported for date math [+12500ms]","log":{"level":"ERROR","logger":"plugins.taskManager"},"process":{"pid":972},"trace":{"id":"905669459182629193f7274c8ac06691"},"transaction":{"id":"cc7a2fdd9f4b73c8"}}
{"service":{"node":{"roles":["background_tasks","ui"]}},"ecs":{"version":"8.6.1"},"@timestamp":"2023-06-29T21:08:14.269+02:00","message":"[WorkloadAggregator]: ResponseError: search_phase_execution_exception\n\tRoot causes:\n\t\tparse_exception: operator not supported for date math [+12500ms]","log":{"level":"ERROR","logger":"plugins.taskManager"},"process":{"pid":972},"trace":{"id":"905669459182629193f7274c8ac06691"},"transaction":{"id":"cc7a2fdd9f4b73c8"}}
{"service":{"node":{"roles":["background_tasks","ui"]}},"ecs":{"version":"8.6.1"},"@timestamp":"2023-06-29T21:09:14.273+02:00","message":"[WorkloadAggregator]: ResponseError: search_phase_execution_exception\n\tRoot causes:\n\t\tparse_exception: operator not supported for date math [+12500ms]","log":{"level":"ERROR","logger":"plugins.taskManager"},"process":{"pid":972},"trace":{"id":"905669459182629193f7274c8ac06691"},"transaction":{"id":"cc7a2fdd9f4b73c8"}}
{"service":{"node":{"roles":["background_tasks","ui"]}},"ecs":{"version":"8.6.1"},"@timestamp":"2023-06-29T21:10:14.277+02:00","message":"[WorkloadAggregator]: ResponseError: search_phase_execution_exception\n\tRoot causes:\n\t\tparse_exception: operator not supported for date math [+12500ms]","log":{"level":"ERROR","logger":"plugins.taskManager"},"process":{"pid":972},"trace":{"id":"905669459182629193f7274c8ac06691"},"transaction":{"id":"cc7a2fdd9f4b73c8"}}
{"service":{"node":{"roles":["background_tasks","ui"]}},"ecs":{"version":"8.6.1"},"@timestamp":"2023-06-29T21:11:14.282+02:00","message":"[WorkloadAggregator]: ResponseError: search_phase_execution_exception\n\tRoot causes:\n\t\tparse_exception: operator not supported for date math [+12500ms]","log":{"level":"ERROR","logger":"plugins.taskManager"},"process":{"pid":972},"trace":{"id":"905669459182629193f7274c8ac06691"},"transaction":{"id":"cc7a2fdd9f4b73c8"}}
{"service":{"node":{"roles":["background_tasks","ui"]}},"ecs":{"version":"8.6.1"},"@timestamp":"2023-06-29T21:12:14.287+02:00","message":"[WorkloadAggregator]: ResponseError: search_phase_execution_exception\n\tRoot causes:\n\t\tparse_exception: operator not supported for date math [+12500ms]","log":{"level":"ERROR","logger":"plugins.taskManager"},"process":{"pid":972},"trace":{"id":"905669459182629193f7274c8ac06691"},"transaction":{"id":"cc7a2fdd9f4b73c8"}}
{"service":{"node":{"roles":["background_tasks","ui"]}},"ecs":{"version":"8.6.1"},"@timestamp":"2023-06-29T21:13:14.291+02:00","message":"[WorkloadAggregator]: ResponseError: search_phase_execution_exception\n\tRoot causes:\n\t\tparse_exception: operator not supported for date math [+12500ms]","log":{"level":"ERROR","logger":"plugins.taskManager"},"process":{"pid":972},"trace":{"id":"905669459182629193f7274c8ac06691"},"transaction":{"id":"cc7a2fdd9f4b73c8"}}
{"service":{"node":{"roles":["background_tasks","ui"]}},"ecs":{"version":"8.6.1"},"@timestamp":"2023-06-29T21:14:14.295+02:00","message":"[WorkloadAggregator]: ResponseError: search_phase_execution_exception\n\tRoot causes:\n\t\tparse_exception: operator not supported for date math [+12500ms]","log":{"level":"ERROR","logger":"plugins.taskManager"},"process":{"pid":972},"trace":{"id":"905669459182629193f7274c8ac06691"},"transaction":{"id":"cc7a2fdd9f4b73c8"}}
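Since the entries above differ only in their timestamps, grouping them by logger and message makes the pattern easier to see. The following is a minimal sketch, assuming the log file is newline-delimited JSON with the `log.level`, `log.logger`, and `message` fields shown above; the function name `summarize_errors` is illustrative and not part of Kibana.

```python
import json
from collections import Counter


def summarize_errors(lines):
    """Count ERROR entries per (logger, first line of the message)."""
    counts = Counter()
    for line in lines:
        line = line.strip()
        if not line:
            continue
        try:
            entry = json.loads(line)
        except json.JSONDecodeError:
            continue  # skip anything that is not a JSON log entry
        log = entry.get("log", {})
        if log.get("level") != "ERROR":
            continue
        # Keep only the first line of a multi-line message as the grouping key.
        first_line = entry.get("message", "").split("\n", 1)[0]
        counts[(log.get("logger"), first_line)] += 1
    return counts


# Two shortened entries in the same shape as the log lines above.
sample = [
    r'{"@timestamp":"2023-06-29T20:58:14.179+02:00","message":"[WorkloadAggregator]: ResponseError: search_phase_execution_exception\n\tRoot causes:\n\t\tparse_exception: operator not supported for date math [+12500ms]","log":{"level":"ERROR","logger":"plugins.taskManager"}}',
    r'{"@timestamp":"2023-06-29T20:59:14.184+02:00","message":"[WorkloadAggregator]: ResponseError: search_phase_execution_exception\n\tRoot causes:\n\t\tparse_exception: operator not supported for date math [+12500ms]","log":{"level":"ERROR","logger":"plugins.taskManager"}}',
]

for (logger, message), count in summarize_errors(sample).items():
    print(count, logger, message)
```

To run it against a real file, pass an open file handle instead of `sample`; the path to the log depends on the installation.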

This would relate to the image above:

{ "level": "degraded", "summary": "Task Manager is unhealthy - Reason: setting HealthStatus.Error because of expired cold timestamps" }
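To separate plugins that are degraded in their own right from plugins that only inherit the state, the summaries can be filtered by their wording. This is a rough sketch, assuming a mapping from plugin id to an object with `level` and `summary` keys like the snippet above, that healthy plugins report the level `available`, and that inherited states always use the "N service(s) is/are degraded: ..." wording from the status table. The name `root_causes` is made up for this example.

```python
import re

# Summaries inherited from a dependency follow this wording in the status table.
DEPENDENCY_SUMMARY = re.compile(r"^\d+ services? (?:is|are) degraded: ")


def root_causes(plugins):
    """Return ids of plugins that are not 'available' for a reason of their own."""
    return sorted(
        plugin_id
        for plugin_id, status in plugins.items()
        if status.get("level") != "available"  # assumed name of the healthy level
        and not DEPENDENCY_SUMMARY.match(status.get("summary", ""))
    )


# A few rows from the status table, in the assumed mapping shape.
plugins = {
    "taskManager": {
        "level": "degraded",
        "summary": "Task Manager is unhealthy - Reason: setting "
        "HealthStatus.Error because of expired cold timestamps",
    },
    "security": {"level": "degraded", "summary": "1 service is degraded: taskManager"},
    "data": {"level": "degraded", "summary": "2 services are degraded: taskManager, security"},
    "licensing": {"level": "available", "summary": "License fetched"},
}

print(root_causes(plugins))
```

For these rows only `taskManager` remains, which matches the table: every other degraded entry names another plugin as the reason.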

smith commented 1 year ago

Thanks for posting that @matthiasledergerber. Would you say the Stack Monitoring UI is behaving correctly here? The errors in the logs look like they're coming from a search exception in a background task.

The original issue description says, "The environment is working fine, but I would like to resolve these yellow alerts." Are these valid alerts?

If the Stack Monitoring UI is behaving as expected here, this issue should be closed. If not, we should investigate further and make sure the alerts are displaying correctly.

matthiasledergerber commented 1 year ago

I cannot confirm or deny. How can I track down the search that is related to this issue? It seems like the error is cascading through all the other services. I question whether this is intended, because to the user it looks like Kibana is failing. Furthermore, the Monitoring overview prominently indicates that something is wrong with Kibana.
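The cascade can be checked from the status table itself by following each "degraded" summary back to the plugins it names. The sketch below assumes rows in the "id N service(s) is/are degraded: a, b" form used in the table; `parse_dependencies` and `reachable_roots` are hypothetical helper names. Rows that end in "and N other(s)" hide some dependencies, so results for those plugins are incomplete.

```python
import re

ROW = re.compile(r"^(\w+) \d+ services? (?:is|are) degraded: (.+)$")


def parse_dependencies(rows):
    """Map each plugin id to the degraded dependencies named in its summary."""
    deps = {}
    for row in rows:
        match = ROW.match(row.strip())
        if not match:
            continue  # not a "degraded because of a dependency" row
        # Drop the truncation suffix; the hidden names cannot be recovered.
        names = re.sub(r" and \d+ other\(s\)$", "", match.group(2))
        deps[match.group(1)] = [name.strip() for name in names.split(",")]
    return deps


def reachable_roots(plugin, deps):
    """Follow dependencies and return the plugins that name no further cause."""
    roots, stack, seen = set(), [plugin], set()
    while stack:
        current = stack.pop()
        if current in seen:
            continue
        seen.add(current)
        if current in deps:
            stack.extend(deps[current])
        else:
            roots.add(current)  # no dependency row: treated as an origin
    return roots


rows = [
    "  security 1 service is degraded: taskManager  ",
    "  data 2 services are degraded: taskManager, security  ",
    "  fileUpload 2 services are degraded: data, security  ",
    "  ingestPipelines 2 services are degraded: fileUpload, security  ",
]

deps = parse_dependencies(rows)
print(reachable_roots("ingestPipelines", deps))
```

For these four rows every path ends at `taskManager`, which is consistent with a single unhealthy plugin propagating through its dependents.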

weltenwort commented 1 year ago

The specific error message seems to come from the taskManager plugin. Maybe @elastic/response-ops can help with diagnosing the underlying reason?
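One possible lead on the underlying reason: the Elasticsearch date math documentation lists the units `y`, `M`, `w`, `d`, `h`, `H`, `m`, and `s`, with no millisecond unit, so an offset such as `+12500ms` is not a valid expression. The sketch below is a simplified model for illustration only, not Elasticsearch's actual parser. It consumes valid steps from the left and reports what is left over.

```python
import re

# Units listed in the Elasticsearch date math documentation; there is no "ms".
UNITS = "yMwdhHms"
# One step: "+N<unit>" or "-N<unit>", or a rounding step "/<unit>".
STEP = re.compile(rf"(?:[+-]\d+|/)[{UNITS}]")


def leftover_after_steps(expr):
    """Consume valid date math steps from the left and return the remainder."""
    pos = 0
    while pos < len(expr):
        match = STEP.match(expr, pos)
        if not match:
            break
        pos = match.end()
    return expr[pos:]


for expr in ["+12500ms", "+12s", "+1h/d"]:
    rest = leftover_after_steps(expr)
    print(expr, "valid" if rest == "" else f"invalid, leftover {rest!r}")
```

In this model `+12500m` is read as a complete step in minutes and the trailing `s` is left where the next operator would be expected, which may be why the log message complains about an operator rather than a unit.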

matthiasledergerber commented 1 year ago

The issue still persists with Kibana 8.10.0.

matthiasledergerber commented 1 year ago

related: https://github.com/elastic/kibana/pull/169447