wazuh / wazuh-dashboard-plugins

Plugins for Wazuh Dashboard
https://wazuh.com/
GNU General Public License v2.0
432 stars 177 forks source link

Unhandled promise error when unable to reach Wazuh API #3887

Closed gdiazlo closed 2 years ago

gdiazlo commented 2 years ago
Wazuh Elastic Rev Security
4.3 7.16+ Basic, ODFE, Xpack
Browser
Chrome, Firefox, Safari, etc

Description

We have discovered thanks to https://github.com/elastic/kibana/issues/118536 that we have an unhandled promise when a request to the Wazuh API fails because of a connection error. This can make the process crash depending on the configuration of the node.config as stated here https://github.com/elastic/kibana/blob/main/config/node.options#L9

That configuration is disabled by default in opensearch fork.

Steps to reproduce

  1. Navigate to a dashboard with an API call
  2. Check it works
  3. Take down Wazuh API
  4. Try to call again to the API
  5. check whether the message about an unhandled promise appears in kibana logs.

Expected Result

  1. The kibana server does not crash, and do not have an unhandled promise

Actual Result

  1. Opensearch dashboard crashes with the default config, kibana does not, but the eror about the unhandled promise is shown.

We need to fix all unhandled promises we find.

Desvelao commented 2 years ago

Research

I tried to replicate the reported problem that it states the plugin platform crashed due to the modulesd Wazuh module not working and it causes some API request fails for the statistics job of Wazuh app.

@asteriscos could replicate the problem and we were doing peer to locate and analyze the unhandled promise.

The problem was a supposed logic to build a chain of promises and resolves it with an await. The resolution of the chain of promises was in a try/catch block and it is expected the catch block captures if some promise of the chain was rejected. This is not working as expected due to the method to build the chain of promises was correct.

The try/catch block: https://github.com/wazuh/wazuh-kibana-app/blob/4.3-7.10/server/start/cron-scheduler/scheduler-job.ts#L31-L46

try {
    const hosts = await this.getApiObjects();
    const data = await hosts.reduce(async (acc: Promise<object[]>, host) => {
    const { status } = configuredJobs({ host, jobName: this.jobName })[this.jobName];
    if (!status) return acc;
    const response = await this.getResponses(host);
    const accResolve = await Promise.resolve(acc)
    return [
        ...accResolve,
        ...response,
    ];
    }, Promise.resolve([]));
    !!data.length && await this.saveDocument.save(data, index);
} catch (error) {
    ErrorHandler(error, this.logger);
}

The supposed chain of promises was generated with Array.prototype.reduce, but the reduce function should return directly a promise to the chain of promises is built correctly and can resolve the chain with the await. Instead of some async actions are done and these are not managed causes an error that the catch could capture.

@asteriscos suggested replacing the chain of promises with running them in parallel because: + it is faster than resolving each one synchronously. + we don't see the requirement to run them synchronously. - it increases the number of requests in an interval of time to the API, affecting to the API request limit.

asteriscos commented 2 years ago

After forcing an invalid state in a custom decoder, I was able to replicate the issue. We traced the exception to promises rejected when trying to execute cron-jobs.

Screenshot from 2022-04-12 12-16-55

After analyzing the error handling of reducing the promises, we decided a simultaneous execution of the promises would be faster and more stable. In case one of the promises fails it would not affect the others, thus we took a Promise.all approach to solve the issue.

Desvelao commented 2 years ago

Research

I was researching the configuration of NodeJS with the --unhandled-rejections=warn flag in the node.options file that is interpreted by the bin/opensearch-dashboards executable script for the Wazuh dashboard package. The flag is added to the NODE_OPTIONS variable that NodeJS should take into account.

Causing an unhandled rejection

I edited the server-side code of the Wazuh plugin to cause an unhandled promise and restart the server.

// Added to the `start` function
// /usr/share/wazuh-dashboard/plugins/wazuh/server/plugin.js
// Reject the promise in 5 seconds
(async () => new Promise((res,rej) => setTimeout(rej,5000)))()

I tested without and with the --unhandled-rejections=warn flag and the server crashed anyways due to the unhanded promise.

Digging the cause the process exists

Taking into account the log of the unhandled rejection:

Apr 19 10:06:43 localhost.localdomain opensearch-dashboards[8420]: Unhandled Promise rejection detected:
Apr 19 10:06:43 localhost.localdomain opensearch-dashboards[8420]: undefined
Apr 19 10:06:43 localhost.localdomain opensearch-dashboards[8420]: Terminating process...

I searched it in the OpenSearch Dashboards production code and I found a file where it defines the logging of the unhandled promise and exits the process. https://github.com/opensearch-project/OpenSearch-Dashboards/blob/1.2.0/src/setup_node_env/exit_on_warning.js#L61. There is a handler for unhandledRejection event that contains the logic to exist the node app when one is caught. There is another process handler that could exist the node app. Both process handlers are added when the variable process.noProcessWarnings is not true. https://github.com/opensearch-project/OpenSearch-Dashboards/blob/1.2.0/src/setup_node_env/exit_on_warning.js#L33-L63

The --no-warnings flag should set the process.noProcessWarnings variable to true and the mentioned code block should be ignored. Despite the --no-warnings flag is included in the NODE_OPTIONS environment variable, by some reason I don't know, NodeJS doesn't set the process.noProcessWarnings to true and as a result, the process handlers are added. I am not sure if the other NodeJS options are not applicating as expected.

The OpenSearch Dashboards app is initialized with:

NODE_OPTIONS="--no-warnings --max-http-header-size=65536 $OSD_NODE_OPTS $NODE_OPTIONS" NODE_ENV=production exec "${NODE}" "${DIR}/src/cli/dist" ${@}

I don't know the reason, but in my testings, if the --no-warnings NodeJS option is passed as an option to the binary, this seems to set the process.noProcessWarnigns and as a result ignore to set the 2 process handlers (warning and unhandledRejection). Something like:

NODE_OPTIONS="--no-warnings --max-http-header-size=65536 $OSD_NODE_OPTS $NODE_OPTIONS" NODE_ENV=production exec "${NODE}" "--no-warnings" "${DIR}/src/cli/dist" ${@}

SCRIPT may be an arbitrarily deep series of symlinks. Loop until we have the concrete path.

while [ -h "$SCRIPT" ] ; do ls=$(ls -ld "$SCRIPT")

Drop everything prior to ->

link=$(expr "$ls" : '.-> (.)$') if expr "$link" : '/.*' > /dev/null; then SCRIPT="$link" else SCRIPT=$(dirname "$SCRIPT")/"$link" fi done

DIR="$(dirname "${SCRIPT}")/.." CONFIG_DIR=${OSD_PATH_CONF:-"/etc/wazuh-dashboard"}

if [ -x "${DIR}/node/bin/node" ]; then NODE="${DIR}/node/bin/node" else NODE="$(which node)" fi

if [ ! -x "$NODE" ]; then echo "unable to find usable node.js executable." exit 1 fi

if [ -f "${CONFIG_DIR}/node.options" ]; then OSD_NODE_OPTS="$(grep -v ^# < ${CONFIG_DIR}/node.options | xargs)" fi

NODE_OPTIONS="--no-warnings --max-http-header-size=65536 $OSD_NODE_OPTS $NODE_OPTIONS" NODE_ENV=production exec "${NODE}" "${DIR}/src/cli/dist" ${@}


## Other tests:
### Wazuh dashboard Docker image :heavy_check_mark:
I added the same unhandled promise to the Wazuh server-side code and the server doesn't crash due to it. I checked the value of `process.noProcessWarnings` and it is set to `true` due to the `--no-warnings` option is evaluated correctly.

- Log of the unhandled rejection

wz_indexer-wz_cluster-agent-wazuh-dashboard-1 | {"type":"log","@timestamp":"2022-04-19T10:51:04Z","tags":["warning","environment"],"pid":9,"message":"Detected an unhandled Promise rejection.\nundefined"}


- `bin/opensearch-dashboards` script:
```sh
#!/bin/sh
SCRIPT=$0

# SCRIPT may be an arbitrarily deep series of symlinks. Loop until we have the concrete path.
while [ -h "$SCRIPT" ] ; do
  ls=$(ls -ld "$SCRIPT")
  # Drop everything prior to ->
  link=$(expr "$ls" : '.*-> \(.*\)$')
  if expr "$link" : '/.*' > /dev/null; then
    SCRIPT="$link"
  else
    SCRIPT=$(dirname "$SCRIPT")/"$link"
  fi
done

DIR="$(dirname "${SCRIPT}")/.."
CONFIG_DIR=${OSD_PATH_CONF:-"$DIR/config"}

if [ -x "${DIR}/node/bin/node" ]; then
  NODE="${DIR}/node/bin/node"
else
  NODE="$(which node)"
fi

if [ ! -x "$NODE" ]; then
  echo "unable to find usable node.js executable."
  exit 1
fi

if [ -f "${CONFIG_DIR}/node.options" ]; then
  OSD_NODE_OPTS="$(grep -v ^# < ${CONFIG_DIR}/node.options | xargs)"
fi

NODE_OPTIONS="--no-warnings --max-http-header-size=65536 $OSD_NODE_OPTS $NODE_OPTIONS" NODE_ENV=production exec "${NODE}" "${DIR}/src/cli/dist" ${@}

Kibana Docker image :heavy_check_mark:

I added the same unhandled promise to the Wazuh server-side code and the server doesn't crash. I checked the value of process.noProcessWarnings and it is set to true due to the --no-warnings option is being evaluated correctly.

SCRIPT may be an arbitrarily deep series of symlinks. Loop until we have the concrete path.

while [ -h "$SCRIPT" ] ; do ls=$(ls -ld "$SCRIPT")

Drop everything prior to ->

link=$(expr "$ls" : '.-> (.)$') if expr "$link" : '/.*' > /dev/null; then SCRIPT="$link" else SCRIPT=$(dirname "$SCRIPT")/"$link" fi done

DIR="$(dirname "${SCRIPT}")/.." CONFIG_DIR=${KBN_PATH_CONF:-"$DIR/config"} NODE="${DIR}/node/bin/node" test -x "$NODE" if [ ! -x "$NODE" ]; then echo "unable to find usable node.js executable." exit 1 fi

if [ -f "${CONFIG_DIR}/node.options" ]; then KBN_NODE_OPTS="$(grep -v ^# < ${CONFIG_DIR}/node.options | xargs)" fi

NODE_OPTIONS="--no-warnings --max-http-header-size=65536 $KBN_NODE_OPTS $NODE_OPTIONS" NODE_ENV=production exec "${NODE}" "${DIR}/src/cli/dist" ${@}



# Conclustion
It seems there is some problem in the package or environment of the Wazuh dashboard, that causes the options set in the `NODE_OPTIONS` environment variable are not evaluated as expected.
Desvelao commented 2 years ago

Research

I was trying with a Kibana in production installed in a VM using the all-in-one deployment script and if provokes an unhandled promise, the process exists.

Apr 19 13:11:04 localhost.localdomain kibana[6475]: Unhandled Promise rejection detected:
Apr 19 13:11:04 localhost.localdomain kibana[6475]: undefined
Apr 19 13:11:04 localhost.localdomain kibana[6475]: Terminating process...

But the server restarts due to the Restart setting in the service definition file:

# Copyright 2019 Amazon.com, Inc. or its affiliates. All Rights Reserved.
#
# Licensed under the Apache License, Version 2.0 (the "License").
# You may not use this file except in compliance with the License.
# A copy of the License is located at
#
#     http://www.apache.org/licenses/LICENSE-2.0
#
# or in the "license" file accompanying this file. This file is distributed
# on an "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either
# express or implied. See the License for the specific language governing
# permissions and limitations under the License.

# Description:
# Default kibana.service file

[Unit]
Description=Kibana

[Service]
Type=simple
User=kibana
Group=kibana
# Load env vars from /etc/default/ and /etc/sysconfig/ if they exist.
# Prefixing the path with '-' makes it try to load, but if the file doesn't
# exist, it continues onward.
EnvironmentFile=-/etc/default/kibana
EnvironmentFile=-/etc/sysconfig/kibana
ExecStart=/usr/share/kibana/bin/kibana "-c /etc/kibana/kibana.yml"
Restart=always
WorkingDirectory=/

[Install]
WantedBy=multi-user.target

The service definition file for Wazuh dashboard has no the Restart setting. We could be interested in to adding it.

c-bordon commented 2 years ago

Apparently, the problem is that despite the environment variables being loaded, node js does not recognize them, the only way I found to make it work at the moment is:

NODE_OPTIONS="--no-warnings --max-http-header-size=65536 $OSD_NODE_OPTS $NODE_OPTIONS" 
NODE_ENV=production "${NODE}" ${NODE_OPTIONS} "${DIR}/src/cli/dist" ${@}

This is the log with this change:

-- The start-up result is done.
Apr 26 18:53:32 centos7-1 opensearch-dashboards[30397]: { NODE_ENV: 'production',
Apr 26 18:53:32 centos7-1 opensearch-dashboards[30397]: KILL_ON_STOP_TIMEOUT: '0',
Apr 26 18:53:32 centos7-1 opensearch-dashboards[30397]: chroot: '/',
Apr 26 18:53:32 centos7-1 opensearch-dashboards[30397]: SHELL: '/sbin/nologin',
Apr 26 18:53:32 centos7-1 opensearch-dashboards[30397]: nice: '',
Apr 26 18:53:32 centos7-1 opensearch-dashboards[30397]: user: 'wazuh-dashboard',
Apr 26 18:53:32 centos7-1 opensearch-dashboards[30397]: USER: 'wazuh-dashboard',
Apr 26 18:53:32 centos7-1 opensearch-dashboards[30397]: PATH: '/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin',
Apr 26 18:53:32 centos7-1 opensearch-dashboards[30397]: PWD: '/usr/share/wazuh-dashboard',
Apr 26 18:53:32 centos7-1 opensearch-dashboards[30397]: LANG: 'en_US.UTF-8',
Apr 26 18:53:32 centos7-1 opensearch-dashboards[30397]: SHLVL: '1',
Apr 26 18:53:32 centos7-1 opensearch-dashboards[30397]: HOME: '/usr/share/wazuh-dashboard/',
Apr 26 18:53:32 centos7-1 opensearch-dashboards[30397]: group: 'wazuh-dashboard',
Apr 26 18:53:32 centos7-1 opensearch-dashboards[30397]: LOGNAME: 'wazuh-dashboard',
Apr 26 18:53:32 centos7-1 opensearch-dashboards[30397]: chdir: '/',
Apr 26 18:53:32 centos7-1 opensearch-dashboards[30397]: _: '/usr/share/wazuh-dashboard/bin/../node/bin/node' }
Apr 26 18:53:32 centos7-1 opensearch-dashboards[30397]: true

This is the log without this change:

Apr 26 18:49:53 centos7-1 opensearch-dashboards[30338]: { NODE_ENV: 'production',
Apr 26 18:49:53 centos7-1 opensearch-dashboards[30338]: NODE_OPTIONS:
Apr 26 18:49:53 centos7-1 opensearch-dashboards[30338]: '--no-warnings --max-http-header-size=65536 --unhandled-rejections=warn ',
Apr 26 18:49:53 centos7-1 opensearch-dashboards[30338]: KILL_ON_STOP_TIMEOUT: '0',
Apr 26 18:49:53 centos7-1 opensearch-dashboards[30338]: chroot: '/',
Apr 26 18:49:53 centos7-1 opensearch-dashboards[30338]: SHELL: '/sbin/nologin',
Apr 26 18:49:53 centos7-1 opensearch-dashboards[30338]: nice: '',
Apr 26 18:49:53 centos7-1 opensearch-dashboards[30338]: user: 'wazuh-dashboard',
Apr 26 18:49:53 centos7-1 opensearch-dashboards[30338]: USER: 'wazuh-dashboard',
Apr 26 18:49:53 centos7-1 opensearch-dashboards[30338]: PATH: '/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin',
Apr 26 18:49:53 centos7-1 opensearch-dashboards[30338]: PWD: '/usr/share/wazuh-dashboard',
Apr 26 18:49:53 centos7-1 opensearch-dashboards[30338]: LANG: 'en_US.UTF-8',
Apr 26 18:49:53 centos7-1 opensearch-dashboards[30338]: SHLVL: '1',
Apr 26 18:49:53 centos7-1 opensearch-dashboards[30338]: HOME: '/usr/share/wazuh-dashboard/',
Apr 26 18:49:53 centos7-1 opensearch-dashboards[30338]: group: 'wazuh-dashboard',
Apr 26 18:49:53 centos7-1 opensearch-dashboards[30338]: LOGNAME: 'wazuh-dashboard',
Apr 26 18:49:53 centos7-1 opensearch-dashboards[30338]: chdir: '/',
Apr 26 18:49:53 centos7-1 opensearch-dashboards[30338]: _: '/usr/share/wazuh-dashboard/bin/../node/bin/node' }
Apr 26 18:49:53 centos7-1 opensearch-dashboards[30338]: undefined