RocketChat / Rocket.Chat

The communications platform that puts data protection first.
https://rocket.chat/

Regular hard crashes since updating to 6.4.1: "FATAL ERROR: v8::ToLocalChecked Empty MaybeLocal" #30633

Open EnCz opened 1 year ago

EnCz commented 1 year ago

Description:

We updated RC from 6.2.0 to 6.4.1 last week and since then we have regular but seemingly random crashes. This is the error message we receive from the rocketchat docker container:

rocketchat            | FATAL ERROR: v8::ToLocalChecked Empty MaybeLocal.
rocketchat            |  1: 0xa3ad50 node::Abort() [node]
rocketchat            |  2: 0x970199 node::FatalError(char const*, char const*) [node]
rocketchat            |  3: 0xbba5fa v8::Utils::ReportApiFailure(char const*, char const*) [node]
rocketchat            |  4: 0x9d1325 node::Environment::CheckImmediate(uv_check_s*) [node]
rocketchat            |  5: 0x13c91a9  [node]
rocketchat            |  6: 0x13c1958 uv_run [node]
rocketchat            |  7: 0xa7b962 node::NodeMainInstance::Run() [node]
rocketchat            |  8: 0xa03a65 node::Start(int, char**) [node]
rocketchat            |  9: 0x7ff3cc77ed0a __libc_start_main [/lib/x86_64-linux-gnu/libc.so.6]
rocketchat            | 10: 0x98c5bc  [node]

Steps to reproduce:

Not manually reproducible

Actual behavior:

image

Server Setup Information:

Client Setup Information

Relevant logs:

rocketchat            | FATAL ERROR: v8::ToLocalChecked Empty MaybeLocal.
rocketchat            |  1: 0xa3ad50 node::Abort() [node]
rocketchat            |  2: 0x970199 node::FatalError(char const*, char const*) [node]
rocketchat            |  3: 0xbba5fa v8::Utils::ReportApiFailure(char const*, char const*) [node]
rocketchat            |  4: 0xa539f5  [node]
rocketchat            |  5: 0x18baa35 llhttp__internal_execute [node]
rocketchat            |  6: 0xa54dd2  [node]
rocketchat            |  7: 0xb1cce8 node::LibuvStreamWrap::OnUvRead(long, uv_buf_t const*) [node]
rocketchat            |  8: 0x13ccb47  [node]
rocketchat            |  9: 0x13cd370  [node]
rocketchat            | 10: 0x13d3564  [node]
rocketchat            | 11: 0x13c1948 uv_run [node]
rocketchat            | 12: 0x7fb4f00a5202 Run(napi_env__*, napi_callback_info__*) [/app/bundle/programs/server/npm/node_modules/@kaciras/deasync/build/Release/binding.node]
rocketchat            | 13: 0x9eabad  [node]
rocketchat            | 14: 0xc2679b  [node]
rocketchat            | 15: 0xc27d46  [node]
rocketchat            | 16: 0xc283c6 v8::internal::Builtin_HandleApiCall(int, unsigned long*, v8::internal::Isolate*) [node]
rocketchat            | 17: 0x1449459  [node]

czepan commented 1 year ago

To add some information:

The issue manifested as follows: we updated from 6.2.0 to 6.4.1, and shortly after the update the server crashed for the first time with this error. It came back up afterward and ran smoothly for 2 days. After those 2 days it started crashing again, initially every 5 minutes and then at ever shorter intervals, until it eventually crashed immediately after every restart of the docker container.

We have 4 apps installed: "poll", "giphy", "out-of-office" and "jitsi".

EnCz commented 1 year ago

After much sweat, blood, and tears, I believe I have identified the bug. It resides in an incoming integration where we make use of the "HTTP()" function.

The issue arises when the requested URL takes too long to respond: the delayed response results in a frontend crash.

We had initially used this integration for notifications from our checkMK system, which explains the seemingly sporadic occurrence of the issue.

You can, however, replicate this problem by creating an incoming webhook with the following proof-of-concept script, which simulates a delayed HTTP request:

class Script {
    process_incoming_request({ request }) {
        // Request an endpoint that deliberately waits 10 seconds before responding
        HTTP('GET', 'https://hub.dummyapis.com/delay?seconds=10', {
            params: {}
        });

        return null;
    }
}
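
For anyone who would rather not depend on an external service for the delay, a local endpoint can stand in for the URL above. This is only a minimal sketch, assuming plain Node.js is available on a host the Rocket.Chat container can reach. The names `parseDelaySeconds` and `createDelayServer`, the 60-second cap, and the port in the usage note are my own choices for illustration and not part of Rocket.Chat.

```javascript
const http = require('http');

// Hypothetical helper: read ?seconds=N from the request URL, falling back to
// a default and clamping to [0, 60] so a typo cannot stall the server for hours.
function parseDelaySeconds(reqUrl, fallback = 10) {
    const { searchParams } = new URL(reqUrl, 'http://localhost');
    const raw = searchParams.get('seconds');
    const value = raw === null ? NaN : Number(raw);
    if (!Number.isFinite(value)) {
        return fallback;
    }
    return Math.min(Math.max(value, 0), 60);
}

// Hypothetical helper: build (but do not start) a server that answers every
// request only after the requested delay has elapsed.
function createDelayServer() {
    return http.createServer((req, res) => {
        const delayMs = parseDelaySeconds(req.url) * 1000;
        const timer = setTimeout(() => {
            res.writeHead(200, { 'Content-Type': 'text/plain' });
            res.end('done\n');
        }, delayMs);
        // Cancel the pending response if the client disconnects first.
        res.on('close', () => clearTimeout(timer));
    });
}
```

To use it, call `createDelayServer().listen(3000)` and point the `HTTP('GET', ...)` call in the script at `http://<host>:3000/delay?seconds=10`, where `<host>` is whatever address your container can reach.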

I was able to reproduce the crash on fresh installations of version 6.3.0 and later!

During testing, I encountered various error messages on different versions. One example, as shown in my original message:

rocketchat            | FATAL ERROR: v8::ToLocalChecked Empty MaybeLocal.
rocketchat            |  1: 0xa3ad50 node::Abort() [node]
rocketchat            |  2: 0x970199 node::FatalError(char const*, char const*) [node]
...

I also encountered another message that hinted at delayed requests due to a "timed out" condition:

rocketchat              | === UnHandledPromiseRejection ===
rocketchat              | timed out
rocketchat              | ---------------------------------
rocketchat              | Errors like this can cause oplog processing errors.
rocketchat              | Setting EXIT_UNHANDLEDPROMISEREJECTION will cause the process to exit allowing your service to automatically restart the process
rocketchat              | Future node.js versions will automatically exit the process
rocketchat              | =================================
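
To illustrate the general mechanism behind a message like this (not Rocket.Chat's actual code, which I have not traced), here is a small plain Node.js sketch: a promise that rejects with "timed out" reaches the process-level unhandled-rejection handler only if nothing catches it. The helpers `withTimeout`, `slowCall`, and `safeCall` are hypothetical names for illustration, and `slowCall` is a stand-in timer rather than a real HTTP request.

```javascript
// Hypothetical helper: reject with 'timed out' if `promise` has not settled
// within `ms` milliseconds. The timer is cleared once the race is decided.
function withTimeout(promise, ms) {
    let timer;
    const timeout = new Promise((_, reject) => {
        timer = setTimeout(() => reject(new Error('timed out')), ms);
    });
    return Promise.race([promise, timeout]).finally(() => clearTimeout(timer));
}

// Stand-in for a slow upstream call (no network involved).
function slowCall(ms) {
    return new Promise((resolve) => setTimeout(() => resolve('ok'), ms));
}

// Catching the rejection here keeps it from surfacing as an unhandled
// promise rejection at the process level.
async function safeCall(ms, limit) {
    try {
        return await withTimeout(slowCall(ms), limit);
    } catch (err) {
        return `handled: ${err.message}`;
    }
}
```

Calling `withTimeout(slowCall(50), 10)` without a `catch` would instead leave the rejection unhandled, which is the kind of condition the log above reports.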

In some cases, there were no error messages at all. However, in all instances, the frontend crashed completely, while the Docker container continued to run and remained stuck in this state until manually restarted.

My personal speculation is that this issue may be related to the following change, as shown in the linked image:

image

This change occurred between versions 6.2.12 and 6.3.0, as indicated in this GitHub comparison.

It would be immensely helpful if some of you could confirm this issue with Rocket.Chat versions 6.3.0 or later. I was able to provoke this error on three different instances but would like to ensure its reproducibility.