microsoft / ClearScript

A library for adding scripting to .NET applications. Supports V8 (Windows, Linux, macOS) and JScript/VBScript (Windows).
https://microsoft.github.io/ClearScript/
MIT License
1.75k stars 148 forks source link

Possible memory leak in 6.0.1 #177

Closed ArgonAlex closed 3 years ago

ArgonAlex commented 4 years ago

We upgraded and deployed ClearScript 6.0.1 from 6.0.0 previously, and saw some of our servers run out of memory pretty quickly, while others did not for some reason.

For now we've reverted back to 6.0.0. It's hard to investigate the memory dumps because they are so massive, so sorry I don't have any specifics at this point unfortunately. But I thought it would at least be a good idea to open an issue so you are aware.

I'm yet to reproduce the issue in a development environment, but I haven't had much time to dedicate to investigating.

ClearScriptLib commented 4 years ago

Hi @ArgonAlex,

Can you say anything about how your application uses ClearScript? For example, does it use long-running engine instances? Does it use many instances concurrently, and if so, how does it manage them?

Thanks!

ArgonAlex commented 4 years ago

We use the JsPool library to manage a pool of engines via JavaScriptEngineSwitcher. Each request takes an engine from the pool, does a React server side render, and then returns the engine to the pool.

When I say we upgraded ClearScript, we actually upgraded JavaScriptEngineSwitcher.V8 and JavaScriptEngineSwitcher.V8.Native.win-x64 to 3.5.1

Taritsyn commented 4 years ago

As author of the JavaScript Engine Switcher library, I can only say that the changes in code of the JavaScriptEngineSwitcher.V8 module were minimal. This changes mainly affected to working with the undefined value.

ArgonAlex commented 4 years ago

Thanks for chiming in.

Looking at some server stats, it looks like the memory leak was under the .NET CLR and it was desperately trying to garbage collect. From what I understand about ClearScript, a memory leak in V8 would not be considered as .NET CLR memory usage right? Maybe that helps narrow down the issue.

Taritsyn commented 4 years ago

Your application works under .NET Framework or .NET Core?

ArgonAlex commented 4 years ago

.NET Framework 4.8

Taritsyn commented 4 years ago

I compared the performance of two versions of the JavaScriptEngineSwitcher.V8 module: 3.3.1 and 3.5.1. I took the JsExecutionHeavyBenchmark benchmark as basis.

v3.3.1

| Method | withPrecompilation |     Mean |   Error |  StdDev | Gen 0 | Gen 1 | Gen 2 | Allocated |
|------- |------------------- |---------:|--------:|--------:|------:|------:|------:|----------:|
|     V8 |              False | 115.3 ms | 1.03 ms | 0.91 ms |     - |     - |     - |  94.01 KB |
|     V8 |               True | 109.2 ms | 2.18 ms | 2.04 ms |     - |     - |     - | 103.61 KB |

v3.5.1

| Method | withPrecompilation |     Mean |   Error |  StdDev | Gen 0 | Gen 1 | Gen 2 | Allocated |
|------- |------------------- |---------:|--------:|--------:|------:|------:|------:|----------:|
|     V8 |              False | 115.2 ms | 1.17 ms | 1.04 ms |     - |     - |     - |  95.61 KB |
|     V8 |               True | 107.3 ms | 0.67 ms | 0.63 ms |     - |     - |     - | 105.21 KB |

As you can see, there are no serious changes in managed memory consumption. It is possible that there are leaks of native memory.

ClearScriptLib commented 4 years ago

Hi @ArgonAlex,

Although .NET and V8 have separate heaps and garbage collectors, ClearScript allows script objects to hold strong references to managed resources (and vice versa), effectively linking the runtimes.

The bad news is that the garbage collectors have no information about any external items they may be holding, so they can't prioritize the release of those that may be gumming up the works.

For example, if a tiny script object is holding a giant managed object, V8 will collect it just as lazily as if it were holding nothing at all. And V8 is extremely lazy, always favoring performance over memory consumption.

This is partly why systems like JsPool periodically force garbage collection and/or retire script engines. Does your application customize JsPool, or does it use the default configuration?

Also, can you give us a rough idea of the size of the managed API you're exposing to script code? And approximately how many managed objects might be created during a typical React render? An order of magnitude will do, as any information you provide could help us reproduce the issue.

Cheers!

Taritsyn commented 4 years ago

Current version of the ReactJS.NET library does not use a EmbedHostObject and EmbedHostType methods of IJsEngine interface (in JavaScriptEngineSwitcher.V8 module the AddHostObject and AddHostType methods are called through them). Objects are passed to the script by using JSON serialization.

Taritsyn commented 4 years ago

I added a BenchmarkDotNet.Diagnostics.Windows package to yesterday's benchmark to measure native memory consumption and got the following result:

v3.3.1

| Method | withPrecompilation |     Mean |    Error |  StdDev | Gen 0 | Gen 1 | Gen 2 | Allocated | Allocated native memory | Native memory leak |
|------- |------------------- |---------:|---------:|--------:|------:|------:|------:|----------:|------------------------:|-------------------:|
|     V8 |              False | 108.0 ms | 14.74 ms | 0.81 ms |     - |     - |     - |  92.41 KB |             69287.72 KB |                  - |
|     V8 |               True | 102.7 ms | 44.29 ms | 2.43 ms |     - |     - |     - | 101.74 KB |             68183.15 KB |                  - |

v3.5.1

| Method | withPrecompilation |     Mean |    Error |  StdDev | Gen 0 | Gen 1 | Gen 2 | Allocated | Allocated native memory | Native memory leak |
|------- |------------------- |---------:|---------:|--------:|------:|------:|------:|----------:|------------------------:|-------------------:|
|     V8 |              False | 107.7 ms | 35.38 ms | 1.94 ms |     - |     - |     - |  95.61 KB |             71945.66 KB |                  - |
|     V8 |               True | 101.7 ms | 21.36 ms | 1.17 ms |     - |     - |     - | 105.21 KB |             70835.52 KB |                  - |

There are no leaks, but the consumption of native memory has increased.

ClearScriptLib commented 4 years ago

Thanks for your input, Andrey.

Objects are passed to the script by using JSON serialization.

That's interesting, as it would seem to rule out most of ClearScript's interop machinery.

There are no leaks, but the consumption of native memory has increased.

V8 undergoes huge changes with each major release, and there are two such releases between ClearScript 6.0 and 6.0.1.

Taritsyn commented 4 years ago

That's interesting, as it would seem to rule out most of ClearScript's interop machinery.

It seems that I was mistaken there at all objects are not transferred to the script, but only returned.

ArgonAlex commented 4 years ago

While we aren't using ReactJS.NET, our code is very similar. In our case, we interact with JavaScriptEngineSwitcher roughly like so:

string propsJson = JsonConvert.SerializeObject(props);
using (var timer = new Timer(state => engine.Interrupt(), null, timeout, Timeout.Infinite))
{
    try
    {
        markup = engine.CallFunction<T>("render", componentName, propsJson);
    }
    catch (JsInterruptedException ex)
    {
        throw new TimeoutException("Script execution timed out", ex);
    }
}

and we also occasionally call GetRuntimeHeapInfo() and CollectGarbage(true) on the script engine directly. We do customize JsPool's configuration to have more control.

In regards to native memory increasing, I would have actually expected it to decrease because of pointer compression that was implemented in V8 8.0. According to the V8 team, it decreases heap size by an average of 40%. https://v8.dev/blog/v8-release-80

ClearScriptLib commented 4 years ago

Hi @ArgonAlex,

Hmm, so it's just strings in and out? You're not exposing any managed API for the script to consume?

We do customize JsPool's configuration to have more control.

What customizations are you making? For server applications, we'd definitely recommend periodic garbage collection and engine disposal.

In regards to native memory increasing, I would have actually expected it to decrease because of pointer compression

ClearScript's bundled V8 build has pointer compression disabled.

We did some extensive testing and found that pointer compression caused no significant difference in memory usage with .NET Framework, and significantly worse memory usage with .NET Core. How the choice of .NET runtime could possibly affect V8's unmanaged memory usage is a mystery, but turning off pointer compression eliminated the issue.

Of course, the lack of any benefit from pointer compression may have been due to the specific tests we were using (mostly SunSpider), but the .NET Core issue was a deal breaker, since we ship one V8 build for both runtimes in the NuGet package.

Thanks!

ArgonAlex commented 4 years ago

Hmm, so it's just strings in and out? You're not exposing any managed API for the script to consume?

Correct.

What customizations are you making? For server applications, we'd definitely recommend periodic garbage collection and engine disposal.

Currently, we are running garbage collection after every 500th use of an engine, and disposing of an engine after it has been used 10000 times. I slowly increased these numbers over time to tune it to this point. It's been working well with <=6.0.0 anyway.

According the the heap stats we collect every 100 usages (I could probably make this more frequent), the average total heap size across the pool was normal (~60MB per engine, pretty similar between 6.0.0 and 6.0.1), until JsPool ran out of available engines and the server ran out of memory and stopped responding to requests. I'm not sure about the cause and effect here. I would like to reproduce in a development environment and find out more.

ClearScript's bundled V8 build has pointer compression disabled.

Ah ok I see.

ClearScriptLib commented 4 years ago

Hi @ArgonAlex,

Hmm, so it's just strings in and out? You're not exposing any managed API for the script to consume?

Correct.

Wow, that almost certainly rules out the majority of ClearScript's interop and marshaling facilities.

If you're reusing a pre-parsed render function for each request, then you're definitely doing the right thing. Avoiding redundant invocation of the JavaScript parser is the key to V8 performance.

Still, can you give us a rough idea of the size and complexity of that function?

Currently, we are running garbage collection after every 500th use of an engine, and disposing of an engine after it has been used 10000 times.

Interesting; those settings are very relaxed compared to JsPool's defaults. Could you try dialing back to more aggressive settings?

Also, are you changing the maximum engine count? We recommend keeping it below a small multiple of your server's CPU core count.

JsPool ran out of available engines and the server ran out of memory and stopped responding to requests.

That would seem to imply that all the engines got stuck and couldn't be released to the pool. If that's correct, then it's certainly worth investigating, but it wouldn't explain the memory exhaustion. Or is the server running out of memory due to the growing number of unprocessed requests?

Thanks!

ClearScriptLib commented 3 years ago

If you get a chance, please try ClearScript 7.0 and reopen this issue if the memory leak remains.