microsoft / ClearScript

A library for adding scripting to .NET applications. Supports V8 (Windows, Linux, macOS) and JScript/VBScript (Windows).
https://microsoft.github.io/ClearScript/
MIT License
1.79k stars 149 forks source link

Allow configuration to limit the number of max statements to be executed #578

Closed lucasoares closed 4 months ago

lucasoares commented 6 months ago

Is your feature request related to a problem? Please describe.

I'm migrating from Jint to ClearScript with V8 in my product. I allow my users to configure scripts to be executed in their applications inside my SaaS application and I must run it from my .NET application.

Currently with Jint I'm allowed to make it impossible for users to create infinite loops and malicious usage because I can limit the max number of statements executed by the engine.

Describe the solution you'd like

I would like to have a configuration like the max heap size but to limit how many statements the engine can execute without throwing an exception.

Additional context

Currently I use this on Jint:

var engine = new Engine(options => {

    // Limit memory allocations to 4 MB
    options.LimitMemory(4_000_000);

    // Set a timeout to 4 seconds.
    options.TimeoutInterval(TimeSpan.FromSeconds(4));

    // Set limit of 1000 executed statements. <----------
    options.MaxStatements(1000);

    // Use a cancellation token.
    options.CancellationToken(cancellationToken);
}
ClearScriptLib commented 6 months ago

Hi @lucasoares,

I would like to have a configuration like the max heap size but to limit how many statements the engine can execute without throwing an exception.

Unfortunately, V8 doesn't offer that degree of control over its execution engine. You can, however, limit memory and stack usage, and you can interrupt script execution if it's taking too long.

Sorry!

lucasoares commented 6 months ago

Hello @ClearScriptLib thanks for answering...

Can you think of any workaround for statements? Because user would be able to top my CPU for N seconds until it get timeouted.. If I have a max statements option, I can control it and prevent this kind of usage easier..

I'm really thinking of adding a newline for each line of my user script calling a host method incrementing a counter as workaround HAHAHA but this have so many ways to go wrong...

Or maybe anything related to the debugger to register calls, idk..

ClearScriptLib commented 6 months ago

Can you think of any workaround for statements?

Hmm. Can you clarify what exactly counts as a statement? In order for the limit to be useful, a statement must be something for which execution time has an upper bound or is in some way predictable, right?

lucasoares commented 6 months ago

I don't know how Jint operates it internally, but I this example below it would block the execution even with a single variable instantiation inside the loop.

What I need is basically the ability to limit my users from doing something like:

while(true){
  //Anything fast that will top my CPU
}

This would top one CPU until the timeout.. Another workaround for me would be able to limit the CPU usage by the engine, or anything that gives me control on what the engine do...

Probably everyone allowing users to provide custom JS scripts will need more tools to block people from doing bad scripting...

If you can think for a workaround to allow me at least know better the CPU usage so I can start throttling my user by delaying his script executions, lmk...

ClearScriptLib commented 6 months ago

With jint every statement, variable creation, function call, etc counts as a statement...

Because many JavaScript statements are compound – that is, executing one could involve executing any number of others – it seems likely that what Jint is counting aren't statements but some internal notion of an indivisible operation.

V8's execution engine is entirely different, especially with JIT compilation in effect, so it may not even have such a concept internally. In any case, with its intense focus on performance, V8 provides no way for the host to monitor execution at that level.

This would top one CPU for the duration of the timeout.

Wouldn't it have the same effect as long as the statement count is below the limit?

Probably everyone allowing users to provide custom JS scripts will need more tools to block people from doing bad scripting.

Indeed, running untrusted script code presents a number of challenges, including potential runaway resource usage.

As we mentioned above, you can limit memory and stack usage, and you can interrupt script execution if it's taking too long. A related tool at your disposal is ContinuationCallback. A potential improvement might be to make the continuation interval configurable, as it's currently fixed at two seconds.

Beyond that, we aren't aware of any other V8 hooks we could leverage for resource monitoring.

lucasoares commented 6 months ago

With jint every statement, variable creation, function call, etc counts as a statement...

Because many JavaScript statements are compound – that is, executing one could involve executing any number of others – it seems likely that what Jint is counting aren't statements but some internal notion of an indivisible operation.

V8's execution engine is entirely different, especially with JIT compilation in effect, so it may not even have such a concept internally. In any case, with its intense focus on performance, V8 provides no way for the host to monitor execution at that level.

This would top one CPU for the duration of the timeout.

Wouldn't it have the same effect as long as the statement count is below the limit?

Probably everyone allowing users to provide custom JS scripts will need more tools to block people from doing bad scripting.

Indeed, running untrusted script code presents a number of challenges, including potential runaway resource usage.

As we mentioned above, you can limit memory and stack usage, and you can interrupt script execution if it's taking too long. A related tool at your disposal is ContinuationCallback. A potential improvement might be to make the continuation interval configurable, as it's currently fixed at two seconds.

Beyond that, we aren't aware of any other V8 hooks we could leverage for resource monitoring.

I see.. It's hard to implement any logic in the ContinuationCallback since I can't have metrics about CPU usage or anything related to the performance of the script execution..

I will keep this in mind and let's see how it will perform in the hands of my users HAHAHA You think this is something doable for V8 to implement? Probably not, right?

Thanks!

ClearScriptLib commented 6 months ago

It's hard to implement any logic in the ContinuationCallback since I can't have metrics about CPU usage or anything related to the performance of the script execution..

Well, you might be able to use .NET's diagnostic facilities to track the CPU usage of your script execution thread – assuming that your user scripts are synchronous.

Have a look at the ProcessThread class. You could sample TotalProcessorTime at the start of script execution and periodically thereafter to get a sense of the load the script is putting on the CPU. Of course, the only way a synchronous script wouldn't saturate a core is if it called host methods that spent significant time blocked (waiting for I/O, synchronization, etc.) – or if the system was under CPU-thrashing load to begin with.

You think this is something doable for V8 to implement? Probably not, right?

Unfortunately, that's extremely unlikely. The V8 team does amazing work, but it's focused like a laser on Chromium and to a much lesser extent Node.js. The needs of a small project like ClearScript aren't on their radar, and if the request is for something that could impact performance, forget about it.

lucasoares commented 6 months ago

It's hard to implement any logic in the ContinuationCallback since I can't have metrics about CPU usage or anything related to the performance of the script execution..

Well, you might be able to use .NET's diagnostic facilities to track the CPU usage of your script execution thread – assuming that your user scripts are synchronous.

Have a look at the ProcessThread class. You could sample TotalProcessorTime at the start of script execution and periodically thereafter to get a sense of the load the script is putting on the CPU. Of course, the only way a synchronous script wouldn't saturate a core is if it called host methods that spent significant time blocked (waiting for I/O, synchronization, etc.) – or if the system was under CPU-thrashing load to begin with.

You think this is something doable for V8 to implement? Probably not, right?

Unfortunately, that's extremely unlikely. The V8 team does amazing work, but it's focused like a laser on Chromium and to a much lesser extent Node.js. The needs of a small project like ClearScript aren't on their radar, and if the request is for something that could impact performance, forget about it.

I will check it out! Thank you VERY much for such detailed answer.

lucasoares commented 5 months ago

@ClearScriptLib

Have a look at the ProcessThread class. You could sample TotalProcessorTime at the start of script execution and periodically thereafter to get a sense of the load the script is putting on the CPU.

When I first read your last answer, I thought there was a clearscript class to measure script total processor time, but you were saying to use dotnet's own process thread class, right?

I have thousand of executions in parallel from different scripts and users, if the engine doesn't give me any information about its own usage, I can't do what you suggested haha

Thanks

ClearScriptLib commented 5 months ago

Hi @lucasoares,

When I first read your last answer, I thought there was a clearscript class to measure script total processor time, but you were saying to use dotnet's own process thread class, right?

That's correct.

I have thousand of executions in parallel from different scripts and users,

All in one process? Do your scripts use async host facilities? Or are they just flat-out computations running concurrently on thousands of threads?

if the engine doesn't give me any information about its own usage, I can't do what you suggested haha

Hmm. It would be possible for each V8ScriptEngine instance to track its own kernel/user/total time – something that might be difficult for the host to do if the engine is often re-entered via async promise completion. Would that help?

Thanks!

lucasoares commented 5 months ago

All in one process? Do your scripts use async host facilities? Or are they just flat-out computations running concurrently on thousands of threads?

Yes.. Basically I have a thread pool with scripts being executed, where the number of executions can grow up to our thread pool size for each application instance we have. Each execution will create the engine, evaluate few things and then execute one function defined by our users by calling Invoke.

My application is a chatbot builder where users can trigger a javascript execution when receiving messages, enabling them to compute variables and use these variables in the chatbot's flow.

The scripts may be async (for http requests, for example). What we do is to await for any promises (Tasks, since we use the EnableTaskPromiseConversion flag) generated by the invoke call.

Hmm. It would be possible for each V8ScriptEngine instance to track its own kernel/user/total time – something that might be difficult for the host to do if the engine is often re-entered via async promise completion. Would that help?

If each engine instance provide any cpu-related metric, then I can throttle the number of executions of any specific user I have, or even block his scripts from being executed... But async would definetely bring some complexity...

ClearScriptLib commented 5 months ago

If each engine instance provide any cpu-related metric, then I can throttle the number of executions of any specific user I have, or even block his scripts from being executed...

Can you give us an idea of what a convenient API for this might look like? Would it be similar to the ProcessThread properties, but for an engine rather than a thread?

ClearScriptLib commented 4 months ago

Please feel free to reopen this issue if you have additional thoughts on this topic.