takenet / blip-sdk-csharp

The C# SDK for the blip.ai bot platform
https://blip.ai
Apache License 2.0
32 stars 42 forks source link

feat: add execute script V2 using ClearScript + V8 with HTTP request and context access #210

Closed lucasoares closed 5 months ago

lucasoares commented 5 months ago

Using ClearScript V8 to execute JavaScript as a new action.

It doesn't changes the behavior of the current ExecuteScriptAction.

The new implementation allow users to use recent ECMAScript definitions with way less restrictions.

Additionally, users can execute HTTP request inside the JavaScript with a custom request.fetchAsync API.

The new action also deals with user returns and parse it manually to store in the variable, awaiting any promises used by the JS script.

Code coverage: 87%: image

Benchmark:

Method Mean Error StdDev Gen0 Gen1 Allocated
ExecuteScriptV1SimpleScript 27.90 us 0.170 us 0.159 us 2.8687 0.7324 35.36 KB
ExecuteScriptV1JsonScript 104.94 us 0.208 us 0.162 us 8.0566 2.0752 100.15 KB
ExecuteScriptV1MathScript 288.78 us 0.838 us 0.700 us 20.5078 5.3711 251.85 KB
ExecuteScriptV2SimpleScript 1,435.43 us 11.332 us 10.600 us 3.9063 1.9531 53.99 KB
ExecuteScriptV2JsonScript 1,483.13 us 26.174 us 23.203 us 3.9063 1.9531 53.99 KB
ExecuteScriptV2MathScript 1,527.14 us 21.611 us 20.215 us 3.9063 1.9531 54.03 KB
ExecuteScriptV2LoopScript 1,545.27 us 19.645 us 18.376 us 3.9063 1.9531 54.54 KB
ExecuteScriptV1LoopScript 12,339.65 us 64.323 us 60.168 us 859.3750 203.1250 10697.74 KB

The scripts used for the tests are in here.

I had to remove the limt of max statements to execute the loop script in the V1 version, because it was getting blocked.

Conclusions I made from the benchmark:

In summary I think V2 is more stable than V1 but it always take some time to configure and execute. In the future we may need to investigate what in the ExecuteScriptV2Action is spending more time.

Removing the RegisterFunctions and the result ScriptObjectConverter just to make sure it is not affecting the results:

Method Mean Error StdDev Gen0 Gen1 Allocated
ExecuteScriptV1SimpleScript 28.80 us 0.077 us 0.065 us 2.8687 0.7324 35.36 KB
ExecuteScriptV1JsonScript 101.78 us 1.200 us 1.122 us 8.0566 2.0752 100.15 KB
ExecuteScriptV1MathScript 290.85 us 2.210 us 2.067 us 20.5078 5.3711 251.85 KB
ExecuteScriptV2SimpleScript 1,267.44 us 16.294 us 15.242 us 1.9531 - 26.7 KB
ExecuteScriptV2JsonScript 1,356.35 us 26.873 us 40.223 us 1.9531 - 26.72 KB
ExecuteScriptV2MathScript 1,388.88 us 19.683 us 20.213 us 1.9531 - 26.82 KB
ExecuteScriptV2LoopScript 1,390.48 us 20.609 us 19.277 us 1.9531 - 27.24 KB
ExecuteScriptV1LoopScript 11,653.97 us 125.647 us 117.530 us 859.3750 218.7500 10697.74 KB

For now I think the memory optimizations may have a good impact in our systems, even taking 1~1.5ms more to execute scripts.

Internal Exceptions

The new version have a new setting to capture all exceptions when executing the script.

When capturing exceptions, you may store the exception message in a variable to handle it in the flow.

Our internal exceptions are also captured but the message may be redacted for security reasons. A trace id will be provided in these cases and may be used to get more information about the exception logged by the system.

We also added a warning to the trace execution with the exception message, so you can see the error in the trace (in the Beholder extension when using Blip's Builder, for example).

If the variable name to store exception is not provided, the exception will be captured, the trace will have a warning message and the script will continue to execute normally without setting the variable.

/// <summary>
/// If the script should capture all exceptions instead of throwing them.
/// </summary>
public bool CaptureExceptions { get; set; }

/// <summary>
/// The variable to store the exception message if CaptureExceptions is true.
/// </summary>
public string ExceptionVariable { get; set; }

Features and Limitations

The following documents all the functions available to use inside the JS:

Time Manipulation

Although the new implementation supports more recent time manipulation functions like dateVariable.toLocaleString('pt-BR', { timeZone: 'America/Sao_Paulo' }), we have some limitations with the ExecuteScriptV2Action.

Limitations:

Other functions or even the native Date constructor to parse strings will not use bot's timezone if the date format doesn't includes the timezone in its format. It will use the local server engine timezone by default. We provide a parser helper to overcome that:

Example:

// Instead of this:
const date = new Date('2021-01-01T00:00:10'); // will use server's local timezone
// use this:
const date = time.parseDate('2021-01-01T00:00:10'); // will use bot's timezone or fixed America/Sao_Paulo if not available.

If you want to convert the date to string format with custom timezone, you may use date.toLocaleString from native JS and use the options to configure the timezone yourself, or our helper time.dateToString, both accepts the timeZone as an option.

If you want to parse a string representation of a date that doesn't includes a timezone, but want to specify a custom timezone, use our time.parseDate helper and pass the timeZone in the options.

Date and time helpers:

To configure a bot's timezone, the flow configuration must have the following key: builder:#localTimeZone.

Also, the action settings LocalTimeZoneEnabled must be true.

ParseDate

time.parseDate(string date, object? options)

Function to parse a date string to a DateTime object.

It receives two parameters, with the last one being optional:

time.parseDate returns a JS Date object.

Examples:

const date = time.parseDate('2021-01-01T19:01:01.0000001+08:00');

const date = time.parseDate('01/02/2021', {format: 'MM/dd/yyyy'});

const date = time.parseDate('2021-01-01 19:01:01', {format:'yyyy-MM-dd HH:mm:ss', timeZone: 'America/New_York');

const date = time.parseDate('01/01/2021', {format: 'dd/MM/yyyy', culture: 'pt-BR'});

After parsing the date, it will return a JS Date object, and you are freely to manipulate it as you want.

DateToString

time.dateToString(Date date, object? options)

Function to convert a Date object to a string using the bot's configured timezone or America/Sao_Paulo if not set.

It receives two parameters:

It returns a string.

Examples:

const dateString = time.dateToString(new Date());

const dateString = time.dateToString(time.parseDate('2021-01-01T19:01:01.0000001+08:00'));

const dateString = time.dateToString(new Date(), {format: "yyyy-MM-dd"});

const dateString = time.dateToString(new Date(), {timeZone: "America/New_York"});

Sleep

Additionally, we have a helper function to sleep the script execution for a given amount of time:

time.sleep(int milliseconds)

Use this with caution, as it will block the script execution for the given amount of time and make your bot's execution slower.

Remember that all the scripts have a maximum execution time and may throw an error if the script takes too long to execute.

It receives one parameter:

Examples:

time.sleep(50);

It is useful to force your script to timeout and test how your bot behaves when the script takes too long to execute.

Context

We added some functions to interact with the context variables inside the script.

Set Variable

context.setVariableAsync(string name, object value, TimeSpan? expiration)

Async function to set variable to the context. Should be used in async context.

It receives three parameters, with the last one being optional:

Examples:

async function run() {
    await context.setVariableAsync('myVariable', 'myValue');
    // Variable value: 'myValue'

    await context.setVariableAsync('myVariable', 'myValue', TimeSpan.fromMinutes(5));
    // Variable value: 'myValue'

    await context.setVariableAsync('myVariable', 100, TimeSpan.fromMilliseconds(100));
    // Variable value: '100'

    await context.setVariableAsync('myVariable', {'complex': true}, TimeSpan.fromMilliseconds(100));
    // Variable value: '{"complex":true}'
}

Get Variable

context.getVariableAsync(string name)

Async function to get variable from the context. Should be used in async context.

Returns an empty string if the variable does not exist.

It receives one parameter:

Examples:

async function run() {
    const emptyVariable = await context.getVariableAsync('myVariable');
    // emptyVariable: ''

    await context.setVariableAsync('myVariable', 'myValue');

    const myVariable = await context.getVariableAsync('myVariable');
    // myVariable: 'myValue'
}

Delete Variable

context.deleteVariableAsync(string name)

Async function to delete variable from the context. Should be used in async context.

It receives one parameter:

Examples:

async function run() {
    await context.setVariableAsync('myVariable', 'myValue');

    await context.deleteVariableAsync('myVariable');

    const myVariable = await context.getVariableAsync('myVariable');
    // myVariable: ''
}

HTTP Requests

We added a new fetch API to allow users to make HTTP requests inside the script.

Fetch API

request.fetchAsync(string url, object? options)

An Async function to make HTTP requests inside the script. Should be used in async context.

It receives two parameters, with the last one being optional:

It returns a object with the following properties:

The response also have the jsonAsync() method to parse the body as JSON.

If the request takes too long to execute, it will throw an error according to the script timeout, always remember to handle the errors outside the script in your bot.

Examples:

async function run() {
    const response = await request.fetchAsync('https://jsonplaceholder.typicode.com/todos/1');
    /* response example:
      {
        status: 200,
        headers: { 'key1': ['value1', 'value2'] },
        body: '{"userId": 1, "id": 1, "title": "delectus aut autem", "completed": false}',
        success: true
      }
    */

    const json = await response.jsonAsync();
    // json: { userId: 1, id: 1, title: 'delectus aut autem', completed: false }
}
async function run() {
    // Body will be serialized to JSON automatically
    const response = await request.fetchAsync('https://jsonplaceholder.typicode.com/posts', { method: 'POST', body: { title: 'foo', body: 'bar', userId: 1 } });
    /* response example:
      {
        status: 201,
        headers: { 'key1': ['value1', 'value2'] },
        body: '{"title": "foo", "body": "bar", "userId": 1, "id": 101}',
        success: true
      }
    */

    const json = await response.jsonAsync();
    // json: { title: 'foo', body: 'bar', userId: 1, id: 101 }
}
// More on object manipulation
async function run() {
    const response = await request.fetchAsync('https://jsonplaceholder.typicode.com/todos/1');

    const status = response.status;
    // status: 200

    const success = response.success;
    // success: true

    const headers = response.headers;
    // headers: { 'key1': ['value1', 'value2'] }

    const body = response.body;
    // body: '{"userId": 1, "id": 1, "title": "delectus aut autem", "completed": false}'

    const jsonBody = await response.jsonAsync();
    // jsonBody: { userId: 1, id: 1, title: 'delectus aut autem', completed: false }

    // Header manipulation
    for (const header of response.headers) {
        // do something with header key and value, where value is an array of strings
        // key will always be on lower case
        const key = header.key;
        const values = header.value;

        for (const value of values) {
            // do something with the value
        }
    }

    // Get header with case insensitive key and first value
    // Returns undefined if the header does not exist and an array of strings with the values otherwise
    const contentType = response.getHeader('content-type');
}
fadoaglauss commented 5 months ago

Can you create a test that uses the run as arrow function as below:

async run = () => {
    await context.setVariableAsync('myVariable', 'myValue');

    await context.deleteVariableAsync('myVariable');

    const myVariable = await context.getVariableAsync('myVariable');
    // myVariable: ''
}

image

fadoaglauss commented 5 months ago

Review your entire commit, considering the line size as 120 to make the line break.

fadoaglauss commented 5 months ago

I have a few points about the implementation of the extended functions that I would like discussed:

  1. Today we implemented some validations that prevent the user from using HTTP requests to access some internal servers, for example, localhost. In the current implementation of the HTTP function in ExecuteScritpV2, how can we avoid this?
  2. The context is a complex database that we need to guarantee governance when accessing it, could you create a way to monitor this access when executed through ExecuteScritpV2?
fadoaglauss commented 5 months ago

There is a limit recursion and max statements?

fadoaglauss commented 5 months ago

Please check whether the use of "literal regular expression" is a limitation. image

fadoaglauss commented 5 months ago

Can you check if this still a problem: image

ceifa commented 5 months ago

Javascript doesn't have the "Async" suffix as C# does as a common practice. Would be more "javascriptier" if we could use ".fetch", ".json", and so on.

Also would be great if we had extensions for the bucket, event-trackings and wrappers like context.getContact() for the current contact, context.input, and context.message. @lucasoares are you accepting PRs for your PR?

lucasoares commented 5 months ago

I have a few points about the implementation of the extended functions that I would like discussed:

  1. Today we implemented some validations that prevent the user from using HTTP requests to access some internal servers, for example, localhost. In the current implementation of the HTTP function in ExecuteScritpV2, how can we avoid this?
  2. The context is a complex database that we need to guarantee governance when accessing it, could you create a way to monitor this access when executed through ExecuteScritpV2?
  1. It uses the same IHttpClient of the ProcessHttp action, which will apply all rules the same way

  2. Nice question... All modifications to the variable is doing the same access the execute script already did in the SetScriptResultAsync using the IContext. I don't think there is too much difference right now, but I agree with you, we should have better observability when user is modifying the context variables. We can add trace information for context changes inside the execute script action in the future, this will improve how we monitor these changes. I don't think this PR must have a final version of the implementation, we will launch the execute script v2 as alpha for people to test, builder's team can improve it later as they wish. I think that ClearScript opens a lot of improvements we can do, adding more and more features making c# methods available inside the javascript to have better dev ex when using our systems.

lucasoares commented 5 months ago

Javascript doesn't have the "Async" suffix as C# does as a common practice. Would be more "javascriptier" if we could use ".fetch", ".json", and so on.

Also would be great if we had extensions for the bucket, event-trackings and wrappers like context.getContact() for the current contact, context.input, and context.message. @lucasoares are you accepting PRs for your PR?

I agree with you, adding more features to be accessed inside the Javascript will be great for the future, but I think we should create different PRs for the new features we may want to add, to start testing it with our customers. ClearScript opens a whole new possibilities for us to add features to the javascript. As @fadoaglauss said, we may want to create a pattern for being able to monitor all modifications/calls made by the users inside the javascript first. Feel free to open PRs and we may discuss these features and observability of them there!

About the async, it would be difficult to consider both patterns at the same time. If I want to remove the async suffix inside the javascript, I would need a C# method without the async suffix. I choose to keep c# patterns and then in the javascript will be clearer to the users they are using a async function just looking the name... I would like to keep this pattern to avoid creating c# smells.

lucasoares commented 5 months ago

There is a limit recursion and max statements?

No, there is not. But I personally think for now we don't need it. We should keep monitoring of course, and maybe lower the timeout or heap size limit.

The thing is, based on my benchmark, that Jint consumes too much memory when doing recursion or loops. For example executing this script:

                                       function run() {
                                           let a = 0;

                                           while (a < 1000) {
                                               a++;
                                           }

                                           return a;
                                       }

Will consume 10MB memory on Jint (and 11.6s to execute) but only 54KB on ClearScript + V8 and will take almost the same time to execute as the simplest script used in the benchmark test.

We may use CPU usage to limit our user's scripts in the future, but I don't really know how we would do that. You may want to check this issue: https://github.com/microsoft/ClearScript/issues/578

I just added a recursion benchmark to show how bad our 4-years old outdated Jint deals with it:

1000 calls Method Mean Error StdDev Gen0 Gen1 Allocated
ExecuteScriptV2RecursionLoopScript 1.605 ms 0.0264 ms 0.0247 ms 3.9063 1.9531 56.84 KB
ExecuteScriptV1RecursionLoopScript 32.415 ms 0.2150 ms 0.1795 ms 1812.5000 437.5000 22328.14 KB
10000 calls Method Mean Error StdDev Gen0 Allocated
ExecuteScriptV1RecursionLoopScript NA NA NA NA NA
ExecuteScriptV2RecursionLoopScript 2.396 ms 0.0474 ms 0.0695 ms 3.9063 57.39 KB

Jint just throws exception, even setting 100000000 as the recursion limit.

ceifa commented 5 months ago

About the async, it would be difficult to consider both patterns at the same time. If I want to remove the async suffix inside the javascript, I would need a C# method without the async suffix. I choose to keep c# patterns and then in the javascript will be clearer to the users they are using a async function just looking the name... I would like to keep this pattern to avoid creating c# smells.

Actually you don't need to change the method names, you can just change your attribute loader "LowerCaseMembersLoader". You can add a check if the method returns a Task, remove the "Async" suffix.

But it's okay if your decision is to keep it anyway...

lucasoares commented 5 months ago

Actually you don't need to change the method names, you can just change your attribute loader "LowerCaseMembersLoader". You can add a check if the method returns a Task, remove the "Async" suffix.

But it's okay if your decision is to keep it anyway...

True, I could do it in the LowerCaseMembersLoader loader. But then it would raise another issue when a class have both methods, with and without async suffix... I think we should let it as it is. Thanks for your input tho!