rhaiscript / rhai

Rhai - An embedded scripting language for Rust.
https://crates.io/crates/rhai
Apache License 2.0

suggestions for passing registered rust functions as arguments to other registered rust functions #531

Closed jonathanstrong closed 2 years ago

jonathanstrong commented 2 years ago

hello, and thank you again for your hard work on rhai!

I am experimenting with using rhai as a kind of query language to perform data analysis. in this case, I primarily want rhai to be driving rust code, vs. implementing significant functionality in rhai itself.

I'm having quite a bit of trouble with higher order functions. I was able to get a map function over a collection type working using FnPtr, but the performance was quite bad (3.5sec applying rhai-defined function (x * x) vs. 30ms calling a registered rust function that performs the same work in rust, for an array of around 2 million items).

the technique used there was:

impl MyCollectionWrapper {
    pub fn map(&mut self, context: NativeCallContext, f: FnPtr) -> Self {
        let out: MyCollection = self.xs
            .map(|x| -> f64 {
                f.call_within_context(&context, (x,))
                    .unwrap_or(f64::NAN)
            });
        out.into()
    }
}

// register to `Engine`:

engine
        .register_fn(
            "map",
            |ctx: NativeCallContext, xs: &mut MyCollectionWrapper, f: FnPtr| {
                xs.map(ctx, f)
            },
        );

simplified rhai code:

fn f(x) {
    x * x
}
xs.map(Fn("f"))

what I am trying to do now is pass a registered rust function back to another from rhai:

impl MyCollectionWrapper {
    pub fn resample(&mut self, window: _, agg_fn: fn(&[f64]) -> f64) -> Result<_, _> {
        todo!()
    }
}

fn mean(xs: &[f64]) -> f64 {
    xs.iter().sum::<f64>() / xs.len() as f64
}

// ...

engine.register_fn("mean", mean)
    .register_fn("resample", MyCollectionWrapper::resample);

then in rhai

xs.resample("5min", mean)

...however, mean is not found to be in scope.

It does seem that I am cutting against the grain in terms of rhai's semantics on functions. For me, using higher order functions is critical for what I am trying to do with this, and the work primarily needs to happen in rust context (vs rhai) as it can be significantly faster that way, and also I have a large body of existing functionality in rust.

Do you have any suggestions for techniques that might be used in this situation? the next thing I would try would probably be just passing a &str of the function name and storing a mapping of &str -> fn, but that seems a bit uncouth.

schungx commented 2 years ago

Well, I guess the performance is to be expected if you're running a scripted function 2 million times... That would definitely call for building that functionality in Rust.

In Rhai, native Rust functions work exactly the same as scripted functions, and you'd use them in exactly the same manner. Rhai does not distinguish between Rust and Rhai functions.

impl MyCollectionWrapper {
    pub fn resample(&mut self, ctx: &NativeCallContext, func: &FnPtr, window: &str) -> Result<_, _> {
        todo!()
    }
}

// ...

engine.register_fn("mean", mean)
    .register_result_fn("resample", |ctx: NativeCallContext, col: &mut MyCollectionWrapper, window: &str, func: FnPtr| -> Result<...> {
        col.resample(&ctx, &func, window)
    });

Then in Rhai:

xs.resample("5min", Fn("mean"))

TL;DR

You still have to go through the same function resolution mechanism because Rhai supports function overloading. Essentially, you can map multiple Rust functions into the same name (but different parameter types). Therefore, it is impossible for Rhai to know, in advance, which particular Rust function to call.
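To make the overloading point concrete, here is a toy model in plain Rust (std only). It is not Rhai's actual implementation; `Registry`, `NativeFn`, `demo_registry` and `square_via_registry` are hypothetical names invented for this sketch. It only shows that when one name maps to several native functions, the choice can be made only after the runtime argument types are known.

```rust
use std::any::{Any, TypeId};
use std::collections::HashMap;

/// Type-erased native function: takes boxed arguments, returns a boxed value.
type NativeFn = fn(&[Box<dyn Any>]) -> Box<dyn Any>;

/// Toy registry keyed by (name, argument types). NOT Rhai's real data structure.
#[derive(Default)]
struct Registry {
    fns: HashMap<(String, Vec<TypeId>), NativeFn>,
}

impl Registry {
    fn register(&mut self, name: &str, params: Vec<TypeId>, f: NativeFn) {
        self.fns.insert((name.to_string(), params), f);
    }

    /// The overload is selected from the runtime types of the arguments.
    fn call(&self, name: &str, args: &[Box<dyn Any>]) -> Option<Box<dyn Any>> {
        // `(**a)` reaches the inner `dyn Any`, so we get the type of the
        // boxed value rather than the type of the box itself.
        let types: Vec<TypeId> = args.iter().map(|a| (**a).type_id()).collect();
        self.fns.get(&(name.to_string(), types)).map(|f| f(args))
    }
}

fn square_int(args: &[Box<dyn Any>]) -> Box<dyn Any> {
    let x = args[0].downcast_ref::<i64>().expect("registered for i64");
    Box::new(x * x)
}

fn square_float(args: &[Box<dyn Any>]) -> Box<dyn Any> {
    let x = args[0].downcast_ref::<f64>().expect("registered for f64");
    Box::new(x * x)
}

/// Two native functions registered under the same name "square".
fn demo_registry() -> Registry {
    let mut reg = Registry::default();
    reg.register("square", vec![TypeId::of::<i64>()], square_int);
    reg.register("square", vec![TypeId::of::<f64>()], square_float);
    reg
}

/// Calls "square" with one argument; returns None if no overload matches.
fn square_via_registry<T: Any + Copy>(reg: &Registry, x: T) -> Option<T> {
    let args: Vec<Box<dyn Any>> = vec![Box::new(x)];
    let out = reg.call("square", &args)?;
    out.downcast_ref::<T>().copied()
}

fn main() {
    let reg = demo_registry();
    println!("{:?}", square_via_registry(&reg, 3_i64));
    println!("{:?}", square_via_registry(&reg, 1.5_f64));
    println!("{:?}", square_via_registry(&reg, 2_u8)); // no overload for u8
}
```

In this model every call pays for building the key and doing a lookup, which is the kind of per-call cost that adds up over millions of elements.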

If you know exactly which function to map to in each case, then yes, the easiest way is to do it via a match over a &str parameter of the Rust function name to call. However, this is only necessary if 1) you're calling resample a LOT of times, 2) each function call (e.g. mean) runs for a very short time compared to the overhead of resolving the function by Rhai.
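A minimal sketch of the match-over-name approach, in plain Rust with no Rhai types involved. The names `AggFn`, `lookup_agg` and the chunk-based `resample` are assumptions made for illustration; the real `resample` in this thread works on time windows and its signature is not shown. The point is that the name is resolved once per call, and the inner loop then runs a plain function pointer.

```rust
/// All aggregation functions share one plain function-pointer type.
type AggFn = fn(&[f64]) -> f64;

fn sum(xs: &[f64]) -> f64 {
    xs.iter().sum()
}

fn mean(xs: &[f64]) -> f64 {
    if xs.is_empty() {
        f64::NAN
    } else {
        sum(xs) / xs.len() as f64
    }
}

/// Resolve a name passed in from the script to a native function.
fn lookup_agg(name: &str) -> Option<AggFn> {
    match name {
        "sum" => Some(sum as AggFn),
        "mean" => Some(mean as AggFn),
        _ => None,
    }
}

/// Hypothetical stand-in for `resample`: aggregates fixed-size chunks
/// rather than time windows.
fn resample(xs: &[f64], chunk: usize, agg_name: &str) -> Result<Vec<f64>, String> {
    if chunk == 0 {
        return Err("chunk size must be positive".to_string());
    }
    // The name lookup happens once here, not once per element.
    let agg = lookup_agg(agg_name)
        .ok_or_else(|| format!("unknown aggregation: {}", agg_name))?;
    Ok(xs.chunks(chunk).map(agg).collect())
}

fn main() {
    let xs = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0];
    println!("{:?}", resample(&xs, 3, "mean"));
    println!("{:?}", resample(&xs, 3, "median")); // unknown name gives an error
}
```

A function like this could then be registered with the engine so the script passes `"mean"` as an ordinary string.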

Rhai's function resolution is cached, so performance is generally quite good; it won't add significant overhead.

schungx commented 2 years ago

Not sure if you're still working on this issue.

I just thought of another suggestion: usually, it is idiomatic for functions in Rhai scripts, if they take a closure, to also have a version that takes a string parameter with the name of a function.

That means, in addition to resample that takes a FnPtr, you should also register a version that takes a string, and pass it to the previous function:

fn resample(ctx: NativeCallContext, col: &mut MyCollectionWrapper, window: &str, func: FnPtr) -> Result<...> {
    col.resample(&ctx, &func, window)
}
fn resample_with_fn_name(ctx: NativeCallContext, col: &mut MyCollectionWrapper, window: &str, func: &str) -> Result<...> {
    // FnPtr::new validates the name and returns a Result, hence the `?`
    resample(ctx, col, window, FnPtr::new(func)?)
}

engine.register_result_fn("resample", resample).register_result_fn("resample", resample_with_fn_name);

That way, your script can do all of these styles:

xs.resample("5min", "mean")

xs.resample("5min", Fn("mean"))

xs.resample("5min", |x, y| ... )

which is very JavaScript-like...

jonathanstrong commented 2 years ago

I did use a string function name as the solution and it is working pretty well.

Overall I am very psyched about what I have been able to get working in rhai scripting driving fast rust code on big data sets! I love how little friction there is between rust/rhai. It is a very effective combination.

I think, if I were to refocus my initial inquiry, I would put it this way: there seems to be significant function call overhead for using a function passed from rhai into rust. Is that overhead pretty fixed, or is there low-hanging fruit in terms of optimizing it? Because facilitating higher performance there would open a lot of doors for use cases like mine.

In using rhai heavily for the first time over the past week or two, a few other things in the "nice to have" category that I felt the absence of:

(Huge caveat: this list is not intended to be read in a, "nice library, but build this stuff for me, too!" tone! The work you have done in releasing rhai is seriously impressive already. Just figured it would be helpful to hear about a new user's experience for informational purposes).

Loving this awesome tool!

schungx commented 2 years ago

I think, if I were to refocus my initial inquiry, I would put it this way: there seems to be significant function call overhead for using a function passed from rhai into rust. Is that overhead pretty fixed, or is there low-hanging fruit in terms of optimizing it? Because facilitating higher performance there would open a lot of doors for use cases like mine.

Well, I plucked most of the low-hanging fruits already... the issue is with function overloading which mandates that the same function name can operate on parameters of different types (and thus call different Rust functions).

function keyword arguments (could also integrate rhai::Map as f(**obj) potentially)

You mean named arguments? There was an attempt way back (I believe there's still a branch somewhere). However, it makes functions much more brittle as parameter names can no longer be changed without affecting the entire script base. Also, it is difficult to get working with function overloading.

jupyter kernel

Not sure what this is... I'll need to Google it up!

list comprehensions, and relatedly if Vec and Array were more interchangeable

Yes, always a sore point, in my own projects also. That's why BLOB (which is Vec<u8>) was added: I needed to do byte stream manipulations.

Maybe something that can allow Vec<T> to be interchanged with Array (skipping incompatible elements, throwing errors, or using default values)... Let me think about this more. Really what we're looking for is JavaScript's TypedArray.
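The three conversion policies mentioned here (skip, error, default) can be sketched in plain Rust. `Value` below is a hypothetical stand-in for a dynamically typed array element, not `rhai::Dynamic`, and the function names are invented for this sketch. Treating integers as convertible to `f64` is an assumption of the sketch, not something the thread specifies.

```rust
/// Hypothetical stand-in for a dynamically typed script value.
#[derive(Debug, Clone, PartialEq)]
enum Value {
    Int(i64),
    Float(f64),
    Text(String),
}

impl Value {
    /// Numbers convert to f64 (integers are widened); anything else does not.
    fn as_f64(&self) -> Option<f64> {
        match self {
            Value::Float(x) => Some(*x),
            Value::Int(i) => Some(*i as f64),
            Value::Text(_) => None,
        }
    }
}

/// Policy 1: silently skip incompatible elements.
fn to_vec_skipping(arr: &[Value]) -> Vec<f64> {
    arr.iter().filter_map(Value::as_f64).collect()
}

/// Policy 2: fail on the first incompatible element.
fn to_vec_strict(arr: &[Value]) -> Result<Vec<f64>, String> {
    arr.iter()
        .enumerate()
        .map(|(i, v)| {
            v.as_f64()
                .ok_or_else(|| format!("element {} is not a number: {:?}", i, v))
        })
        .collect()
}

/// Policy 3: substitute a default for incompatible elements.
fn to_vec_or_default(arr: &[Value], default: f64) -> Vec<f64> {
    arr.iter().map(|v| v.as_f64().unwrap_or(default)).collect()
}

fn main() {
    let arr = vec![
        Value::Float(1.5),
        Value::Text("oops".to_string()),
        Value::Int(2),
    ];
    println!("{:?}", to_vec_skipping(&arr));
    println!("{:?}", to_vec_strict(&arr));
    println!("{:?}", to_vec_or_default(&arr, f64::NAN));
}
```

Skipping changes the length of the result, so it is probably the riskiest default when positions matter; the strict variant keeps indices aligned or fails loudly.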

jonathanstrong commented 2 years ago

Well, I plucked most of the low-hanging fruits already... the issue is with function overloading which mandates that the same function name can operate on parameters of different types (and thus call different Rust functions).

makes sense. perhaps there could be some way to opt in to a more static kind of function call. passing string function names works well, but cuts against the very low friction between rust/rhai in other contexts.

for the sake of additional info on what the scale of this is, I ran map on a list of 12 million f64 with the function f(x) = x*x using both the call_within_context/FnPtr method, vs. passing a string function name and using a native rust function. FnPtr took 14.14s and native rust fn took 0.24s.

You mean named arguments? There was an attempt way back (I believe there's still a branch somewhere). However, it makes functions much more brittle as parameter names can no longer be changed without affecting the entire script base. Also, it is difficult to get working with function overloading.

I can imagine this would not play nice with overloading. overloading itself provides some of the same benefits, in that you can define versions of the function that provide defaults for parameters that aren't supplied.

in one case I was having trouble mixing up the order of parameters (which have the same type), so I wrote an overload that takes a rhai::Map and plucks the parameters out one by one. It is very nice to use from the rhai side, but pretty tedious to code on the rust side, as you have to handle the case where there is no key, and then also the case where you can't cast the value into the expected type.
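One way to cut down that tedium is a small typed getter that handles both failure cases (missing key, wrong type) in one place. The sketch below uses plain Rust with a hypothetical `Param` enum and `ParamMap` alias standing in for the script-side map; it does not use `rhai::Map` or `rhai::Dynamic`, and `Bounds`, `get_float` and `bounds_from_map` are invented names.

```rust
use std::collections::BTreeMap;

/// Hypothetical stand-in for a dynamically typed map value.
#[derive(Debug, Clone, PartialEq)]
enum Param {
    Int(i64),
    Float(f64),
    Text(String),
}

/// Hypothetical stand-in for the map passed in from the script.
type ParamMap = BTreeMap<String, Param>;

#[derive(Debug, PartialEq)]
enum ParamError {
    Missing(String),
    WrongType { key: String, expected: &'static str },
}

/// Fetch a float by key, reporting a missing key and a wrong type separately.
fn get_float(map: &ParamMap, key: &str) -> Result<f64, ParamError> {
    match map.get(key) {
        Some(Param::Float(x)) => Ok(*x),
        Some(_) => Err(ParamError::WrongType {
            key: key.to_string(),
            expected: "float",
        }),
        None => Err(ParamError::Missing(key.to_string())),
    }
}

/// Two parameters of the same type that are easy to swap positionally.
#[derive(Debug, PartialEq)]
struct Bounds {
    lo: f64,
    hi: f64,
}

/// Build the parameters by name, so argument order no longer matters.
fn bounds_from_map(map: &ParamMap) -> Result<Bounds, ParamError> {
    Ok(Bounds {
        lo: get_float(map, "lo")?,
        hi: get_float(map, "hi")?,
    })
}

fn main() {
    let mut map = ParamMap::new();
    map.insert("lo".to_string(), Param::Float(0.0));
    map.insert("hi".to_string(), Param::Text("ten".to_string()));
    println!("{:?}", bounds_from_map(&map)); // wrong type for "hi"
    map.insert("hi".to_string(), Param::Int(10));
    println!("{:?}", bounds_from_map(&map)); // still wrong: Int is not Float here
    map.insert("hi".to_string(), Param::Float(10.0));
    println!("{:?}", bounds_from_map(&map));
}
```

With one getter per expected type, each map-taking overload reduces to a list of `?` calls.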

jupyter kernel

to add on what you learn from google, jupyter kernel would allow rhai notebooks, which would be awesome. repl++. of course, once it existed, you (er, I) would want a charting library to visualize things, ha.

some other random things that came up:

schungx commented 2 years ago

Simply provide a function called to_string for a type and it will be displayed using that function.

to_debug does the same for debug print.

schungx commented 2 years ago

Is this issue resolved?

jonathanstrong commented 2 years ago

yes - issue is resolved - thanks!