schungx / rhai

Rhai - An embedded scripting language for Rust [dev repo, may regularly force-push, pull from https://github.com/rhaiscript/rhai for a stable build]
https://github.com/rhaiscript/rhai
Apache License 2.0

New Plugins API #3

Closed schungx closed 4 years ago

schungx commented 4 years ago

@jhwgh1968 there has been an open issue regarding the fact that creating an Engine is expensive: https://github.com/jonathandturner/rhai/issues/142

It looks like registering all those core functions (such as basic arithmetic) each time during an Engine creation is causing it to be really slow.

The solution, obviously, is to run that registration only once, since these functions never change and all scripts will need them. One solution is to use lazy_static to make this package of functions a global constant, but the downside is that an additional crate (lazy_static) must be pulled in.
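
A minimal sketch of that lazy_static approach, with a made-up CorePackage type standing in for the bundle of core functions:

use lazy_static::lazy_static;

// Placeholder standing in for a bundle of pre-registered core functions.
pub struct CorePackage { /* registered functions would live here */ }

impl CorePackage {
    pub fn create() -> Self {
        // Register all the core functions once, here.
        CorePackage {}
    }
}

lazy_static! {
    // Built on first access, then shared by every Engine created afterwards.
    static ref CORE_PACKAGE: CorePackage = CorePackage::create();
}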

An alternate solution is to really start breaking the built-ins down into separate, immutable packages, so the packages themselves can be created only once, kept around, and then passed to every new Engine created.

Something of that sort:

let core_pkg = Library::CoreLib::create();          // create the core package once

let mut engine = Engine::new_raw(&[ core_pkg ]);    // pass in a list of packages to register
let mut engine = Engine::new_raw(&[ core_pkg ]);
let mut engine = Engine::new_raw(&[ core_pkg ]);
// ... create as many engines as needed, all sharing the same package
jhwgh1968 commented 4 years ago

The version of plugins I am currently working on would solve this problem. It will allow an engine to defer the lookups until the function is called.

It also will give more convenient Rust syntax for writing a package:

#[rhai::plugin]
mod MyPlugin {
    pub fn export_to_rhai(...) { /* do stuff */ }
    // more pub fns
}

Once defined, all that is needed will be something like

let mut engine = Engine::new();
rhai::register_plugin!(engine, MyModule);

If you are wondering how it works, it uses procedural macros to generate a calling function with a uniform interface.

After the plugin macro has done its work, the module will look like this:

mod MyModule {
    pub fn export_to_rhai(i: INT, f: FLOAT, t: MyStruct) { /* ... */ }

    pub fn call<'e>(
        engine: &'e Engine,
        fn_name: &str,
        mut args: impl Iterator<Item = Dynamic> + 'e,
    ) -> Result<(), rhai::EvalAltResult> {
        match fn_name {
            "export_to_rhai" => Ok(export_to_rhai(
                args.next()?.as_ref().cast::<INT>()?,
                args.next()?.as_ref().cast::<FLOAT>()?,
                args.next()?.as_ref().cast::<MyStruct>()?,
            )),
            _ => Err(rhai::EvalAltResult::RuntimeError(
                format!("cannot find function '{}' in 'MyModule'", fn_name),
            )),
        }
    }
}

This means that registering a module results in registering one "lookup function", and that is all. Otherwise, it is static code, generated at compile time.

This should be much more efficient, and more convenient for users.

schungx commented 4 years ago

Wow, this is something beyond my expectations. I think we can replace the entire packages implementation with this.

One question: when you register a plugin module, I suppose you register all the functions declared within the same module, right? I don't suppose you can "pick-n-choose"?

You'll also need a way to handle functions that have a first argument that is &mut for update functions. All other arguments must be passed by value.
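
For reference, this is the shape of function meant, as it would be registered through the existing register_fn API today (Counter and bump are made-up names for illustration):

use rhai::{Engine, RegisterFn, INT};

#[derive(Clone)]
struct Counter { value: INT }

fn main() {
    let mut engine = Engine::new();
    engine.register_type::<Counter>();
    // Update function: the first argument is &mut, all others are by value.
    engine.register_fn("bump", |c: &mut Counter, amount: INT| c.value += amount);
}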

Also, I'm thinking how this can handle generic functions with multiple data types. I suppose you must have some form of name mangling then... For example, a simple add function can be registered for the whole range of integer data types. If you're simply reflecting the Rust std API, then this is a non-issue because they have different names anyway... However, Rhai supports function overloading.

jhwgh1968 commented 4 years ago

Wow, this is something beyond my expectations. I think we can replace the entire packages implementation with this.

I think we can replace most of it, but there might be some edge cases. See below.

One question: when you register a plugin module, I suppose you register all the functions declared within the same module, right? I don't suppose you can "pick-n-choose"?

In the current prototype, all pub items are exported to Rhai.

It would certainly be possible to add additional attributes to the functions themselves, like #[rhai::export_fn(name = "name_in_rhai")] or #[rhai::export_fn(ignore)]. But I'm focusing on the main cases for now.

You'll also need a way to handle functions that have a first argument that is &mut for update functions. All other arguments must be passed by value.

I would probably have a call_mut_receiver as a separately generated table. After all, what's a little awkwardness if the compiler is doing all the work, right?

Also, I'm thinking how this can handle generic functions with multiple data types. I suppose you must have some form of name mangling then...

Generics are something I am not sure if I can support or not.

The syntax wouldn't be difficult to detect and add, but Rust's type inference may be the limitation due to my reliance on the cast operation for Dynamic.

Let's take your example with plus on numeric types. The broadest way to implement that would be to implement a trait RhaiAddable for all ints, floats, etc, and then:

#[rhai::plugin]
mod numeric_operations {
    #[rhai::reg_op("+")]
    pub fn plus<T: RhaiAddable>(a: T, b: T) -> Result<T, rhai::EvalAltResult> {
        /* achieve this with a trait call or special casing... */
    }
}

As currently written, the auto-generated code would create a call to cast::<T> -- that is, the generic type passed in, rather than any specific type.

Even if T is bounded eventually -- i.e. plus allows clear type inference -- I suspect Rust's type inference would sometimes get confused. There might still have to be some alternative mechanism for generic functions in this case.

Also, please note that my use of reg_op in that example was just to explain it. I have no idea how to approach that, so it might be another special case. :smile:

jhwgh1968 commented 4 years ago

I have opened a PR for the foundational API I need for this work. Hopefully it will be quick to review and merge.

schungx commented 4 years ago

If you can modify your PR to merge into the plugins branch I'll do it right away.

As to generic functions, I think that is the main difference between Rhai and Rust... in this case, Rhai is more like JS. Not only can arguments be of different types, there can also be a different number of arguments.

In many Rust std lib cases, multiple functions with different names can be mapped into the same function name in Rhai. I think we need to make the Rust API more "Rhai-centric" by leveraging overloading instead of a simple one-to-one mapping.
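
For reference, this kind of overloading is what register_fn already allows today; a minimal sketch (the add name and the closures are just for illustration):

use rhai::{Engine, RegisterFn, FLOAT, INT};

fn main() {
    let mut engine = Engine::new();
    // The same Rhai name can be registered several times with different
    // parameter types and arities; Rhai picks the version matching the
    // actual arguments at the call site.
    engine.register_fn("add", |a: INT, b: INT| a + b);
    engine.register_fn("add", |a: FLOAT, b: FLOAT| a + b);
    engine.register_fn("add", |a: INT, b: INT, c: INT| a + b + c);
}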

As for binding multiple generic versions, maybe the plugin can provide a generic version that an outside macro can simply loop over. For example:

mod MyModule {
    #[rhai(name = "export_to_rhai")]
    #[rhai::reg(i8, f32)]
    #[rhai::reg(i32, f64)]
    pub fn export_to_rhai_3<T: Add, U, V>(i: T, f: U, t: MyStruct) -> String { /* ... */ }

    #[rhai(name = "export_to_rhai")]
    #[rhai::reg(char, INT)]
    pub fn export_to_rhai_2<T, U, V>(i: T, f: U) -> bool { /* ... */ }

    pub fn call<'e>(
        engine: &'e Engine,
        fn_name: &str,
        args_hash: u64,
        mut args: impl Iterator<Item = Dynamic> + 'e,
    ) -> Result<(), rhai::EvalAltResult> {
        match (fn_name, args.len()) {
            ("export_to_rhai", 3) => {
                /* Somehow use a hash to map argument types */
                match args_hash {
                    1234567 => Ok(export_to_rhai_3(args.next()?.as_ref().cast::<i8>()?,
                        args.next()?.as_ref().cast::<f32>()?,
                        args.next()?.as_ref().cast::<MyStruct>()?)),
                    98765 => Ok(export_to_rhai_3(args.next()?.as_ref().cast::<i32>()?,
                        args.next()?.as_ref().cast::<f64>()?,
                        args.next()?.as_ref().cast::<MyStruct>()?)),
                    _ => Err(/* ... argument types not matched ... */),
                }
            }

            ("export_to_rhai", 2) => { /* ... */ },

            _ => Err(rhai::EvalAltResult::RuntimeError(
                format!("cannot find function '{}' in 'MyModule'", fn_name),
            )),
        }
    }
}

jhwgh1968 commented 4 years ago

If you can modify your PR to merge into the plugins branch I'll do it right away.

I opened a new PR against the plugins branch.

Please make sure to keep that feature branch updated with all your master changes. I'm relying on you to avoid big merge conflicts later.

In many Rust std lib cases, multiple functions with different names can be mapped into the same function name in Rhai. I think we need to make the Rust API more "Rhai-centric" by leveraging overloading instead of a simple one-to-one mapping.

I generally plan to handle overloading and multiple types with the flexibility in that call function, without using Rust generics. Sorry if I got confused by focusing on those.

To go back to a previous example of mine from when I was thinking about variadic functions:

let forty_two = sum(40, 2);
let forty_two_again = sum(35, 5, 1, 1);
let forty_two_point_zero = sum(34.0, 6, 0.5, 0.5);

This is the kind of overloading you mean, right?

My plan for that was to make a function attribute that indicates a "variadic" function. When it is called, the argument iterator is passed in directly, and the function itself handles it:

mod MathModule {
    #[rhai::export_fn(name = "sum", variadic)]
    pub fn add_anything(args: impl Iterator<Item = Dynamic>) -> Result<Dynamic, EvalAltResult> {
        let mut total: FLOAT = 0.0;
        let mut return_float = false;
        for (n, i) in args.enumerate() {
            if let Some(i32_box) = i.down_cast::<i32>() {
                total += *i32_box as FLOAT;
            } else if let Some(f32_box) = i.down_cast::<f32>() {
                total += *f32_box as FLOAT;
                return_float = true;
            } else if let Some(i64_box) = i.down_cast::<i64>() {
                /* etc ... */
            } else {
                return Err(EvalAltResult::RuntimeError(
                    format!("argument {} is not a float or integer", n),
                ));
            }
        }
        Ok(if return_float { Dynamic::from(total) } else { Dynamic::from(total as INT) })
    }
}

I also hope this problem will be somewhat mitigated by broader use of this plugin style. It will hopefully encourage Rhai to develop more impl From<T> for Dynamic blocks for different Ts, allowing automatic casting to do more work.
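
For illustration, such a conversion impl might look like this (Celsius is a made-up example type; Dynamic::from does the actual boxing):

use rhai::Dynamic;

#[derive(Clone)]
struct Celsius(f64);

// The kind of impl meant above: a user-defined type converting itself into
// Dynamic, so the engine can accept it without an explicit wrapper closure.
impl From<Celsius> for Dynamic {
    fn from(c: Celsius) -> Self {
        Dynamic::from(c)
    }
}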

schungx commented 4 years ago

Please make sure to keep that feature branch updated with all your master changes. I'm relying on you to avoid big merge conflicts later.

OK, I'll make sure I keep it up-to-date.

Your idea of keeping all the argument resolution inside the function implementation itself is a good solution to the problem of overloading. However, this puts the burden on the plugin author, who must then use a massive number of .is::<T>() or downcast::<T> calls to detect the correct argument types before dispatching to the correct implementation. This kind of dynamic dispatching is going to make calling each function slower (although I don't have benchmarks to prove that).

One way we can avoid this is to pass in &[Dynamic] instead of an iterator. This way, at least the author knows the number of arguments and their types (he can run Dynamic.type_id() on each one to get the TypeId). He can then hash these TypeId's to quickly find the correct version of the function to dispatch to, and coerce the correct parameter types.
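
A minimal sketch of that dispatch idea (this is not Rhai's actual hashing scheme, just an illustration using the standard library hasher):

use std::collections::hash_map::DefaultHasher;
use std::hash::{Hash, Hasher};

use rhai::Dynamic;

// Hash the argument types of a call so an overload table can be keyed on it.
fn args_signature(args: &[Dynamic]) -> u64 {
    let mut hasher = DefaultHasher::new();
    for arg in args {
        arg.type_id().hash(&mut hasher);
    }
    hasher.finish()
}

A plugin's call function could then match on (fn_name, args.len(), args_signature(args)) to pick the right overload.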

schungx commented 4 years ago

In a perfect world, we should provide macro facilities to do this kind of dispatching so the author doesn't have to worry about it!

jhwgh1968 commented 4 years ago

Your idea of keeping all the argument resolution inside the function implementation itself is a good solution to the problem of overloading. However, this puts the burden on the plugin author, who must then use a mass number of .is::<T>() or downcast::<T> calls to detect the correct argument types before dispatching to the correct implementation.

I personally don't think this is a significant burden. As shown in my example, I originally came up with this design for algorithms which "fold types together" naturally in the course of their processing. It would be difficult to make sum any shorter using generics if you allowed such type mixing.

That said, I see your concern for functions which are "do the same thing, just accept different types." It may be possible to expand the procedural macro to support this, but my first idea is to expand the types on the Rust side.

For example, take add. It is a binary operation (not variadic, like my sum), and works on ints or floats in any combination.

I would write this definition:

mod CoreMath {
    use either::Either;   // e.g. the Either type from the either crate

    pub fn add(x: Either<INT, FLOAT>, y: Either<INT, FLOAT>) -> Dynamic {
        match (x, y) {
            (Either::Left(i), Either::Left(j)) => Dynamic::from(i + j),
            (Either::Right(i), Either::Right(j)) => Dynamic::from(i + j),
            (Either::Left(i), Either::Right(j)) => Dynamic::from(i as FLOAT + j),
            (Either::Right(i), Either::Left(j)) => Dynamic::from(i + j as FLOAT),
        }
    }
}

Rhai would just need to ensure Dynamic could downcast from an INT or a FLOAT to that Either. (This might be the case already. It is difficult for me to follow the low-level type system APIs you are using.)

This would work with the procedural macro I already have in mind.

This kind of dynamic dispatching is going to make calling each function slower (although I don't have benchmarks to prove that).

My understanding is limited, but the impression I get is that the match shown above and the casting done in sum will be of the same cost after codegen.

Both of them will end up looking like a switch statement in C: choose where to jump based on whether this long integer's value (the type ID or pair of type IDs) is A, B, C, D, or anything else.

Accessing the type ID during the cast shouldn't be slow, either. The function arguments will surely be in the CPU cache -- after the first conditional check, at the latest.

One way we can avoid this is to pass in &[Dynamic] instead of an iterator. This way, at least the author knows the number of arguments and their types (he can run Dynamic.type_id() on each one to get the TypeId). He can then hash these TypeId's to quickly find the correct version of the function to dispatch to, and coerce the correct parameter types.

I find this direct use of TypeId unintuitive. I would rather write code which has Rust doing it for me in things like if and match statements.

The reason I picked a type of impl Iterator was because it wasn't clear to me whether you wanted to gather up all the arguments into one place before calling anything. Depending on the size of your Rhai stack, that might be a little slow compared to just writing an iterator to skip to the ones needed.

If you want callers able to know how many arguments they have -- for example, to support "wrong number of arguments" checks early on -- then I would happily change it to an iterator wrapper which implements ExactSizeIterator.
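
A minimal sketch of such a wrapper, assuming the arguments are already sitting in a slice somewhere:

use rhai::Dynamic;

// An argument iterator that also reports how many arguments remain.
struct ArgsIter<'a> {
    inner: std::slice::Iter<'a, Dynamic>,
}

impl<'a> Iterator for ArgsIter<'a> {
    type Item = &'a Dynamic;
    fn next(&mut self) -> Option<Self::Item> {
        self.inner.next()
    }
    fn size_hint(&self) -> (usize, Option<usize>) {
        self.inner.size_hint()   // exact, so len() below is correct
    }
}

// Callers can now check args.len() before dispatching.
impl<'a> ExactSizeIterator for ArgsIter<'a> {}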

In a perfect world, we should provide macro facilities to do this kind of dispatching so the author doesn't have to worry about it!

My goal is to have procedural macros do as much work as I can. However, I am still figuring things out. A number of your ideas I agree with in general, but I hesitate to make many promises right now.

As things get beyond my first, simplest cases -- currently, all functions accept types passed by value, and return Dynamic -- I will more deeply consider some of your other ideas, and understand what is necessary and possible.

jhwgh1968 commented 4 years ago

Also, a separate question: namespaces.

Currently my Plugin definition requires providing a static name. My original idea was to use this as a namespace. Perhaps something like:

import "crypto";
let encrypted_message = crypto::aes256_block("key", "message");

The use of crypto:: in this code would be the signal to Rhai to find the plugin with the name of crypto, and then invoke MyCryptoPlugin::call (the function generated by the procedural macro) without any regular function lookup steps.

Has any previous work on modules or packages nailed down a syntax I could hook into?

schungx commented 4 years ago

Has any previous work on modules or packages nailed down a syntax I could hook into?

Not really, so you are free to "modularize" the plugins as you see fit. However, I'd say let's keep the namespacing/modules syntax in sync so we don't get into a conflict later on. At the least we'd need to support user-defined namespaces/modules which can be imported in the same manner.

And are you going to support importing just one function inside a module?

schungx commented 4 years ago

(This might be the case already. It is difficult for me to follow the low-level type system APIs you are using.)

Unfortunately the particular design used in Rhai depends heavily on dyn Any trait objects, meaning that their types are erased. Just about the only thing you can do is to use is::<T>() or the TypeId to check if it is of a particular type - you need to know the type you're checking in advance.

For example, you won't even know if a dyn Any type is Clone or Copy etc. This means that any use of a type outside of the built-ins in Dynamic (e.g. u16) will be converted into a heap-allocated, boxed trait object of dyn Any with all type info erased except its TypeId.

This is the way Rhai handles custom objects that it doesn't know about.
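
A minimal sketch of that limitation in plain Rust, outside of Rhai:

use std::any::Any;

// Once a value is behind dyn Any, the only recoverable information is its
// TypeId -- you must already know a type in order to test for it.
fn describe(value: &dyn Any) -> &'static str {
    if value.is::<i64>() {
        "an i64"
    } else if value.is::<String>() {
        "a String"
    } else {
        "some type we would have to know about in advance to name"
    }
}

fn main() {
    let boxed: Box<dyn Any> = Box::new(42_u16);   // e.g. a u16 gets boxed up
    println!("{}", describe(boxed.as_ref()));
}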

schungx commented 4 years ago

mod CoreMath {
    use either::Either;   // e.g. the Either type from the either crate

    pub fn add(x: Either<INT, FLOAT>, y: Either<INT, FLOAT>) -> Dynamic {
        match (x, y) {
            (Either::Left(i), Either::Left(j)) => Dynamic::from(i + j),
            (Either::Right(i), Either::Right(j)) => Dynamic::from(i + j),
            (Either::Left(i), Either::Right(j)) => Dynamic::from(i as FLOAT + j),
            (Either::Right(i), Either::Left(j)) => Dynamic::from(i + j as FLOAT),
        }
    }
}

Usually this is enough, if we're only supporting INT and FLOAT without regards to other integer types. So far, most of the use for generics is actually for the wide list of integer types, and things like print, to_string that have the same name working on different types.

to_string is actually a good example. Remember, Rhai functions can be called method-style, meaning you should be able to do:

import aes256_block from "crypto";
let encrypted_message = "key".aes256_block("message");

Meaning that a function call may not always be possible to be qualified with a namespace. Of course, you can force everybody to use namespaced calls instead and disallow method-style, but that would be very un-Rhai...

For a function like to_string, naturally you have tons of implementations, one for each type.

Now, imagine somebody writes a new plugin handling yet another data type. He/she would obviously want to implement to_string for those types, and be able to just call it postfix-style. Therefore, you STILL need to have a way to resolve the same function name purely based on the parameter types, just to handle this kind of overloading.

In your system, everything must be pre-baked-in. That is, you'll have one to_string function with a massive switch statement inside that detects all the types it can handle. If the user adds a new type in a new plugin, the stock to_string implementation won't know about it.

schungx commented 4 years ago

The plugins branch is caught up with the latest changes - nothing touched in the function libraries other than minor bug fixes.

jhwgh1968 commented 4 years ago

a function call may not always be possible to be qualified with a namespace. Of course, you can force everybody to use namespaced calls instead and disallow method-style, but that would be very un-Rhai...

I am currently prototyping the simplest answer to that question: a module can only define receiver methods on types that it exports. When a type is imported by Rhai, all receiver methods come with it.

This is sufficient for my use case, which is about writing types and functions in Rust, and easily exporting them to Rhai. But I recognize that this will not suffice for cases where users want to extend the standard library, or do a more dynamic use module::trait the way that Rust can.

In other words, my current prototype allows this Rhai code:

import "crypto";
// use a type in the crypto module, which is now in global namespace
let cipherstate = Aes256::CBC::new("my key", 0 /* the IV */);
let block_one = cipherstate.encrypt("message 1");
block_one /* returns as a string, or perhaps a JS like ArrayBuffer type */

But not something like this, which is what you seem to be describing:

let s = "abc";
let r = s.reverse(); // ERROR: no such method "reverse" on type "string"
import "string_utils" with traits;
let r = s.reverse(); // OK, because the plugin defined "reverse" on &mut String

While it would be easy to write a separate attribute to tie into such syntax, there is the problem of conflicts.

Currently, Rhai plugin loads are presumed infallible. I wasn't going to write them this way at first, but everything they do to the Engine seems to be infallible. It's not clear what happens if two modules try to add a reverse method, for example, to the built-in string.

If there were two calls to register_receiver_fn, with the same name and signature, would the second one panic? Or would it just overwrite the first?

jhwgh1968 commented 4 years ago

In addition, let me go back to to_string for a moment.

The function to_string has an implementation for a lot of different types. As you said, what if a user wanted to print their own type with to_string?

My off-the-cuff answer would be: there is a global to_string the way there is now, and if a new type is defined in a plugin, it could "derive" that one:

mod Crypto {
    pub struct CryptoState { /* ... */ }
    impl CryptoState {
        #[rhai::plugin::derive(to_string)]

        /* other explicit methods here */
    }
}

The derive would simply cause code something like this to execute:

engine.register_receiver_fn("to_string", |st: &mut CryptoState| {
    // get the current global to_string function
    let dyn_fn_ptr = engine.get_dynamic_fn("to_string");
    // execute it on our parameter, returning the result
    dyn_fn_ptr(Dynamic::from(st))
});

Code achieving that is TBD.

schungx commented 4 years ago

If there were two calls to register_receiver_fn, with the same name and signature, would the second one panic? Or would it just overwrite the first?

I believe the second one overwrites the first right now...

schungx commented 4 years ago

Regarding to_string, currently string handling in Rhai is a bit ad hoc...

For a new type, you need to define to_string, print (which lets it print), debug (which lets it debug-print), as well as the +(string, type) and +(type, string) operators to do string concatenation.

Not only that, you'd also want to override Array.push to let it be added into an Array, plus maybe a few other combinations to work with other standard types.

Not the best design, I admit, but that's how it is originally structured. Any ideas on your side to really streamline/automate this would be appreciated!

schungx commented 4 years ago

import "string_utils" with traits; let r = s.reverse(); // OK, because the plugin defined "reverse" on &mut String

Can we just do:

import "string_utils";

if reverse is defined there?

jhwgh1968 commented 4 years ago

Can we just do:

import "string_utils";

if reverse is defined there?

The only reason I wrote that was that I was thinking about Rust's #[macro_export].

They wanted you to opt in to procedural macros, because the macros would show up in the global namespace. That might cause conflicts you didn't want.

In this case, the conflicts would be between modules, if "traits" were always imported:

import "string_utils"; // contains a reverse method on &mut String
import "string_cesar_cipher"; // also contains a reverse method on &mut String. Oh no!

The with traits is outside of the namespaces proposal, though, since I haven't planned that far ahead.

schungx commented 4 years ago

Good point here, about potential name conflicts. In that case, even if we force the user to use with traits, there is always a chance that the user wants one trait from one module and another trait from another module, which happens to have a conflicting function.

We'd probably need to do something similar to JS:

import reverse from "string_utils";

s.reverse();

or in case of conflicts:

import reverse as reverse1 from "string_utils";
import reverse as reverse2 from "my_string_utils";

s.reverse1();
s1.reverse2();

And if we have this, we might as well open this up to all other imports:


import "crypto";           // crypto::encrypt(...);    import all under namespace
import * from "crypto";     // encrypt(...);   import all under global
import encrypt as en from "crypto";    // en(...);
import { ??? as ???, ???, ??? as ??? } from "????";    // multiple imports

schungx commented 4 years ago

@jhwgh1968 I've kept the plugins branch up-to-date with the modules work.

Now you can program your macros to the modules. Except there is no way to do the with traits thing right now to dynamically add methods to existing types...

Your idea of mapping an import statement to an entry in the current Scope works wonders and basically makes modules relatively free to implement. You have any good ideas on how to do the with traits?

jhwgh1968 commented 4 years ago

The with traits idea is something I have not spent much thought on, I'm afraid. I just got time to review your modules implementation, and that is what I am currently focused on.

My design instinct tells me:

  1. Split trait functions into a separate FunctionsLib in every Module.
  2. Make get_fn and friends take an extra boolean that means "include traits in the search."
  3. Add a boolean to the Stmt::Import enum in the AST, to indicate whether with traits was provided (which is itself only two tokens, or perhaps even one).
  4. Have the engine set the get_fn boolean based on the Stmt::Import flag used on the corresponding module.

Does that help? I know this leaves unanswered questions, such as "how do I know this trait is for another type, rather than my type?" but I am deferring those until you can answer a higher priority question below.


As for the modules implementation, it took me a while to get my head around what I need to do for plugins, but I think I can make progress.

In particular, I hope to commit a basic plugins implementation soon, which can create new Modules and attach them to a StaticModuleResolver.

However, I'm only committing it because I am a big fan of the old saying, "do it first, do it well second." If I were to start on the procedural macro from that PR, it would fail to achieve some of the benefits, and might make some of the features we discussed above more difficult to implement.

Everything I wrote up there relied on my low-overhead lookup strategy: resolve the namespace to a plugin ahead of time, but defer resolving namespace members until a script accesses them. That is what the macro would generate.

Your current modules implementation is still incompatible with that approach. A plugin can now enable the StaticModuleResolver, and then set_fn and set_var for all of its contents; but that will box everything up into Dynamic and Arc<FnDef>.

I think my approach would require writing a PluginModuleResolver, which could perform the namespace lookups in a different way; and that would then conflict with a significant amount of your code.

In the process of writing this post, I now realize I need to separate these two related ideas in my head:

  1. Plugins based on procedural macros help Rhai users with the syntax of adding Rust items into the Rhai namespace.
  2. Plugins based on procedural macros help Rhai users add Rust items into the Rhai namespace with less overhead than was previously possible.

How would you rate those, @schungx? Is one worthwhile without the other? I remember you expressed praise for the procedural macro idea, but it might have just been item 1 you were thinking of.

schungx commented 4 years ago

do it first, do it well second

Wise words indeed! That's how I define "hacking".

I think my approach would require writing a PluginModuleResolver, which could perform the namespace lookups in a different way; and that would then conflict with a significant amount of your code.

I don't think a custom module resolver can help in this case, because it seems like you'd want to lazy-load functions and variables, defer until the first time they're used. Correct? That means that, when the module is attached to an Engine, it doesn't contain anything.

However, currently in the interest of speed, I hash up all the functions and variables in the entire tree once a module is loaded. Doing this has the great advantage of making module-qualified function calls and variable access as fast as normal. However, this also means that you'd need to have all of them ready in the very beginning. Sort of a catch-22...

but defer resolving namespace members until a script accesses them

If this is lazy-loading, I think that's a wonderful idea. Let's not lose it just because of the current architecture. Let's think of a way to have our cake and eat it.

How would you rate those

Well, I must admit I was thinking of No.1. I hadn't really thought about No.2 as being possible at all, until you raised it just now. If it is at all possible, then No.2 is clearly the better solution. That would make loading up Engines extremely fast, because none of the built-in functions need to be loaded until they are needed!

However, I would really like to know more about what you're delaying. Which workload is put off until needed? Because I think you'll need a large lookup table just to know the names of all the possible functions to match, and their parameter types... so essentially you're not loading anything less than what's done currently. You're only saving a boxed function pointer, which is one allocation.

jhwgh1968 commented 4 years ago

If this is lazy-loading, I think that's a wonderful idea. Let's not lose it just because of the current architecture. Let's think of a way to have our cake and eat it.

I found a way. It's in my current PR. I call them "lazy modules", in order to separate them from the modules you previously wrote.

However, I would really like to know more about what you're delaying. Which workload is put off until needed? Because I think you'll need a large lookup table just to know the names of all the possible functions to match for, and their parameter types

The trait that lazy modules implement is:

pub trait LazyModuleDispatcher {
    fn call<'e>(&self,
            fn_name: &str,
            args: Box<dyn Iterator<Item=Dynamic> + 'e>
    ) -> Option<Result<Dynamic, EvalAltResult>>;
}

This trait method is directly called by the Engine from the FnCall operation. It is expected to do the lookup and the invocation all at once. The equivalent work on the other side of the if statement, and all the fn_register calls for the contents, are what is being deferred.

With my #[plugin] procedural macro, the macro will generate a module containing a struct implementing that trait, with all the module's contents tied into it. It would be a large table for something like the Rhai standard library, yes. But the key is: it is in code, rather than a data structure.

The current implementation (shown in my test case) is not the most optimal, I agree. But since it is arbitrary code, it can simply be made more efficient over time.

If you can come up with a template in the procedural macro to do the name and argument hashing on the first call of every module? Great. If Rust's const fn feature becomes more powerful, and the procedural macro can insert a pre-calculated table? Even better.

Aside from being simpler for me to understand, having a code-based lookup -- call this function, don't worry about what it does -- allows better flexibility and options for optimization in the future.

That is the promise I see, and why I have been pushing on this idea so hard.

schungx commented 4 years ago

But the key is: it is in code, rather than a data structure

A large match statement or a large chain of if ... else if ... matching names and parameter types is probably going to be slower than a hash lookup.

If you can come up with a template in the procedural macro to do the name and argument hashing on the first call of every module

There is already such a function. It is crate::calc_fn_hash. It is used everywhere to hash functions, modules, and/or parameter types. I usually pre-calculate these hashes into the call site.

There is already code in Modules to index the entire modules chain once it is loaded. The entire modules path plus the function name plus the number of arguments are hashed at once, and cached at the module root. This way lookups are extremely fast - they don't have to search deep into the modules structure.

The hash is calculated in two separate steps: the first hashes the call with placeholders for the argument types, and the second hashes just the actual argument types. The final hash is the XOR of these two hashes, but this is an implementation detail.

When the time comes to do a function call with real parameter types, a pre-calculated hash with placeholder types is XOR'ed with a hash of the real argument types to form the final hash. That hash is used to find the function in the modules tree.
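
A rough sketch of that two-step scheme (this is just an illustration with the standard library hasher, not the actual crate::calc_fn_hash implementation):

use std::any::TypeId;
use std::collections::hash_map::DefaultHasher;
use std::hash::{Hash, Hasher};

// First step: hash the qualified name and argument count, with the concrete
// argument types still unknown. This part can be pre-computed at parse time.
fn hash_name(modules: &[&str], fn_name: &str, num_args: usize) -> u64 {
    let mut h = DefaultHasher::new();
    modules.hash(&mut h);
    fn_name.hash(&mut h);
    num_args.hash(&mut h);
    h.finish()
}

// Second step: hash only the actual argument types, known at call time.
fn hash_arg_types(types: &[TypeId]) -> u64 {
    let mut h = DefaultHasher::new();
    types.hash(&mut h);
    h.finish()
}

fn main() {
    let script_hash = hash_name(&["mod1", "mod2"], "func", 2);
    // XOR the two parts together to get the final lookup key.
    let call_hash = script_hash ^ hash_arg_types(&[TypeId::of::<i64>(), TypeId::of::<i64>()]);
    println!("{:016x}", call_hash);
}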

If we simply separate the "lazy-loaded" functions from normal ones, we can have a separate function Module::find_lazy_function. We can call that first, and you can return a trait object.

Aside from being simpler for me to understand, having a code-based lookup -- call this function, don't worry about what it does -- allows better flexibility and options for optimization in the future.

This is a great idea. We can simply modify FnAny to be a struct that implements that trait and have all module functions return a trait object. Then use the call API to make the call. Your lazy module does the magic behind the scenes.

The more I think about this, the more this is beautiful. We should change FnAny which currently is a type def only holding a boxed function pointer. We can make it struct FnAny that implements Callable with a call API. In this way, we unify all the functions in the whole system in one single blow!

If you look at

https://github.com/schungx/rhai/blob/master/src/engine.rs#L599 https://github.com/schungx/rhai/blob/master/src/engine.rs#L1479

you can see that I basically just call the function. This can easily be replaced by calling a function from a trait object instead, and none of the rest of the code needs to change.

schungx commented 4 years ago

pub trait LazyModuleDispatcher {
    fn call<'e>(&self,
            fn_name: &str,
            args: Box<dyn Iterator<Item=Dynamic> + 'e>
    ) -> Option<Result<Dynamic, EvalAltResult>>;
}

Don't forget, not all functions take all their parameters by value. Some mutate the object (e.g. String::trim), so you need to have a version with &mut parameters.

You can make args: &mut [&'a mut Dynamic] to have it be a slice of &mut Dynamic instead of Dynamic values, and then dereference the value parameters with mem::take (yes, you can consume the value parameters). This is how register_fn does it. You don't need to pass an iterator because, everywhere functions are called, the arguments are already collected into a slice.
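
A minimal sketch of that extraction pattern (the add function here is just for illustration):

use rhai::{Dynamic, INT};

fn add(args: &mut [&mut Dynamic]) -> Dynamic {
    // Take each argument out of its slot by value (leaving a default Dynamic
    // behind), then downcast it to the concrete type expected.
    let x = std::mem::take(&mut *args[0]).cast::<INT>();
    let y = std::mem::take(&mut *args[1]).cast::<INT>();
    Dynamic::from(x + y)
}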

I suggest we make the following changes:

pub trait Callable {
    fn call(&self, args: &mut [&mut Dynamic], pos: Position) -> Result<Dynamic, Box<EvalAltResult>>;
}

pub struct CallableFunction(Box<FnAny>);

impl Callable for CallableFunction {
    fn call(&self, args: &mut [&mut Dynamic], pos: Position) -> Result<Dynamic, Box<EvalAltResult>> {
        self.0(args, pos)
    }
}

pub type NativeFunction = Rc<CallableFunction>;

schungx commented 4 years ago

@jhwgh1968 I have been pondering a question, so maybe you can help give some thoughts.

It is regarding modules, plugins and "packages".

Right now, they have very similar behavior, with only minor differences:

Type     | Namespace                                   | Access                    | Dynamic? | Lazy-loaded? | Iterators?
---------|---------------------------------------------|---------------------------|----------|--------------|-----------
Module   | loaded into imported modules namespace tree | mod1::mod2::mod3::func()  | Yes      | No           | No
Plugin   | loaded into imported modules namespace tree | mod1::mod2::mod3::func()  | Yes      | Yes          | No
Package  | loaded into global namespace                | func()                    | No       | No           | Yes

The code for packages and modules is almost a duplicate, as are the general register_fn methods. They all duplicate each other.

Is there any way to unify all these concepts?

jhwgh1968 commented 4 years ago

You don't need to pass an iterator because all the places where functions are called, the arguments are already collected in a slice.

I wanted to preserve the iterator trait on the theory a non-slice implementation might be more efficient eventually. However, your last round of changes convinces me that is over-thinking things. I'll make it a slice.

I mentioned elsewhere that NativeCallable might help with return types, but the lookup part is what I am focused on at the moment with the Dispatcher trait. I will integrate with it later, but it does not help me at the moment.

Is there any way to unify all these concepts?

My hope was that plugins would unify them, but that may be more difficult than I thought. Allow me to provide some high-level thoughts of my own.

The more I learn about the dynamic casting and function invoking system, the more I think: wow, this is not how I would have done this at all. To be clear, that is intended to be a neutral statement. If I were to try and do it over, I might end up making some of the same decisions, or losing benefits from making others.

It is simply something I have not fully wrapped my head around, and have to re-discover the details of whenever I put this down and pick this up. To explain, let me tell you what background informs my expectations.

I know something about the way the Python 2.x interpreter and the Perl 5 interpreter are written. Both of them are fairly simple, but have extensibility as a key tenet. From the inside out, they consist of the following, with fairly clean boundaries between each piece:

  1. An opaque, flexible type to represent a value in the target language. (Equivalent to Rhai's Dynamic.)
  2. A bunch of utility functions for casting these values between the target language and the source language.
  3. An "interpreter core" which does syntax analysis and execution of a given program in the target language.
  4. A "plugin system" for code in the source language, which code in the target language sees no differently, but the interpreter knows how to call with an escape hatch.
  5. A number of source language plugins, which correspond to "basic operations" on various types, and glue code to the host OS. These are the backbone of the standard library.
  6. A number of target language modules, which fill out the standard library. Some are fully native, while those that are not wrap and simplify those plugins.

(Note: all statements below about Rhai are my original impressions, and things may have improved somewhat thanks to our efforts.)

Rhai also consists of these pieces, but it's item 4 that felt furthest from my expectations, both in the amount of code needed and in the amount of dynamic stuff on the heap required.

In the case of Python and Perl, the source language is C. The code in item 4 can rely on the C language and compiler to do a lot of the work, using type definitions and pointers in clever ways. All functions called in this way are statically defined, since C doesn't have a concept of closures, so it's just a bit of ABI trickery that makes up the special opcode.

The C functions called must then follow an API specification and a template: call this to get your args, call these functions to cast things, do your work, and return a result this way, or throw an exception that way.

That is what I expected Rhai to be, more or less. Want to call something natively written in Rust? Just make your Rust function according to a particular template -- even if it's an ugly one -- point Rhai at it, and it will "just work."

Rust, of course, means the rules are different. Weak typing and clever casting are things it does not like to do, as they are sources of UB. However, even despite that, it seems this layered approach is not how Rhai is designed.

Rhai seems to be much more like PHP, where any native code is less like a "module", and more like an extension of the engine's behavior. register_fn did not take a function pointer, for example, but a boxed closure, which it put into the global namespace from a language point of view.

That high-level approach is what I am trying to change at a fundamental level. That is the change I think unlocks the real potential of Rhai to become a small but real language, rather than a lightweight way to run simple scripts.

That is why I am thinking about native and "lazy" calls -- things akin to what the Python and Perl interpreters do, rather than things baked into the engine before the script runs. That is the keystone to building out a good user experience with Rust's language tools, like procedural macros to handle all the template generation.

What is not quite clear to me is: what parts of the way Rhai currently works should continue, even in this new paradigm?

Those are my questions at the moment, and why I hesitate to go too far out on a limb. Because even if you like the idea -- and you seem to be warming to it more and more, as we discuss different aspects -- the amount of rework involved is tremendous.

schungx commented 4 years ago

Rhai seems to be much more like PHP, where any native code is less like a "module", and more like an extension of the engine's behavior. register_fn did not take a function pointer, for example, but a boxed closure, which it put into the global namespace from a language point of view.

I suppose the main difference is that Rust has no GC, and zero-cost abstractions forced closures to be designed as plain types. So all function pointers must be boxed closures, since a closure is a type that keeps its closed-over environment alive, and there is no GC to clean it up later on.

That is why I am thinking about native and "lazy" calls -- things akin to what the Python and Perl interpreters do, rather than things baked into the engine before the script runs. That is the keystone to building out a good user experience with Rust's language tools, like procedural macros to handle all the template generation.

The way to make it lazy is to: 1) compile a plugin as a C shared library (cdylib) written to a C API, 2) dynamically load the shared lib from Rhai on demand, and 3) call the function.

That would make Rhai able to dynamically load separate plugin files (similar to how DLL's are used in the Windows world). My own project has a mechanism like this and it works well, although I haven't used it to dynamically load plugins into Rhai... it is used in my project to load device drivers.

Is the hashed lookup of names an implementation detail, or is that part of the API for the template?

I'd say it is an implementation detail and not absolutely necessary. However, with the namespace of functions getting large, we'll need an efficient way to lookup a particular function short of a huge match statement that compares strings one by one. Don't forget since Rhai has function overloading, you have to check argument types as well. Then the fact that all operations convert into function calls means there are a lot of function calls.

Not to mention functions buried deep under a modules tree (when your plugins work is done, I'd suppose we'll have an explosion of the number of modules).

So the hashing is simply an optimization mechanism because it can be pre-computed during parse time.

Are the different access behaviors between the table items actually features, or could they all be reduced to one kind of thing?

I think they can all be reduced to the same codebase. I already junked the packages code and re-implemented packages on top of modules. Now there are only modules. In the future, when you're done, I anticipate moving modules to your plugins.

What value is there in extending the engine versus writing a plugin? What APIs should be on which boundaries?

Extend the engine when you know the feature will always be there. It'll be compiled in, inlined and optimized by Rust. The result is one single executable file.

A plugin, if it resides in the same main app program and compiled in, is really no different from an engine extension.

A plugin, if residing in an external file (e.g. a cdylib, .dll or .so shared library), can be hot-loaded and can be discovered during runtime. Now that's a completely different level of capabilities. I believe that's what you're thinking of for your plugins system, right?

The entire language library can simply be provided as a huge collection of .so or .dll files and dynamically (lazy) loaded when used.

schungx commented 4 years ago

My suggestion is this:

First, there must be a way to load a plugin as a module. You can leverage the existing modules resolution mechanism, maybe introducing a PluginsModuleResolver. There are the dylib and libloading crates that do this loading.

import "std/core/string" as str;   // load std/core/string.so from file-system

The Engine then keeps a cache of hot-loaded plugins. If a plugin is already loaded, it won't be loaded again.

If it is not yet loaded, it first loads the C shared library file from disk.

It doesn't have to be a C library... it can be a Rust dyn-link library (dylib), which makes it easier to integrate with Rust. However, using a C API has the benefit that a plugin can be written in C or any other language.
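
A hedged sketch of the loading step using the libloading crate mentioned above (the exported symbol name rhai_plugin_entry and its signature are made up for illustration):

use libloading::{Library, Symbol};
use std::error::Error;

fn load_plugin(path: &str) -> Result<(), Box<dyn Error>> {
    unsafe {
        // Load the shared library (e.g. std/core/string.so) from disk.
        let lib = Library::new(path)?;
        // Look up a known entry-point symbol and call it.
        let entry: Symbol<unsafe extern "C" fn() -> i32> =
            lib.get(b"rhai_plugin_entry\0")?;
        let status = entry();
        println!("plugin '{}' initialized with status {}", path, status);
    }
    // NOTE: the Library must be kept alive (e.g. cached by the Engine) for as
    // long as any of its symbols are still in use; dropping it unloads it.
    Ok(())
}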

Handle implementation

The engine then uses a standard entry-point to discover all the functions/constants inside this plugin. It could be something that takes a callback closure from Rust and, on repeated calls, returns info on one single function or constant each time, returning NULL when done.

The Engine then takes this information and caches it all for fast lookups.

Obviously, for each function, the plugin should return some form of "handle" (which doesn't have to be a hash), maybe simply an offset to an internal array or something. It can even be a simple function pointer. A function call can simply use this "handle" instead of having to pass the function name + parameter types + ABI all over again.
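
A hedged sketch of what that discovery entry point might look like over a C ABI (all names and the exact fields of FunctionInfo are made up for illustration):

use std::os::raw::{c_char, c_int};
use std::ptr;

#[repr(C)]
pub struct FunctionInfo {
    pub name: *const c_char, // null-terminated function name
    pub num_args: c_int,     // arity, for overload resolution
    pub handle: usize,       // opaque handle the engine passes back when calling
}

// The engine calls this repeatedly with 0, 1, 2, ... caching each descriptor
// it gets back, and stops as soon as a null pointer is returned.
#[no_mangle]
pub extern "C" fn rhai_plugin_next_fn(_index: usize) -> *const FunctionInfo {
    // A real plugin would index into its internal table of exported
    // functions here and return a pointer to the matching descriptor.
    ptr::null()
}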

Alternative implementation

The same plugin-loading mechanism as above. But, after loading the module, do nothing.

When a call to a module function is made, the engine calls a C API entry-point on that plugin library, passing it the function name, parameter types, and actual parameters, and the ABI type (e.g. the first parameter may be a reference pointer).

The C API then dispatches the function internally based on its own mechanism that the scripting engine doesn't need to know.

This method is much simpler and better encapsulated - you can completely change the internal mechanism of the plugins without affecting the engine. Development of plugins and the engine can proceed separately. However, the downside is always having to pass a lot of information on every single module function call, which will eventually impact performance.

I think the handle method is better for performance, and that's how most C-style programs do it. One additional benefit of the handle method is the ability to polymorph the object behind the handle. In the alternative method, a lot more information is exposed, and it can only realistically map to a function call.

jhwgh1968 commented 4 years ago

What value is there in extending the engine versus writing a plugin? What APIs should be on which boundaries?

Extend the engine when you know the feature will always be there. It'll be compiled in, inlined and optimized by Rust. The result is one single executable file.

A plugin, if it resides in the same main app program and compiled in, is really no different from an engine extension.

A plugin, if residing in an external file (e.g. a cdylib, .dll or .so shared library), can be hot-loaded and can be discovered during runtime. Now that's a completely different level of capabilities. I believe that's what you're thinking of for your plugins system, right?

I actually wasn't planning on going that far. Rust does not have a stable ABI yet, so I don't think the return on investment is there. As you say, we would either (a) have to make sure plugins are compiled with the same version of the Rust compiler, or (b) marshal a lot to C and un-marshal it on the other side.

In retrospect, perhaps "Plugin" was an imprecise word. Many of the extra "plugins" I listed in Python and Perl are either statically linked, or tightly dynamically linked (not lazily loaded on demand) into their "engine". They are separate code because they are not required by the core itself, even though they are part of the language as a whole.

In Perl, for example, the interpreter core is called "miniperl". You can compile it by itself, and you will get a dumb REPL that is less than half the size of full Perl. However, that interpreter is basically unusable for anything most people want to do in Perl. Among other limitations, it can't read a script from anywhere but stdin, it can't use a number of "obvious" system calls, and it has an extremely trivial pattern matching engine which cannot handle even most POSIX regexes.

When you perform a normal build of Perl, you build libraries that are statically linked to "miniperl" in order to create the interpreter you actually use. These are all deeply linked into the core, and the final Perl executable cannot function without them. One example is PCRE, the Perl Compatible Regex library, which has hooks to override the pattern matching in the core. While PCRE is sometimes dynamically linked because it is a library in its own right, it is still eagerly loaded and required.

That is more like what I was envisioning with plugins in Rhai. They are Rust code that "create the environment" for the script, either fleshing it out into a full interpreter with access to the system, or providing APIs suitable for making it useful and extensible when embedded.

The latter case, in particular, is why I have been focused on how easy it is (or isn't) to expose complex types or a suite of functions to Rhai, with thoughts of procedural macros.

jhwgh1968 commented 4 years ago

Perhaps, as you say above, "extensions" would be a better term.

I suppose I did not use it, because that makes me think of Firefox -- where extensions prior to FF 52 could be deeply invasive, and break everything without regard to API boundaries.

That is not what I was thinking. The engine would have an API that these extensions could hook into, and limit what they could do to the language as a whole.

schungx commented 4 years ago

What you're describing is similar to what an assembly is in .NET world. It is an essential part of the running app, without it the app won't run. However, they can be compiled separately and then dynamically linked. The ABI is stable, so plugins really are simply the same assemblies, but not required for the app to run.

In Rust, it is more difficult to do what you want because Rust has no reflection API. There is no way to "include" a crate/rlib in a build and then have the app automatically discover that it is there. The app must already know that it is there, perhaps through some entry in a lookup table, or a configuration file that is read during build to specify the entry-point function names to call. In Perl I assume the build stage has macros feeding into the C source that provide linkages to compiled-in extensions.

To do your extensions, it might actually be better to compile them into dylibs, which are Rust-ABI crates that can be dynamically loaded and linked. Then you only need to scan a directory to find all the files to load.

And if you do that, you might as well use the standard C ABI. It really is very simple to expose a C-standard ABI from Rust, as well as to consume one. I think this route offers the most flexibility, as you can then easily port code written in C to an extension for Rhai.

jhwgh1968 commented 4 years ago

To try and summarize what happened since I last posted to this thread, I have gotten a basic start on plugins merged into a feature branch.

This vision of plugins focuses on my use case: pieces of Rust code that can modify the engine or expose modules. They are statically linked, defined by a user of Rhai who is embedding Rhai's engine in another project.

Based on the plumbing code that was merged, I will spend some time focusing more on the porcelain: procedural macros similar to what I referenced above.

jhwgh1968 commented 4 years ago

Also, as @schungx has noted, there would be some benefits to folding a number of other features (such as packages) into plugins that are or are not loaded. Since I expect those will look somewhat different than user code, I think that work can be done in parallel, and re-written as I create porcelain pieces that are helpful.

schungx commented 4 years ago

Looking forward to the proc macros!

Actually, the closer it is to user code, the better. That's because users would want the same power of expression when writing extensions to the language as the language authors.

schungx commented 4 years ago

Oh, by the way, @jhwgh1968 , I'll soon push a commit that transparently converts between &str and ImmutableString. That's because users will get sick of having to write wrapper closures simply to turn their String parameters into ImmutableString; otherwise Rhai won't find them.

So in your proc macros, you might want to take advantage of this. You simply call set_fn with TypeId::of::<ImmutableString>() (if the parameter is &str) and Rhai does the rest. The parameter itself can be kept as &str.

schungx commented 4 years ago

The commit is in, and it is merged into plugins. You should now handle parameters of &str in your macros.

schungx commented 4 years ago

@jhwgh1968 a couple of things for you to consider when you implement your proc macros:

1) Now that &str maps to ImmutableString as a convenience, there is a function by_value in fn_register.rs that simply converts a &mut Dynamic into whatever parameter type is required. It consumes the value pointed to by the reference and returns an owned value of the appropriate type. It takes care of &str as well (i.e. it will convert an ImmutableString into a &str). You can simply use this function to downcast function parameters from Dynamic, and it takes care of &str automatically.

2) Rhai supports two calling ABI's: all-owned parameters, and first &mut reference (and the rest owned values). However, for functions generated by your proc macros, you are not under such restriction. In fact, you're free to use any parameter type (including & and &mut) in any parameter position, as long as your proc macro does the correct conversion from Dynamic. This will make plugins extremely versatile, much more flexible than register_fn. However, notice that all parameters other than the first one are cloned anyway, so you don't really save much by using a reference instead of consuming the value. In the future, if the Engine architecture changes, maybe there will be a benefit.

schungx commented 4 years ago

@jhwgh1968 just a heads-up. I have added a parameter to the CallableFunction variants: a reference to the Engine. This is to enable writing functions that rely on some Engine setting, such as checking the maximum allowed size of an array.

It should not affect your PluginFunction variant. However, things have changed a bit, so I suggest changing it to this function call signature:

fn call(&self, engine: &Engine, args: &[&mut Dynamic]) -> Result<Dynamic, Box<EvalAltResult>>;

First of all, the pos parameter is dropped. Use Position::none() for the EvalAltResult. The error path is so rarely used that we don't really want to pass around a position all the time just for it to be thrown away. The error position is set outside the function call.

Second, add an engine reference parameter to the running Engine. This allows your plugin to query the Engine for certain runtime settings (e.g. data size limits).

jhwgh1968 commented 4 years ago

Just so you know, I have been seeing these comments. I've been pretty busy, but I will make sure they are integrated before I do my next PR.

schungx commented 4 years ago

Great! Take your time! No pressure!

jhwgh1968 commented 4 years ago

After making some progress, I was preparing to open an early PR. However, I just tried to rebase onto the plugins branch, and according to the diff, some of the recent merges from master have completely undone all my prior work.

Could you fix that, please?

schungx commented 4 years ago

completely undone all my prior work.

Huh? I thought I had been careful about this...

Can you let me know what is missing?

Sorry!

I remember one time that, after catching up the plugins branch, I forgot to switch back to master and ended up doing a bunch of work on the plugins branch. I probably should have done a "rebase" or some sort of mysterious Git command to move it back to master, but of course I didn't, and I now forgot what I did...

I think I manually moved the changes back to master, perhaps wiping out something on the way...

schungx commented 4 years ago

OK I can see it now. I'll fix.

It turns out I wiped out almost all the changes...

schungx commented 4 years ago

@jhwgh1968 it should now be fixed.

jhwgh1968 commented 4 years ago

Thanks, @schungx! I'll have a PR open within the next day or two, looking for some early feedback.

jhwgh1968 commented 4 years ago

Update

Version 0.1 of the procedural macros crate is now merged to the plugins branch! :tada:

Current To-Do List

To-Do list is now in this comment

Code Examples

Here are the current top-level code examples from the crate to give an idea of the syntax:

Exporting a Module to Rhai

use rhai::{Engine, EvalAltResult, FLOAT};
use rhai::plugin::*; // needed for macro hygiene
use rhai::module_resolvers::*;

#[rhai::export_module]
pub mod advanced_math {
    use rhai::FLOAT;

    pub const MYSTIC_NUMBER: FLOAT = 42.0 as FLOAT;

    pub fn euclidean_distance(x1: FLOAT, y1: FLOAT, x2: FLOAT, y2: FLOAT) -> FLOAT {
        ((y2 - y1).abs().powf(2.0) + (x2 - x1).abs().powf(2.0)).sqrt()
    }
}

fn main() -> Result<(), Box<EvalAltResult>> {
    let mut engine = Engine::new();
    let m = rhai::exported_module!(advanced_math);
    let mut r = StaticModuleResolver::new();
    r.insert("Math::Advanced".to_string(), m);
    engine.set_module_resolver(Some(r));

    assert_eq!(engine.eval::<FLOAT>(
        r#"import "Math::Advanced" as math;
           let m = math::MYSTIC_NUMBER;
           let x = math::euclidean_distance(0.0, 1.0, 0.0, m);
           x"#)?, 41.0);
    Ok(())
}

Exporting a Function to Rhai

use rhai::{Engine, EvalAltResult, FLOAT, Module, RegisterFn};
use rhai::plugin::*; // needed for macro hygiene
use rhai::module_resolvers::*;

#[rhai::export_fn]
pub fn distance_function(x1: FLOAT, y1: FLOAT, x2: FLOAT, y2: FLOAT) -> FLOAT {
    ((y2 - y1).abs().powf(2.0) + (x2 - x1).abs().powf(2.0)).sqrt()
}

fn main() -> Result<(), Box<EvalAltResult>> {

    let mut engine = Engine::new();
    engine.register_fn("get_mystic_number", || { 42 as FLOAT });
    let mut m = Module::new();
    rhai::register_exported_fn!(m, "euclidean_distance", distance_function);
    let mut r = StaticModuleResolver::new();
    r.insert("Math::Advanced".to_string(), m);
    engine.set_module_resolver(Some(r));

    assert_eq!(engine.eval::<FLOAT>(
        r#"import "Math::Advanced" as math;
           let m = get_mystic_number();
           let x = math::euclidean_distance(0.0, 1.0, 0.0, m);
           x"#)?, 41.0);
    Ok(())
}

schungx commented 4 years ago

@jhwgh1968 There are some breaking changes. Version 0.18.1 has merged in real closures, so the Dynamic API is changed a bit.

downcast_ref ==> read_lock
downcast_mut ==> write_lock

Just replace these calls and it should work...

I've merged in the latest master into plugins. It doesn't build because of the macros referring to downcast_ref which has been made private.