rhaiscript / rhai

Rhai - An embedded scripting language for Rust.
https://crates.io/crates/rhai
Apache License 2.0

Custom operators and custom syntax #166

Closed zach-schoenberger closed 4 years ago

zach-schoenberger commented 4 years ago

Hi, I am looking at using this library as a replacement for a custom SPEL setup I'm using in Java. In order to do so I need to define custom operators. One specific case is '"str1" contains "str"', where a contains operator is defined. This isn't the exact use case but it is very similar. I have tried using "engine.register_fn" but could not get it to work as needed. I have an extremely large number of expressions already defined, and it is not reasonable to update them all to the function syntax. Is there a way to implement a custom operator in this way?

Ex Code

fn contains(s1: ImmutableString, s2: ImmutableString) -> bool {
    return s1.contains(s2.as_str());
}

engine.register_fn("contains", contains);

let res = engine.eval::<bool>("\"hello world\".contains(\"world\")").unwrap();
println!("WORKS: got result: {}", res);

// returns error ErrorParsing(MissingToken(";", "to terminate this statement"), 1:15)
let res = engine.eval::<bool>("\"hello world\" contains \"world\" ").unwrap();
println!("got result: {}", res);

schungx commented 4 years ago

One specific case is '"str1" contains "str"', where a contains operator is defined.

Well, currently str in str1 works... Maybe you can pre-process the script to turn contains into in?

Alternatively, you can fork Rhai and then change the keywords in token.rs. Change "in" to "contains" and then Bob's your uncle.

https://github.com/jonathandturner/rhai/blob/master/src/token.rs#L776 https://github.com/jonathandturner/rhai/blob/master/src/token.rs#L256

A third method is for me to add custom operators into the feature set itself so any custom operator can be defined. This is not difficult to do, but it can potentially make the language extremely fluid and unrecognizable when everybody defines their own operators.

If you're interested you can fork Rhai and add these operators yourself very easily. I can point you to the relevant source files to change.

zach-schoenberger commented 4 years ago

Thanks for the quick response! I've only started digging through the code to see what it would take to go the "any custom operator" route. The above example is just one of several I would need to do. Just curious, is there a reason why the community would not want that ability? The unrecognizable aspect would only be local to whatever the user has implemented for their system.

Either way I would greatly appreciate any insight as to where to look at for either adding the operators or the custom route if you feel like it.

schungx commented 4 years ago

Well, probably just my personal opinion anyway. We can probably take a poll of the community.

My feeling is: it is one thing to define custom functions and use custom types. The syntax of everything is consistent. A Rhai script looks like a Rhai script.

But operators are part of the syntax, as they are recognized by the tokenizer. Allowing custom operators means extending the Rhai language with unlimited possible keywords. And there is the question of the precedence level of the operator: whether you allow a user to specify how tightly it binds, and whether it binds to the right or to the left.

For example:

let x = foo + 1 bar baz * 2;

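Assuming bar is a custom operator, is it (foo + 1) bar (baz * 2), or foo + ((1 bar baz) * 2), or something else again, depending on how tightly bar binds?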

And that's probably going to make Rhai scripts using custom operators not easily portable. You can't simply copy a Rhai script and reasonably expect it to work.

But of course, this is my own personal opinion...
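
To make the precedence question concrete, here is a minimal sketch (not Rhai's actual parser) of a precedence-climbing routine that fully parenthesizes the expression above. The precedence values 20 for + and 40 for * are assumptions chosen only for illustration, and parse and group are hypothetical helper names. The same token stream groups differently depending on the precedence given to bar.

```rust
use std::collections::HashMap;

/// Parse whitespace-separated tokens into a fully parenthesized string
/// using precedence climbing. Every operator is treated as left-associative.
fn parse(tokens: &[&str], pos: &mut usize, min_prec: u8, prec: &HashMap<&str, u8>) -> String {
    // The current token is assumed to be an operand.
    let mut lhs = tokens[*pos].to_string();
    *pos += 1;

    while *pos < tokens.len() {
        let op = tokens[*pos];
        // Stop if the next token is not an operator that binds tightly enough.
        let p = match prec.get(op) {
            Some(&p) if p >= min_prec => p,
            _ => break,
        };
        *pos += 1;
        // Left-associative: the right-hand side must bind strictly tighter.
        let rhs = parse(tokens, pos, p + 1, prec);
        lhs = format!("({} {} {})", lhs, op, rhs);
    }
    lhs
}

/// Group `expr`, giving the custom operator `bar` the precedence `bar_prec`.
fn group(expr: &str, bar_prec: u8) -> String {
    let mut prec = HashMap::new();
    prec.insert("+", 20); // assumed value, for illustration only
    prec.insert("*", 40); // assumed value, for illustration only
    prec.insert("bar", bar_prec);
    let tokens: Vec<&str> = expr.split_whitespace().collect();
    let mut pos = 0;
    parse(&tokens, &mut pos, 0, &prec)
}

fn main() {
    let expr = "foo + 1 bar baz * 2";
    println!("{}", group(expr, 10)); // bar binds looser than +
    println!("{}", group(expr, 30)); // bar binds between + and *
    println!("{}", group(expr, 50)); // bar binds tighter than *
}
```

With these assumed values, the three calls produce three different groupings, which is the portability concern: a script only means the same thing where bar is registered with the same precedence.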

zach-schoenberger commented 4 years ago

Again, thank you for the response! The precedence issue makes a lot of sense. And I had a feeling it would be an issue as I was going through some of the organization of operator precedence, and it could be something that causes optimization issues. I also had not thought about the portability of a script, mostly I guess because I wouldn't really expect people to share these types of scripts. I can definitely see it as something to be considered.

I do think there is value in adding this type of functionality, seeing as that's the use case I'm looking at :). In this case, if users had created their own scripts using their own custom-defined setup, they likely would not be sharing them with others who did not already have the background knowledge.

Personally I've come across several different embedded scripts in different areas and I've always had no clue what they were for at first. There are so many out there that you can never be sure what type of script it is without some background knowledge.

I'm testing out modifying a fork to allow custom operators like we're discussing. It looks like it could be done with the combination of module.set_fn_2 and updating the procedural macro in register_fn. If that sounds wrong please let me know!

schungx commented 4 years ago

I'm testing out modifying a fork to allow custom operators like we're discussing. It looks like it could be done with the combination of module.set_fn_2 and updating the procedural macro in register_fn. If that sounds wrong please let me know!

I wouldn't suggest touching these macros as you don't need to. All operators in Rhai are implemented as function calls anyway, so the mechanism is already there.

I would say you need to first have some form of an operators registry, perhaps a HashMap that stores special operator keywords. Otherwise the parser will choke on your custom keyword.

Then modify the tokenizer in token.rs where an identifier is parsed. Look up the operators registry to check whether the identifier is a custom operator. If it is, then return a new token variant, which you probably need to define. Something like CustomOperator(String).

When you define the new token variant, you'll need to handle it in the token's standard functions, otherwise Rust will complain. For example, the function mapping the token to its syntax can just return the wrapped string.

Then you'll need to make sure the operator has the right precedence. Look into Token::precedence(). Or maybe you'll allow the user to give a precedence also.

Then, in parse_binary_op in parser.rs (the main function for parsing binary expressions), you'd want to handle the case of this CustomOperator. Likely you'd convert it into a function call, just like other operators.

That function will need to be registered in the normal way. I'd suggest using a special function name, such as operator$contains.

You may also want to handle error messages when an operator function of the correct type doesn't exist.

Test...

Then profit!
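
The first few steps above can be sketched in isolation as follows. This is a toy, std-only illustration and not Rhai's real tokenizer: Token, OperatorRegistry, tokenize, and operator_fn_name are all hypothetical names, string literals are handled without escape sequences, and the precedence 70 is an arbitrary value. It shows only the registry lookup, the CustomOperator(String) token variant, and the mapping to a special function name such as operator$contains.

```rust
use std::collections::HashMap;

/// Toy token type; `CustomOperator` is the new variant suggested above.
#[derive(Debug, PartialEq)]
enum Token {
    Identifier(String),
    StringConst(String),
    CustomOperator(String),
}

/// Hypothetical operators registry: keyword -> precedence.
struct OperatorRegistry {
    ops: HashMap<String, u8>,
}

impl OperatorRegistry {
    fn new() -> Self {
        OperatorRegistry { ops: HashMap::new() }
    }

    fn register(&mut self, keyword: &str, precedence: u8) {
        self.ops.insert(keyword.to_string(), precedence);
    }

    /// Precedence of a token; only custom operators have one in this toy.
    fn precedence(&self, token: &Token) -> Option<u8> {
        match token {
            Token::CustomOperator(op) => self.ops.get(op).copied(),
            _ => None,
        }
    }
}

/// Split a script into string literals and words, consulting the registry
/// whenever a word is read (escape sequences are not handled).
fn tokenize(script: &str, registry: &OperatorRegistry) -> Vec<Token> {
    let mut tokens = Vec::new();
    let mut chars = script.chars().peekable();
    while let Some(&c) = chars.peek() {
        if c.is_whitespace() {
            chars.next();
        } else if c == '"' {
            chars.next(); // skip the opening quote
            let mut s = String::new();
            while let Some(ch) = chars.next() {
                if ch == '"' {
                    break; // closing quote
                }
                s.push(ch);
            }
            tokens.push(Token::StringConst(s));
        } else {
            let mut word = String::new();
            while let Some(&ch) = chars.peek() {
                if ch.is_whitespace() || ch == '"' {
                    break;
                }
                word.push(ch);
                chars.next();
            }
            // The registry decides between identifier and custom operator.
            if registry.ops.contains_key(&word) {
                tokens.push(Token::CustomOperator(word));
            } else {
                tokens.push(Token::Identifier(word));
            }
        }
    }
    tokens
}

/// Name of the function a custom operator would be lowered to.
fn operator_fn_name(token: &Token) -> Option<String> {
    match token {
        Token::CustomOperator(op) => Some(format!("operator${}", op)),
        _ => None,
    }
}

fn main() {
    let mut registry = OperatorRegistry::new();
    registry.register("contains", 70);
    for token in tokenize("\"hello world\" contains \"world\"", &registry) {
        println!("{:?} -> {:?}", token, operator_fn_name(&token));
    }
}
```

A real implementation would still need the parser changes and error handling described in the remaining steps.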

schungx commented 4 years ago

@zach-schoenberger I have been thinking about this some more.

We might as well add the ability to arbitrarily extend Rhai's syntax. Just register a stream of tokens with the Engine, and the engine parses it as a custom statement/expression type.

For example, to add a do-while loop syntax to Rhai:

engine.register_custom_syntax(&[ "do", "$block$", "while", "$expr$" ],
    | engine: &Engine, scope: &mut Scope, state: &mut State, parts: &[ &Expr ] | {
        // implement the custom syntax
        while engine.eval_expr(scope, state, parts[3]).try_cast::<bool>()? {
            engine.eval_block(scope, state, parts[1])?;
        }
        Ok(())
    }
);

Then the script can do:

let x = 0;

do {
    x += 1
} while x < 42;

I can think of many uses for this in DSL scenarios... what do you think?

zach-schoenberger commented 4 years ago

Sorry, work got really busy and I haven't had a chance to circle back to this. That's a really cool idea! I think what I was trying to do would, at its root, fall under this. This does so much more though. And this would be extremely useful in DSLs.

schungx commented 4 years ago

However, since Rhai has a top-down parser, it requires a unique keyword starting off any custom statement.

Therefore you cannot really do:

let x = hey my_op you;

You need to do:

let x = do hey my_op you;

by introducing a new keyword do.

Or we can combine custom syntax extensions together with custom operators...

Let me think about it a bit more.

schungx commented 4 years ago

Although it is probably not difficult to add (I can't see any real difficulties), I am still struggling to see whether there is anything achievable with this that cannot be done simply with a function.

This really is nothing more than the ability to define a novel syntax and that syntax can also be handled by writing a simple pre-processor that converts that syntax to function calls.
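
As a rough illustration of the pre-processor idea, the following std-only sketch rewrites the infix form lhs contains rhs into the call contains(lhs, rhs) before the script would be handed to the engine. The helper names split_tokens and rewrite_contains are hypothetical, and the sketch assumes each operand is a single whitespace-delimited token or a double-quoted string literal without escape sequences. Trailing punctuation such as a semicolon attached to an operand is not handled.

```rust
/// Split a script into tokens: double-quoted string literals are kept
/// intact (quotes included); everything else is split on whitespace.
fn split_tokens(script: &str) -> Vec<String> {
    let mut tokens = Vec::new();
    let mut current = String::new();
    let mut in_string = false;
    for c in script.chars() {
        if in_string {
            current.push(c);
            if c == '"' {
                // Closing quote ends the literal.
                in_string = false;
                tokens.push(std::mem::take(&mut current));
            }
        } else if c == '"' {
            // Opening quote: flush any pending word first.
            if !current.is_empty() {
                tokens.push(std::mem::take(&mut current));
            }
            current.push(c);
            in_string = true;
        } else if c.is_whitespace() {
            if !current.is_empty() {
                tokens.push(std::mem::take(&mut current));
            }
        } else {
            current.push(c);
        }
    }
    if !current.is_empty() {
        tokens.push(current);
    }
    tokens
}

/// Rewrite every `<operand> contains <operand>` into `contains(<operand>, <operand>)`.
fn rewrite_contains(script: &str) -> String {
    let tokens = split_tokens(script);
    let mut out: Vec<String> = Vec::new();
    let mut i = 0;
    while i < tokens.len() {
        // A bare `contains` word with an operand on each side is the operator.
        if tokens[i] == "contains" && !out.is_empty() && i + 1 < tokens.len() {
            let lhs = out.pop().unwrap();
            let rhs = &tokens[i + 1];
            out.push(format!("contains({}, {})", lhs, rhs));
            i += 2;
        } else {
            out.push(tokens[i].clone());
            i += 1;
        }
    }
    out.join(" ")
}

fn main() {
    println!("{}", rewrite_contains("\"hello world\" contains \"world\""));
    println!("{}", rewrite_contains("a contains b && c contains d"));
}
```

Because string literals are tokenized with their quotes, the word contains inside a string is left alone. Anything beyond single-token operands would need a real expression parser, which is where built-in support for custom operators becomes attractive.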

zach-schoenberger commented 4 years ago

You're not wrong that it could be addressed via pre-processing. It just depends on how much control you want to give users of Rhai in terms of DSL. IMO the more ability you give users to customize their script's DSL, the more likely they are to use it. It's definitely a niche nicety that would be ignored by the general user, but for some it is a game changer.

schungx commented 4 years ago

I'll spend some time to give it a shot and do a version. Let's see if it is really useful...

schungx commented 4 years ago

In the process of adding custom syntax, I've completed custom operators.

Try it out from my fork: https://github.com/schungx/rhai

The API is Engine::register_custom_operator("contains", 70).

The 70 is the operator's precedence. 70 makes it the same as the in operator.

zach-schoenberger commented 4 years ago

Awesome! I will try it out as soon as I can.

schungx commented 4 years ago

Beware that I changed the precedence table to spread out the values a bit more.

So now in's precedence is 130.

zach-schoenberger commented 4 years ago

Yep. I really appreciate the in-depth readme on the topic. It makes it clear how to use this feature.

schungx commented 4 years ago

Is it working for you?

zach-schoenberger commented 4 years ago

Sorry just got to test it now. Works great!

zach-schoenberger commented 4 years ago

Here is the dumbed down main use case I was looking for and it works great:

use rhai::{Engine, EvalAltResult, RegisterFn, Scope};

fn main() -> Result<(), Box<EvalAltResult>> {
    let mut engine = Engine::new();
    engine.register_custom_operator("contains", 110);
    engine.register_fn("contains", contains);

    let str_vec = vec!["world".to_string(), "hello".to_string()];
    let mut scope = Scope::new();
    scope.push("StrVec", str_vec);

    let result =
        engine.eval_expression_with_scope::<bool>(&mut scope, "StrVec contains \"hello\"")?;

    println!("Answer: {}", result); // prints true

    Ok(())
}

fn contains(str_vec: Vec<String>, value: &str) -> bool {
    for s in str_vec.iter() {
        if s.contains(value) {
            return true;
        }
    }
    return false;
}

schungx commented 4 years ago

Pass the first parameter via reference to avoid copying:

fn contains(str_vec: &mut Vec<String>, value: &str) -> bool {
    for s in str_vec.iter() {
        if s.contains(value) {
            return true;
        }
    }
    return false;
}

schungx commented 4 years ago

https://github.com/jonathandturner/rhai/pull/176 now has custom syntax. You can create custom expression types.

Look into tests/syntax.rs for an example as I haven't finished writing up the documentation yet.

zach-schoenberger commented 4 years ago

Nice. I'll check it out today.

schungx commented 4 years ago

There is now a writeup of custom syntax in the Book here: https://schungx.github.io/rhai/vnext/engine/custom-syntax.html

schungx commented 4 years ago

Version 0.17.0 officially has custom operators and custom syntax. Closing this.