rhaiscript / rhai

Rhai - An embedded scripting language for Rust.
https://crates.io/crates/rhai
Apache License 2.0

Is there any interpolation function supporting byte type? #522

Closed: cn-kali-team closed this issue 2 years ago

cn-kali-team commented 2 years ago

- Or is there a way to convert byte data that contains invalid UTF-8 into a string without loss? In other words, it would be preferable to use strings to represent byte data in scripts.

schungx commented 2 years ago

You need to realize that interpolation is nothing but syntactic sugar for the + operator.

Therefore, the following are equivalent:

let s = `hi ${k}`;    // <- this

let s = "hi " + k;    // <- is equivalent to this

So for your bytes, you can easily do:

let body = "ssssssssssssssssssssss".as_bytes();

let body = "aaa ".as_bytes() + body;

// or...
let body = "aaa ".as_bytes();
body += "ssssssssssssssssssssss".as_bytes();
schungx commented 2 years ago

I'll add a to_blob method to strings in the next version so you don't have to define your own as_bytes.

schungx commented 2 years ago

However, after reading your post more carefully, it seems that you want to use string interpolation, but with BLOBs that hold UTF-8 data. You may want built-in support for building strings out of UTF-8 bytes rather than sub-strings.

If this is what you want, then there is no such built-in support right now. It would not be difficult to add, though. In the meantime, the trick is to do it yourself by defining the + operator for the operand types string and Blob. Something like:

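// Concatenate a string and a BLOB, decoding the BLOB as UTF-8 (lossily).
engine.register_fn("+", |s: ImmutableString, b: Blob| -> ImmutableString {
    if b.is_empty() {
        s
    } else if s.is_empty() {
        String::from_utf8_lossy(&b).into_owned().into()
    } else {
        let mut s = s.to_string();
        let b = String::from_utf8_lossy(&b);
        s.push_str(b.as_ref());
        s.into() // convert back so that every branch returns ImmutableString
    }
});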
cn-kali-team commented 2 years ago

However, invalid UTF-8 sequences will be lost. I just want to convert a string to bytes, not bytes to a string, because converting from bytes to a string loses data.
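
The data loss described here can be reproduced in plain Rust, independent of Rhai. The following is a minimal sketch using only the standard library; `lossy_round_trip` is a hypothetical helper name chosen for illustration and is not part of Rhai:

```rust
/// Converts bytes to a string lossily, then back to bytes.
/// Every invalid UTF-8 sequence is replaced by U+FFFD (bytes EF BF BD),
/// so the result differs from the input whenever the input is not valid UTF-8.
fn lossy_round_trip(bytes: &[u8]) -> Vec<u8> {
    String::from_utf8_lossy(bytes).into_owned().into_bytes()
}
```

For valid UTF-8 such as `b"abc"` the round trip returns the same bytes. For `[0x61, 0xFF, 0x62]` the invalid byte `0xFF` comes back as the three bytes `EF BF BD`, so the original value cannot be recovered.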

cn-kali-team commented 2 years ago

With the previous format, I could easily convert the string form of a BLOB back to bytes without loss, but the current compact format may change, so I have to write more code to convert it back to bytes.

let body = "ssssssssssssssssssssss".to_blob();
print(body);
// previous format:
// [115, 115, 115, 115, 115, 115, 115, 115, 115, 115, 115, 115, 115, 115, 115, 115, 115, 115, 115, 115, 115, 115]
// current compact format:
// [7373737373737373 7373737373737373 737373737373]
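
As an illustration of why the previous format is easy to convert back, here is a minimal sketch in plain Rust (standard library only) that parses text such as `[115, 115, 115]` into bytes. The name `parse_array_format` is hypothetical; nothing like it is claimed to exist in Rhai:

```rust
/// Parses text such as "[115, 115, 115]" back into bytes.
/// Returns None unless the text is a bracketed, comma-separated
/// list of integers in the range 0..=255.
fn parse_array_format(text: &str) -> Option<Vec<u8>> {
    let inner = text.trim().strip_prefix('[')?.strip_suffix(']')?;
    if inner.trim().is_empty() {
        return Some(Vec::new()); // "[]" is an empty BLOB
    }
    inner
        .split(',')
        .map(|item| item.trim().parse::<u8>().ok())
        .collect() // the first failed item turns the whole result into None
}
```

Because each element is written out as a decimal number, the conversion is lossless in both directions, which is the property asked for above.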

How can I hook the to_string() method of the byte type? I want to restore the previous, non-compact format when converting bytes to strings. That way, I can better keep strings and bytes separate.

schungx commented 2 years ago

Not very sure what you want.

Which print format do you want?

Or you may give a more concrete example.

cn-kali-team commented 2 years ago

like this

// [115, 115, 115, 115, 115, 115, 115, 115, 115, 115, 115, 115, 115, 115, 115, 115, 115, 115, 115, 115, 115, 115]

This is the previous format, but it has now been changed to the compact one. I wonder if there is a way to get the previous format back through a hook, without modifying or patching the source code.

let body = "ssssssssssssssssssssss".to_blob();
print(body);
// [115, 115, 115, 115, 115, 115, 115, 115, 115, 115, 115, 115, 115, 115, 115, 115, 115, 115, 115, 115, 115, 115]
let body = `aaa ${body}`;
print(body);
// aaa [115, 115, 115, 115, 115, 115, 115, 115, 115, 115, 115, 115, 115, 115, 115, 115, 115, 115, 115, 115, 115, 115]
schungx commented 2 years ago

Ah, OK. Actually, I've changed the source code to give BLOBs special treatment, so in the future:

let body = `aaa ${body}`;
print(body);
// aaa sssssssssssssssssssssssss

To still get the compact version, call to_string() explicitly:

let body = `aaa ${body.to_string()}`;
print(body);
// aaa [7373737373737373 7373737373737373 737373737373]
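
For readers unfamiliar with the compact layout, the sketch below reproduces the output shown above in plain Rust: hexadecimal digits, with a space after every 8 bytes. This is only an assumption inferred from that single example, not Rhai's actual implementation, and `compact_hex` is a hypothetical name:

```rust
/// Formats bytes as lowercase hex, grouped 8 bytes at a time,
/// with groups separated by a space and the whole wrapped in brackets.
fn compact_hex(bytes: &[u8]) -> String {
    let groups: Vec<String> = bytes
        .chunks(8)
        .map(|chunk| chunk.iter().map(|b| format!("{:02x}", b)).collect::<String>())
        .collect();
    format!("[{}]", groups.join(" "))
}
```

Each `s` is byte 115, which is 0x73, so 22 of them yield two full groups of eight and one group of six.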

For the non-compact version, you need to change the BLOB into an array of integers. For example:

let x = [];
for ch in body { x += ch; }
let body = `aaa ${x}`;
print(body);
// aaa [115, 115, 115, 115, 115, 115, 115, 115, 115, 115, 115, 115, 115, 115, 115, 115, 115, 115, 115, 115, 115, 115]

In the future, you can do:

let body = `aaa ${body.to_array()}`;
print(body);
// aaa [115, 115, 115, 115, 115, 115, 115, 115, 115, 115, 115, 115, 115, 115, 115, 115, 115, 115, 115, 115, 115, 115]
cn-kali-team commented 2 years ago

Inside an interpolation expression, I don't know whether body is of byte type, so I can't tell whether the to_array() method needs to be called. Could an engine.set_blob_format() option be added, allowing developers to choose the format that is returned?

let mut engine = Engine::new();
engine.disable_symbol("eval");
engine.set_blob_format("array");

and

let body = `aaa ${body}`;
print(body);
// aaa [115, 115, 115, 115, 115, 115, 115, 115, 115, 115, 115, 115, 115, 115, 115, 115, 115, 115, 115, 115, 115, 115]
cn-kali-team commented 2 years ago

I've done it

#[export_fn]
pub fn blob_to_array_string(bytes: Blob) -> String {
    format!("{:?}", bytes)
}
#[export_fn]
pub fn string_to_bytes(string: &str) -> rhai::Blob {
    string.as_bytes().to_vec()
}

// Rust: override to_string for BLOBs and register to_blob for strings.
engine.register_fn("to_string", blob_to_array_string);
engine.register_fn("to_blob", string_to_bytes);

// Rhai script:
let body = "ssssssssssssssssssssss".to_blob();
let body = `aaa${body}`;
print(body);
// aaa[115, 115, 115, 115, 115, 115, 115, 115, 115, 115, 115, 115, 115, 115, 115, 115, 115, 115, 115, 115, 115, 115]

schungx commented 2 years ago

Yep, that's the way to do it. You can just "monkey patch" Rhai's built-in functions to suit your own needs. No need for new options.
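
The two helper bodies from the solution above can be checked without Rhai, since a Rhai Blob is a `Vec<u8>`. The sketch below mirrors them in plain Rust, taking a slice instead of a Blob so that it needs only the standard library; the engine registration itself is not exercised here:

```rust
/// Mirrors blob_to_array_string: Debug-formats the bytes as "[115, 115, ...]".
fn blob_to_array_string(bytes: &[u8]) -> String {
    format!("{:?}", bytes)
}

/// Mirrors string_to_bytes: copies the UTF-8 bytes of the string.
fn string_to_bytes(string: &str) -> Vec<u8> {
    string.as_bytes().to_vec()
}
```

With these, `blob_to_array_string(&string_to_bytes("ss"))` yields `[115, 115]`, which is the non-compact format that the patched to_string produces during interpolation.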