tomaka / hlua

Rust library to interface with Lua
MIT License

Passing non-UTF-8 strings between Lua and Rust is impossible #156

Closed (fstirlitz closed this issue 6 years ago)

fstirlitz commented 7 years ago

Lua strings are bytestrings, and the language doesn't prescribe any particular encoding for them (Lua 5.3 adds \u{} escapes which expand to UTF-8 code units, but this is a purely syntactic convenience). This binding, however, expects all strings to be UTF-8 and doesn't allow passing or receiving arbitrary byte sequences between Rust code and Lua code.

Furthermore, this imposition of arbitrary semantics was done in a particularly sloppy way: the code below will panic instead of just returning an Err(_) value (which happens for e.g. mismatched types).

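extern crate hlua;

fn main() {
    let mut lua = hlua::Lua::new();
    // In Lua, '\255' is a one-byte string holding 0xFF, which is not valid UTF-8.
    let result = lua.execute::<String>("return '\\255'");
    // Expected an Err(_) here; the released version panics instead.
    println!("{:?}", result);
}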
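
To make the failure mode concrete without depending on hlua, here is a minimal std-only sketch of what any binding faces when Lua hands it the byte 0xFF. The helper name `bytes_to_string` is made up for illustration and is not part of hlua.

```rust
use std::string::FromUtf8Error;

/// Hypothetical helper: checked conversion that fails if the bytes
/// are not well-formed UTF-8.
fn bytes_to_string(bytes: Vec<u8>) -> Result<String, FromUtf8Error> {
    String::from_utf8(bytes)
}

fn main() {
    // The Lua literal '\255' denotes a one-byte string containing 0xFF.
    let lua_bytes = vec![0xFFu8];

    match bytes_to_string(lua_bytes.clone()) {
        Ok(s) => println!("valid UTF-8: {:?}", s),
        // The error hands the original bytes back, so nothing is lost.
        Err(e) => println!("not UTF-8, bytes kept: {:?}", e.into_bytes()),
    }

    // A lossy conversion always succeeds, but it substitutes U+FFFD for
    // the invalid byte, so the original data cannot be recovered from it.
    let lossy = String::from_utf8_lossy(&lua_bytes);
    println!("lossy: {:?}", lossy);
}
```

A checked conversion of this kind is what lets a binding report an Err(_) rather than panic.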

tomaka commented 7 years ago

> Lua strings are bytestrings, and the language doesn't prescribe any particular encoding for them (Lua 5.3 adds \u{} escapes which expand to UTF-8 code units, but this is a purely syntactic convenience). This binding, however, expects all strings to be UTF-8 and doesn't allow passing or receiving arbitrary byte sequences between Rust code and Lua code.

There's an AnyLuaString struct that you can read if you want a non-UTF-8 string: https://github.com/tomaka/hlua/blob/f889c8af6e7f51279de9e80dc7a60a7114c8cf79/hlua/src/any.rs#L13

> Furthermore, this imposition of arbitrary semantics was done in a particularly sloppy way: the code below will panic instead of just returning an Err(_) value (which happens for e.g. mismatched types).

That's a bug.

fstirlitz commented 7 years ago

Actually that one's fixed too...

https://github.com/tomaka/hlua/commit/81fcd670c9547d6bcf544233e9bf6e59d4a2388f#diff-060c31e9b617d3cced77e1130e8c51d8R156

Well, my bad. In my defence, it wasn't released yet.

On the other hand, the code still truncates the string at the first embedded zero byte. It should have used slice::from_raw_parts instead of CStr. And I don't see the Push trait implemented for &[u8] or Vec<u8>, like it is for &str and String.
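
A std-only sketch of the truncation described above. It assumes a buffer laid out the way Lua's C API exposes a string through `lua_tolstring`: a pointer plus an explicit length, with a trailing NUL after the data. The function names `read_as_cstr` and `read_with_len` are hypothetical and are not taken from hlua.

```rust
use std::ffi::CStr;
use std::os::raw::c_char;
use std::slice;

/// Hypothetical reader that treats the buffer as a C string:
/// it stops at the first NUL byte it finds.
fn read_as_cstr(buf: &[u8]) -> Vec<u8> {
    assert_eq!(buf.last(), Some(&0), "buffer must be NUL-terminated");
    let ptr = buf.as_ptr() as *const c_char;
    // SAFETY: the buffer ends with NUL, so the scan stays in bounds.
    unsafe { CStr::from_ptr(ptr) }.to_bytes().to_vec()
}

/// Hypothetical reader that trusts the explicit length instead.
fn read_with_len(buf: &[u8], len: usize) -> Vec<u8> {
    assert!(len <= buf.len(), "length must not exceed the buffer");
    let ptr = buf.as_ptr();
    // SAFETY: `ptr` is valid for `len` bytes, as checked above.
    unsafe { slice::from_raw_parts(ptr, len) }.to_vec()
}

fn main() {
    // The 5-byte string "ab\0cd" followed by its trailing NUL.
    let buf: &[u8] = b"ab\0cd\0";
    let len = 5;

    println!("via CStr:           {:?}", read_as_cstr(buf));
    println!("via from_raw_parts: {:?}", read_with_len(buf, len));
}
```

The CStr path returns only the bytes before the embedded zero, while the length-based path returns all five bytes.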

tomaka commented 7 years ago

> And I don't see the Push trait implemented for &[u8] or Vec<u8>, like it is for &str and String.

That would push an array of integers and not a string. AnyLuaString exists precisely to avoid this ambiguity.
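
The ambiguity can be sketched with a toy model. Everything here is hypothetical: `LuaValue`, `ToLua` and `ByteString` are simplified stand-ins, not hlua's real `Push` trait or its `AnyLuaString` type. The point is only that a Vec<u8> has one obvious generic meaning (a sequence of integers), so a separate wrapper type is needed to say "these bytes are a string".

```rust
/// Hypothetical, simplified model of a value pushed to Lua.
#[derive(Debug, PartialEq)]
enum LuaValue {
    Str(Vec<u8>),
    Array(Vec<i64>),
}

/// Hypothetical stand-in for a push trait (not hlua's `Push`).
trait ToLua {
    fn to_lua(self) -> LuaValue;
}

/// Generic sequence rule: a vector of integers becomes a Lua array.
impl ToLua for Vec<u8> {
    fn to_lua(self) -> LuaValue {
        LuaValue::Array(self.into_iter().map(i64::from).collect())
    }
}

/// Newtype that marks the bytes as string data, in the role the
/// thread assigns to AnyLuaString.
struct ByteString(Vec<u8>);

impl ToLua for ByteString {
    fn to_lua(self) -> LuaValue {
        LuaValue::Str(self.0)
    }
}

fn main() {
    let bytes = vec![104u8, 105];
    // Same bytes, two different Lua values, chosen by the Rust type.
    println!("{:?}", bytes.clone().to_lua());
    println!("{:?}", ByteString(bytes).to_lua());
}
```

With the wrapper, the caller picks the interpretation explicitly at the type level instead of the binding guessing.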