mlua-rs / rlua

High level Lua bindings to Rust
Other
1.72k stars 115 forks source link

Segfault on clone of #[repr(C)] userdata subfield #253

Closed makspll closed 2 years ago

makspll commented 2 years ago

Hello, it's me again,

I think I've found a serious bug, and also managed to isolate it to a small example:

use rlua::{UserData,MetaMethod,Function,prelude::*};
use bevy::prelude::Quat;

#[derive(Debug,Clone)]
pub struct MyUserData {
    quat: Quat
}

impl UserData for MyUserData {
    fn add_methods<'lua, T: LuaUserDataMethods<'lua, Self>>(methods: &mut T) {
        methods.add_meta_method(MetaMethod::ToString, |_,s,()| {
            Ok(format!("{:?}",s))
        });

        methods.add_meta_method_mut(MetaMethod::NewIndex, |_,s,(_,val) : (String,MyUserData)|{
            s.quat = val.quat.clone();
            Ok(())
        })
    }
}

fn main() {
    let lua = Lua::new();
    lua.context(|lua_ctx| {
        let g = lua_ctx.globals();
        let f = lua_ctx.create_function(|_,(x,y,z,w):(f32,f32,f32,f32)| 
                Ok(MyUserData{quat:Quat::from_xyzw(x, y, z, w)
            })).unwrap();

        g.set("quat",f).unwrap();

        lua_ctx
            .load("
            function on_update(my_user_data)
                print(my_user_data)
                my_user_data.quat = quat(4,3,2,1)
                print(my_user_data)
            end
            ")
            .exec()
    }).unwrap();

    lua.context(|lua_ctx| {
        let g = lua_ctx.globals();
        let f : Function = g.get("on_update").unwrap();

        f.call::<MyUserData,()>(MyUserData{
            quat : Quat::from_xyzw(1.0,2.0,3.0,4.0),
        }).unwrap();
    })

}

Now this code relies on: rlua and glam (through a bevy re-export, Quat), I found this while removing unsafe from my code completely, and initially thought it was on my side. After realising there was 0 unsafe code and this segfault was still happening i tried isolating it and yep here it is.

Interestingly this happens identically for the Vec4 from glam as well. I think this might something to do with alignment and size of these structures. The segfault happens specifically in the clone call due to a null access. Interestingly though, if you print everything before the clone, all values look as normal without issue. @promethia-27 looked through the assembly/memory at that point and everything looks sane.

PROMETHIA-27 commented 2 years ago

I will note I'm not an expert in that area- I just couldn't find anything that looked wrong.

jugglerchris commented 2 years ago

Thanks for the report, and especially for the small example! I can reproduce it here. The fact that there's no unsafe in the example means the bug must be either in rlua or Quat - I'll look into it.

jugglerchris commented 2 years ago

Interestingly this happens identically for the Vec4 from glam as well. I think this might something to do with alignment and size of these structures.

This was spot on - it turns out that with the default configuration (LUAI_MAXALIGN) Lua only guarantees 8 byte alignment for userdata. The crash (at least on my x86-64 system) is because Quat::clone() is using the movaps instruction which assumes 16-byte alignment but the Quat is only 8 byte aligned.

I could try to increase LUAI_MAXALIGN, but that would only push the potential problem to the next size up - I think the correct fix will be to compensate in the Lua code where the correct alignment is known.

makspll commented 2 years ago

That's very interesting! I noticed there is a possibility to use a custom allocation function, would that be of any use here ?

Also while I have your attention, since the UserData's are stored directly in lua, is it safe to keep and use pointers to fields of UserData managed on the Lua side (say a UserData like MyUserData in this case, but assuming it's alive at that point)

jugglerchris commented 2 years ago

rlua already uses an allocation function (to track allocation totals), but this doesn't help as userdata is stored internally to Lua with a header which messes up the alignment in this case.

As far as pointers to fields of userdata, it would be difficult to avoid Rust undefined behaviour by e.g. accidentally having an &mut reference at the same time as another reference, for example when calling a method. I think the answer would be the same as if you wanted to keep references to something a different Rust library kept in a Box<T> which it kept ownership of.

However you can implement UserData on Arc<T>, and then you can legally keep another copy of the Arc outside (at the cost of the reference counting and extra indirection of course).

devzf commented 2 years ago

@makspll you should try mlua instead, it works fine on your example, no segfault.

jugglerchris commented 2 years ago

Hi, I believe #254 fixes this issue. I will merge it and kick off a release in the next few days if I don't hear otherwise. Thanks again for the report!

jugglerchris commented 2 years ago

I've released rlua 0.19.2 with this fix.

makspll commented 2 years ago

Sorry for late response, This seems to fix my issue! Fantastic work!