mlua-rs / rlua

High level Lua bindings to Rust
Other
1.72k stars 115 forks source link

Potential HUGE loss in performance #248

Closed Yuri6037 closed 9 months ago

Yuri6037 commented 2 years ago

I'm looking into using this library in a 3D engine project. I'm hitting a HUGE performance wall with this library that the C API would never get: all UserData values must be cloned in order to read them from Lua! This means that every time you map a function which takes this and another instance of itself you must allocate again even if the type is already allocated and managed by Lua.

It should be possible to accept references to user data types instead of cloning all the time.

jugglerchris commented 2 years ago

Hi, Can you give a small code example showing what you're doing?

Yuri6037 commented 2 years ago

I've since found a different way not using userdata which allows for more flexibility from Lua so I no longer have the code.

The code was however very similar to this one:

impl UserData for MyObject {
    fn add_methods<'lua, T: UserDataMethods<'lua, Self>>(methods: &mut T) {
        methods.add_method("somemethod", |_, this, other: &MyObject| {
            //...some code here...
           Ok(())
        });
    }
}

It's pretty clear the above code won't build without MyObject to be Clone and removing the reference but here comes the problem: what if MyObject is big, like say a texture memory wrapper? Consider, for example, a 4K RGBA 32bpp texture may be 4,096x4,096x4=67,108,864 bytes which is HUGE. Not only will this allocate MUCH more RAM but this will also increase CPU time required to do the deep copy which here is Theta(n) with n the number of bytes...

I've looked at the implementation of UserData FromLua and I think it is possible to implement it for std::cell::Ref<'lua, T> and std::cell::RefMut<'lua, T> which would be a LOT better than the current Clone at the possible expense that if Lua calls obj:somemethod(obj) it might fail with borrow errors.

jugglerchris commented 2 years ago

Ok, I see what you mean that you can't directly take a reference to a userdata, because FromLua is only implemented by value for concrete UserData types. You can do it by using AnyUserData, though:

    // Note: not Clone
    struct MyUserData(i64);

      impl UserData for MyUserData {
          fn add_methods<'lua, M: UserDataMethods<'lua, Self>>(methods: &mut M) {
              methods.add_method("addinner", |_, data, other: AnyUserData| {
                  let other = other.borrow::<MyUserData>().unwrap();
                  Ok(data.0 + other.0)
              });
          }
      }

      Lua::new().context(|lua| {
          let globals = lua.globals();
          globals.set("userdata1", MyUserData(7)).unwrap();
          globals.set("userdata2", MyUserData(3)).unwrap();
          assert_eq!(
              lua.load("userdata1:addinner(userdata2)")
                  .eval::<u64>()
                  .unwrap(),
              10
          );
      });
Yuri6037 commented 2 years ago

Ok, I see what you mean that you can't directly take a reference to a userdata, because FromLua is only implemented by value for concrete UserData types. You can do it by using AnyUserData, though:

    // Note: not Clone
    struct MyUserData(i64);

      impl UserData for MyUserData {
          fn add_methods<'lua, M: UserDataMethods<'lua, Self>>(methods: &mut M) {
              methods.add_method("addinner", |_, data, other: AnyUserData| {
                  let other = other.borrow::<MyUserData>().unwrap();
                  Ok(data.0 + other.0)
              });
          }
      }

      Lua::new().context(|lua| {
          let globals = lua.globals();
          globals.set("userdata1", MyUserData(7)).unwrap();
          globals.set("userdata2", MyUserData(3)).unwrap();
          assert_eq!(
              lua.load("userdata1:addinner(userdata2)")
                  .eval::<u64>()
                  .unwrap(),
              10
          );
      });

I see but that is not really convenient in order to handle errors properly without unwrap. I think taking references should be common enough to justify some wrapper with FromLua or even std::cell::Ref directly.

jugglerchris commented 2 years ago

I've looked at the implementation of UserData FromLua and I think it is possible to implement it for std::cell::Ref<'lua, T> and std::cell::RefMut<'lua, T> which would be a LOT better than the current Clone at the possible expense that if Lua calls obj:somemethod(obj) it might fail with borrow errors.

I agree, I think that would make it more convenient, at the cost of borrow errors (though only when using mut methods and/or RefMut). I don't think it's significantly worse than the errors you can get if FromLua fails for other types, e.g. if you pass an object of the wrong UserData.

Yuri6037 commented 2 years ago

I've looked at the implementation of UserData FromLua and I think it is possible to implement it for std::cell::Ref<'lua, T> and std::cell::RefMut<'lua, T> which would be a LOT better than the current Clone at the possible expense that if Lua calls obj:somemethod(obj) it might fail with borrow errors.

I agree, I think that would make it more convenient, at the cost of borrow errors (though only when using mut methods and/or RefMut). I don't think it's significantly worse than the errors you can get if FromLua fails for other types, e.g. if you pass an object of the wrong UserData.

That's exactly what I thought just wanted to point out about the possible borrow errors. Would you accept PRs to implement for std::cell::Ref and std::cell::RefMut?

That particular API might come in handy for one use of rlua as a texture preprocessor.

jugglerchris commented 2 years ago

Yes, I'd accept PRs to add those.

Yuri6037 commented 2 years ago

Well I might have missed something: it appears the current design of FromLua does not permit to return a Ref, because the AnyUserData owner would be dropped. Only design I can think without re-writing FromLua is to create a special pointer like wrapper which stores a raw pointer to the actual user data from the Lua stack (*RefCell<T>) then dereference + borrow it when needed. Unfortunately that means the use of an additional borrow function which could return errors but could only return borrow errors.

EDIT2: I may have a solution which still involves a bit of unsafe code but would allow checking about borrow first. The idea is as follows:

jugglerchris commented 2 years ago

I don't think you can assume the same userdata won't be borrowed again - there may be multiple references to it in the same Rust callback (foo(obj, obj, obj), or accessing global variables, etc.), and one of them may get a mutable reference. I'm pretty sure that accessing the userdata without an active Ref<> or RefMut<> would be undefined behaviour.

jugglerchris commented 2 years ago

I've looked at it some more, and I'm convinced that to get a reference to the T: UserData value in a callback, you need both a live AnyUserData<'lua> and a live Ref<T> derived from the AnyUserData, which means doing what I suggested at first (though obviously the .unwrap() was just for example - ? would be better, or more specific handling).

What you can do to make it more convenient is have a wrapper function which does the AnyUserData/borrow dance and calls the real function with the reference - which is what add_method does to give you the reference to the main object directly (see StaticUserDataMethods::box_method, which might not be too bad if you don't have a lot of method "shapes".

Yuri6037 commented 2 years ago

I don't think you can assume the same userdata won't be borrowed again - there may be multiple references to it in the same Rust callback (foo(obj, obj, obj), or accessing global variables, etc.), and one of them may get a mutable reference. I'm pretty sure that accessing the userdata without an active Ref<> or RefMut<> would be undefined behaviour.

Ah yeah I forgot the case with multiple references, well then that remains 1 option: re-implement RefCell to allow changing the internal state of the RefCell while keeping the data as raw pointers instead of Ref guards. That case is like making a private mini RefCell from UnsafeCell that could be done by replacing RefCell with UnsafeCell in the underlying userdata.

jugglerchris commented 2 years ago

I'm not sure I understand what you're proposing - the RefCell is there for a reason, and it's hard to imagine how you could come up with something with the same safety guarantees without having something like the same Ref types. Is it really that bad calling AnyUserData::borrow()?

Yuri6037 commented 2 years ago

I just thought of it again and I think we can provide a compromise: a wrapper type that allows adding type information to AnyUserData and allowing an API of the form myTypedUserData.borrow()? or myTypedUserData.borrow_mut()?. Such wrapper could be named UserData<T> or LazyUserData<T> (to emphasize the fact that the wrapper would not resolve the Ref or RefMut on creation). The idea is to derive FromLua and ToLua on a T-generic wrapper struct containing an AnyUserData and a PhantomData to provide later an easy borrow and borrow_mut avoiding the ugly explicit generic instantiation syntax. The two functions borrow and borrow_mut would be implemented to return respectively Result<Ref<T>> and Result<RefMut<T>>.

This solution should not involve any unsafe code.

Do you think it is reasonable to implement this solution in rlua?

jugglerchris commented 2 years ago

Hi, So do you mean that instead of:

methods.add_function("foo", |_, ud: UserData| {
    let foo = ud.borrow::<Foo>()?.blah();
    ...
}

it would be:

methods.add_function("foo", |_, ud: BorrowedUserData<Foo>| {
    let foo = ud.borrow()?.blah();
    ...
}

So the types would be checked before entering the callback, so the borrow is doing a RefMut borrow without needing an extra type check. I guess it could probably be implemented outside of rlua (it's a new type which could implement FromLua by wrapping AnyUserData), though could be slightly more efficient (though use some unsafe) within rlua. I'm not convinced it's much more convenient than the borrow::<T> turbofish, but I could be convinced that moving the userdata type checking earlier is useful. :-)

Yuri6037 commented 2 years ago

that moving the userdata type checking earlier is useful. :-)

That's exactly the point of this new wrapper type.

I guess it could probably be implemented outside of rlua.

I know that already, what I'm interested in is if this is reasonable to integrate in rlua.

jugglerchris commented 2 years ago

Yes, if it works well I'd be happy to add it to rlua.

khvzak commented 9 months ago

rlua is going to be deprecated soon and this issue is solved in mlua, see UserDataRef type.