lvgl / lv_binding_rust

LVGL bindings for Rust. A powerful and easy-to-use embedded GUI with many widgets, advanced visual effects (opacity, antialiasing, animations) and low memory requirements (16K RAM, 64K Flash).
MIT License
687 stars 71 forks source link

Dropping `Screen`s and associated memory leaks #111

Open nia-e opened 1 year ago

nia-e commented 1 year ago

Currently we provide no way to delete a Screen after it's been instantiated. Having the user manually call a deletion function also feels... un-rusty and generally unwise as it would be easy to forget. Also, making the screen be deleted on Drop carries the risk of e.g. a user calling get_scr_act() multiple times and double-deleting a screen -> a double free -> undefined behaviour. I'm honestly kind of stumped as to what to do here; we can't store some kind of is_active: bool field either as every call to get_scr_act() returns a different Screen and there's no way to dynamically check on drop if there are other Screens pointing to the same underlying object.

The other option is to have the Display store a copy of a pointer to the last Screen it returned and simply refuse to return multiple identical Screens thus allowing us to delete on drop, but this does make things slightly more inconvenient (and can be trivially circumvented by changing the Screen and then going back, unless we store a collection).

Also, how do we prevent accessing child objects of a Screen (or any object) and thus a use-after-free post-deletion?

nia-e commented 1 year ago

Thinking about it, we can also modify the API so as to require a reference to the Screen on widget/object creation (and not get just the active screen) and then store a PhantomData<&'a T> where T: NativeObject such that they will automatically be invalidated on drop.

cydergoth commented 1 year ago

This seems to be the case for any Widget created, unfortunately my understanding of Rust FFI isn't yet good enough to help with this :-(

cydergoth commented 1 year ago

Some thoughts on lifecycle management for LVGL resources in rust

The problem

LVGL is a C language libray which uses either the global malloc/free or a custom allocator. When an LVGL resource is created memory is allocated on the heap and a constant mutable pointer is returned (ignoring static styles for the purposes of this discussion).

Some LVGL resources hold pointers to other LVGL resource, e.g. a widget to the associated set of styles and containers to their children. In this case, the referenced memory must have a lifecycle equal to or greater than the containing resource.

When binding to LVGL as a rust FFI library, then the binding needs to provide appropriate support for the rust user to manage these resources. This includes creation, lifecycle and then destruction.

Errors to mitigate include:

Usage patterns for resources in LVGL

LVGL has a number of different ways in which resources are used.

Global lifespan

The most common usage pattern in an embedded systems context is to allocate LVGL displays, callbacks, screens, styles and widgets early in the startup sequence of the embedded application and never release them. This is sufficient for many embedded system UI requirements. In this scenario, styles may be statically allocated in read-only memory to save ram. For resources which require dynamic allocation, the lower level lvgl_sys:: API may be used.

Where this patten is usable, it does not require free-ing of certain LVGL resources. The common exception is animations. Pagination, tab or carousel type requirements may be satisfied by pre-allocated Fragments and the FragmentManager.

Dynamic lifespan

In this usage pattern, which may be required either to low ram necessitating more fine grained management of ram resource, or in scenarios where the UI layout and components may not be known at firmware build time, then LVGL resources including screens may be dynamically allocated and deallocated during the runtime of the application.

Mixed lifespan

This mode is a combination of the Global and Dynamic lifespans in which some resources exist for the entire lifetime of the application and some are created and destroyed dynamically.

Resource ownership

In LVGL, a widget is (with one exception) always parented by another Widget. Parentage is a one-to-many relationship, and when the parent widget is deallocated then the children are also deallocated in a cascade delete pattern.

For styles, there is a many-to-many relationship as a Widget may reference many styles for different scenarios, and a style may be referenced by many widgets. Fonts also have a many-to-many relationship, but are usually stored in read-only memory.

Images may be either dynamically allocated or stored in ROM.

Thread safety

AFAIK, LVGL is not thread safe. All mutation of LVGL resources should be done in critical regions which lock the entire LVGL domain.

Potential approaches

When binding an FFI resource to a rust library, it is common to wrap each external pointer in a corresponding rust struct. In the lv_binding_rust current implementation, this is done via the bindings rust library and via a code generator which uses proc macros to generate the rust structs and trait impls.

When a rust struct is associated with an FFI pointer, then the lifecycle of the rust struct must be tied to the lifecycle of the FFI pointer. Usually that means that the rust struct exposes a ::new() function which calls the FFI library to create the pointer, and a Drop::drop() implementation which frees the FFI function. As such, it is the rust struct lifetime which dominates the binding. When the rust struct is initialized so is the FFI pointer and when the lifetime of the rust struct expires due to it leaving scope, then the FFI pointer resource is freed.

Rust programmers should expect to be aware of the lifetime management rules of rust, and the borrow checker enforces many of them.

In the case of LVGL, there are two additional considerations. The concept of LVGL widgets being managed in a single parent multi-child tree, and the many-to-many relationship between widgets and styles.

The current implementation of Obj (the base class for a widget) is:

//! Native LVGL objects
//!
//! Objects are individual elements of a displayed surface, similar to widgets.
//! Specifically, an object can either be a widget or a screen. Screen objects
//! are special in that they do not have a parent object but do still implement
//! `NativeObject`.

use crate::lv_core::style::Style;
use crate::{Align, LvError, LvResult};
use core::fmt::{self, Debug};
use core::ptr;

/// Represents a native LVGL object. Note: This is a _trait_ not a struct! It does not _contain_ the pointer
pub trait NativeObject {
    /// Provide common way to access to the underlying native object pointer.
    fn raw(&self) -> LvResult<ptr::NonNull<lvgl_sys::lv_obj_t>>;
}

/// Generic LVGL object.
///
/// This is the parent object of all widget types. It stores the native LVGL raw pointer.
pub struct Obj {
    // We use a raw pointer here because we do not control this memory address, it is controlled
    // by LVGL's global state.
    raw: *mut lvgl_sys::lv_obj_t,
}

impl Debug for Obj {
    fn fmt(&self, f: &mut fmt::Formatter) -> fmt::Result {
        f.debug_struct("NativeObject")
            .field("raw", &"!! LVGL lv_obj_t ptr !!")
            .finish()
    }
}

// We need to manually impl methods on Obj since widget codegen is defined in
// terms of Obj
impl Obj {
    pub fn create(parent: &mut impl NativeObject) -> LvResult<Self> {
        unsafe {
            let ptr = lvgl_sys::lv_obj_create(parent.raw()?.as_mut());
            if ptr::NonNull::new(ptr).is_some() {
                //(*ptr).user_data = Box::new(UserDataObj::empty()).into_raw() as *mut _;
                Ok(Self { raw: ptr })
            } else {
                Err(LvError::InvalidReference)
            }
        }
    }

    pub fn new() -> crate::LvResult<Self> {
        let mut parent = crate::display::DefaultDisplay::get_scr_act()?;
        Self::create(&mut parent)
    }
}

impl NativeObject for Obj {
    fn raw(&self) -> LvResult<ptr::NonNull<lvgl_sys::lv_obj_t>> {
        if let Some(non_null_ptr) = ptr::NonNull::new(self.raw) {
            Ok(non_null_ptr)
        } else {
            Err(LvError::InvalidReference)
        }
    }
}

Note that:

In order to address the issues noted above, it is proposed that an explict destructor be implemented for LVGL resources. Implementing such a destructor requires some careful consideration of the issues of parental ownership and many-to-many references. In the case of the destructor being invoked then the resource on the C side will be released and the Obj will become a null stub. In the correct context, this means that the Obj will then be removed by the rust runtime.

Implementing destructor for Obj

It is proposed that the destructor be implemented as follows:

Side effects of this proposal include:

nia-e commented 1 year ago

Hey @cydergoth, thank you so much for the detailed writeup! It definitely echoes some concerns I've had also. I mostly agree that we need to tackle dropping somehow and it's been a big headscratcher for me; I considered something tangentially related in #107 (specifically, this) which vaguely correlates.

On Obj, I have tried the approach of holding it in a Box<_> instead of just the pointer, but we currently don't handle dropping children of dropped objects properly and it resulted in segfaults all over the place. However, I'm not entirely certain that wrapping everything in an Arc is a great idea simply for performance reasons; this actually sort of used to be how we did things (though only for examples, so it was up to the user to decide how and what to do). Keeping everything behind fat pointers and dynamically handling memory has performance and memory overhead, and LVGL is supposed to run on baremetal with minimal hardware. Reference counting and mutexes are not particularly taxing, but costs add up. Keeping a global mutex is probably less memory-intensive but might degrade actual performance by a lot.

I do like the idea of keeping a global list of live objects, though we are burdened by the fact that - as you pointed out - we don't have an allocator by default except for LVGL's (which has access to very little memory). I'm unsure thus if it's possible to implement it for all cases (i.e. without alloc).

I do think there's a happy compromise possible to allow dropping and also not tax resources or performance too much. Rust's lifetimes do allow us to automatically invalidate a lot of things; changing e.g. Obj to hold raw: Box<_>, parent: PhantomData<&'a Obj> which is recursively invalidated on parent deletion might get us most of the way towards dynamic dropping (and we can simply have users wrap objects they want to live globally as ManuallyDrop<Obj>).

This doesn't address threading though, so I am curious what can be done there (current approach being "nothing, just don't touch LVGL-space from multiple threads"). There's also the issue of "what do we do if something manifests an object pointer out of thin air" which can occur (see the animation PR I recently merged, where via witchcraft I need to turn a *mut lv_obj_t into a &mut Obj where the original Obj is then leaked - leaking is useless now, but I specifically did it that way to be resilient against the future possibility of implementing real dropping). Here a global pointer table could be useful, though maybe we could also just pretend any witchcraft-derived Obj is 'static or make it ManuallyDrop and never drop it since outside of special-case exemptions it can "only" happen if it's secretly a duplicate of an existing object; but then we must somehow ensure said object isn't dropped before this ghost one is, so we need to make sure these ghost objects only exist in their own very short scope.

Honestly I'm not sure what to do, but I worry about the cost involved in dynamic checking of anything. Though if you want to, I would be curious to see benchmarks.

cydergoth commented 1 year ago

So the performance of rendering should be mostly on the C side, which would be using all native pointers. The extra overheard of Arc should only come into play when you need to make a call via the rust domain, which should be mostly during UI creation, right? After that it would just be small updates, with LVGL doing all the Rasterizing and rendering.

The mutex would need to be locked during rendering, then any events would have to wait for rendering to complete, but it would stop one thread like an event thread trying to update the LVGL model whilst another thread is deleting that object 😀

If you take a look at my repo, you'll see it uses an unsafe impl Send and async to update the UI. I don't think I could do that without async, unless I move completely to a command model and have the render thread poll a command queue for mutation requests - which whilst possible seems a little clunky as async effectively does the poll

nia-e commented 1 year ago

Send was actually implemented for Obj until a few days ago when I intentionally removed it cause I realised it could be dangerous; maybe creating a distinct LvThreaded<T> where T: Widget and allowing users to work with those behind mutexes etc might work?

cydergoth commented 1 year ago

Do you think prototyping with Arc would be valuable to determine precisely what the tradeoffs are?

nia-e commented 1 year ago

Sure. Though if there is any noticeable impact on memory use or speed, we will probably need this as an explicit opt-in

cydergoth commented 1 year ago

Yes, it definitely needs a feature gate, as it will need alloc and maybe mutex too.

On Mon, Jun 19, 2023, 12:54 PM Nia @.***> wrote:

Sure. Though if there is any noticeable impact on memory use or speed, we will probably need this as an explicit opt-in

— Reply to this email directly, view it on GitHub https://github.com/lvgl/lv_binding_rust/issues/111#issuecomment-1597549134, or unsubscribe https://github.com/notifications/unsubscribe-auth/ABPWAWVGQOVYMRTYAGPYC5TXMCG5BANCNFSM6AAAAAAXYFHNEY . You are receiving this because you were mentioned.Message ID: @.***>

nia-e commented 1 year ago

Would you be interested in working on this with me? I can ask that you be added as a collaborator

cydergoth commented 1 year ago

I'm a little busy with my actual job rn, but I'm hoping to get a chance to look at the weekend

On Tue, Jun 20, 2023, 7:03 AM Nia @.***> wrote:

Would you be interested in working on this with me? I can ask that you be added as a collaborator

— Reply to this email directly, view it on GitHub https://github.com/lvgl/lv_binding_rust/issues/111#issuecomment-1598645105, or unsubscribe https://github.com/notifications/unsubscribe-auth/ABPWAWXUQJ55Y2OJ6TNKHWDXMGGSFANCNFSM6AAAAAAXYFHNEY . You are receiving this because you were mentioned.Message ID: @.***>

nia-e commented 1 year ago

That's fair. This probably won't be getting merged "soon", but I'll open a dedicated issue for threading and we can discuss it there.

nia-e commented 1 year ago

See #140. I'll get started on trying to figure things out, but it's probably going to be a while before it works. But in case you want to help out with code - @kisvegabor can you give @cydergoth write perms on branches? I can't manage access for people w/ my current perms.

kisvegabor commented 1 year ago

@nia-e I've added Admin rights for you. Now you should be able to invite people and change permissions.

nia-e commented 1 year ago

Thanks! :D