gtk-rs / gtk4-rs

Rust bindings of GTK 4
https://gtk-rs.org/gtk4-rs/
MIT License
1.75k stars 168 forks source link

[BUG] `EntryBufferImpl::length` and `EntryBufferImpl::text` must be consistent #566

Open mahkoh opened 2 years ago

mahkoh commented 2 years ago

These methods are currently allowed to return any value:

https://github.com/gtk-rs/gtk4-rs/blob/fb59005041d06bc275de3ff23a740395b6332c1e/gtk4/src/subclass/entry_buffer.rs#L18-L26

But GTK requires the value returned by gtk_entry_buffer_get_text to be valid for gtk_entry_buffer_get_length bytes:

https://github.com/GNOME/gtk/blob/8550a04bf42f5e1d8af239ab953c3ff1235dc149/gtk/gtktext.c#L2140-L2155

sdroege commented 2 years ago

See https://github.com/gtk-rs/gtk4-rs/issues/567#issuecomment-909956836 for a possible way to fix this implementation.

mahkoh commented 2 years ago

In general, there is an understanding among C users that if a function call returns a pointer to an internal buffer, then that pointer will remain valid until the next call to a mutating function. In particular, the function call itself is not a mutating call and thus calling it repeatedly without an intermediate mutating call must leave all of the returned pointers valid.

I see no way to achieve without making all Impl traits unsafe. One might first assume that only making traits such as EntryBufferImpl unsafe would be enough. However, C might call into any other non-mutating function of any other Impl implementation which could hold on to a reference to the EntryBuffer and call a mutating function on it.

Hence, all Impl traits should be unsafe and contain notes about the rules they must adhere to. In particular for EntryBuffer:

/// ...
///
/// # Safety
///
/// Describe here how long the pointer returned to by `text` must be valid and the relationship with `length`
pub unsafe trait EntryBufferImpl: EntryBufferImplExt + ObjectImpl {
    /// This function returns a pointer now, the trampoline passes this pointer on.
    fn text(&self, entry_buffer: &Self::Type) -> *const u8 {
        self.parent_text(entry_buffer)
    }

    // no change for the other functions
}

Note that the documentation of gtk_entry_buffer_get_text requires this function to return the same pointer every time until the entry buffer emits a signal: https://docs.gtk.org/gtk4/method.EntryBuffer.get_text.html

sdroege commented 2 years ago

Hence, all Impl traits should be unsafe

It's only a problem for anything involving transfer none getters, i.e. mostly a problem for GTK because it's designed around this. Making all impl traits unsafe just because of a bad design in GTK is not going to happen :) Especially because GTK bindings can never actually be 100% safe and sound due to its design.

However, C might call into any other non-mutating function of any other Impl implementation which could hold on to a reference to the EntryBuffer and call a mutating function on it.

If you unsafe impl EntryBuffer, then all your other implementation needs to adhere to this too, yes. That doesn't mean that the other impl traits need to be unsafe if they're OK by themselves.


For EntryBuffer, someone will have to figure out a way to properly expose safe bindings for it if possible, or make it an unsafe trait with raw pointers (which is probably the best solution here anyway to avoid all the copying...), or just remove it.

mahkoh commented 2 years ago

Making all impl traits unsafe just because of a bad design in GTK is not going to happen :)

Personally I'm fine with this since I use Rust for its other qualities not because I expect 100% safety. However, I believe that such things should be noted prominently in the documentation because there might be an expectation among the wider rust community that things are 100% safe unless unsafe is used.

If you unsafe impl EntryBuffer, then all your other implementation needs to adhere to this too, yes.

That is incorrect. Other impls interact with EntryBufferExt and not EntryBufferImpl. The rules of EntryBufferImpl do not concern them and the function EntryBufferExt::insert_text is safe. It is not required for you to unsafe impl EntryBufferImpl to cause undefined behavior here. Calling EntryBufferExt::insert_text on any entry buffer (not necessarily subclassed in rust) is undefined behavior if it happens from within a non-mutating overridden virtual gtk method.

sdroege commented 2 years ago

That is incorrect. Other impls interact with EntryBufferExt and not EntryBufferImpl. The rules of EntryBufferImpl do not concern them and the function EntryBufferExt::insert_text is safe. It is not required for you to unsafe impl EntryBufferImpl to cause undefined behavior here. Calling EntryBufferExt::insert_text on any entry buffer (not necessarily subclassed in rust) is undefined behavior if it happens from within a non-mutating overridden virtual gtk method.

Can you give an example of that? That sounds like the whole entry buffer API is a safety hazard and contaminating completely unrelated code.

sdroege commented 2 years ago

Also note that I know basically nothing about GTK and I'm not really using it :) My main interest is in other parts of the stack.

mahkoh commented 2 years ago

Consider the following C function

char f(GtkEntryBuffer *buffer, GtkWidget *widget) {
    char *text = gtk_entry_buffer_get_text(buffer);
    int len = gtk_entry_buffer_get_length(buffer);
    gtk_widget_measure(widget, GTK_ORIENTATION_HORIZONTAL, -1, 0, 0, 0, 0);
    if (len) {
        return text[0];
    }
    return 0;
}

And the following rust code:

use imp::WiImp;
use gtk4::subclass::prelude::*;
use gtk4::prelude::*;
use glib::{Value, Object};
use gtk4::{Orientation, EntryBuffer, Widget};
use glib::translate::ToGlibPtr;

mod imp {
    use std::cell::RefCell;
    use gtk4::EntryBuffer;

    #[derive(Default)]
    pub struct WiImp {
        pub eb: RefCell<Option<EntryBuffer>>,
    }
}

#[gtk4::glib::object_subclass]
impl ObjectSubclass for WiImp {
    const NAME: &'static str = "WiImp";
    type Type = Wi;
    type ParentType = gtk4::Widget;
    type Interfaces = ();
}

glib::wrapper! {
    pub struct Wi(ObjectSubclass<WiImp>) @extends gtk4::Widget;
}

impl ObjectImpl for WiImp { }

impl WidgetImpl for WiImp {
    fn measure(&self, widget: &Self::Type, orientation: Orientation, for_size: i32) -> (i32, i32, i32, i32) {
        let mut eb = self.eb.borrow_mut();
        if let Some(eb) = &mut *eb {
            eb.insert_text(0, &"a".repeat(4096));
        }
        self.parent_measure(widget, orientation, for_size)
    }
}

extern {
    // does not modify buffer or widget, transfer none
    fn f(buffer: *mut gtk4::ffi::GtkEntryBuffer, widget: *mut gtk4::ffi::GtkWidget) -> u8;
}

fn main() {
    gtk4::init().unwrap();

    let eb: EntryBuffer = Object::new(&[]).unwrap();
    let wi: Wi = Object::new(&[]).unwrap();
    {
        let imp = WiImp::from_instance(&wi);
        let mut inner = imp.eb.borrow_mut();
        *inner = Some(eb.clone());
    }
    unsafe {
        f(eb.to_glib_none().0, wi.unsafe_cast_ref::<Widget>().to_glib_none().0);
    }
}

Of course the same principle applies to many objects, not just entry buffer.

sdroege commented 2 years ago

Thanks for the example. So yes, this applies to all the objects that have transfer none return values that must stay valid across further function calls. I don't think that can be prevented completely, and this is also exactly the reason why transfer none return values in the bindings around C functions are immediately taking a copy of the return value.

Not sure what we can do about that here other than documenting it and making it less likely to accidentally break in entry buffer subclasses in Rust.

sdroege commented 2 years ago

So this actually also means that there should be a separate issue for it not being just broken in the Impl trait but generally a problem with subclassing in GTK. In that issue we could collect all known cases at least...