meh / rust-xcb-util

Rust bindings and wrappers for XCB utility functions.
MIT License
27 stars 17 forks source link

`GetTextPropertyReply::name` incorrectly assumes data is always UTF-8 #21

Open athre0z opened 2 years ago

athre0z commented 2 years ago

When using wezterm and (accidentally) pasting with non-UTF8 data in my clipboard (e.g. a screenshot), the terminal would crash in std code due to a String that contains non-UTF8 data. I traced it back to this library, and particularly the following code:

https://github.com/meh/rust-xcb-util/blob/ad1f5089fdee4b87d38f4ba556b2df654c35573c/src/icccm.rs#L61-L62

It appears like the function incorrectly assumes that the payload is UTF-8 formatted when that is not necessarily true. My knowledge around X11 is extremely limited, and it's possible that wezterm is simply using the library incorrectly. However, it looks to me like the object is created via the get_text_property function, without any unsafe function calls involved. This makes me think that it's an unsoundness within this library.

Usage in wezterm:

https://github.com/wez/wezterm/blob/0a7e30e9bcc4dbc78c8611dc46e77ef7faa98b09/window/src/os/x11/window.rs#L587-L604

CC @wez

wez commented 2 years ago

It is possible for the application to verify that the target is UTF-8, so I've pushed that to wezterm.

I think it would be nice to expand the get_text_property interface to expose both the encoding of the text and the raw bytes to allow the application to "do something" to convert non-utf8 text if it wishes.

If there's resistance to changing this interface (eg: for backwards compat), I think the implementation should not use from_utf8_unchecked because it cannot prove that the data is UTF-8. It should probably use from_utf8().expect("data to be UTF-8") or something similar.

shinmao commented 11 months ago

Hi @wez @athre0z , does this issue also could happen in util::utf8::into()? https://github.com/meh/rust-xcb-util/blob/39d91e435fe75010f59380dd31f110b0d714574a/src/util.rs#L288-L291