tox-rs / rstox

Rust bindings to c-toxcore
GNU General Public License v3.0
17 stars 7 forks source link

all get_<string> functions are broken #39

Closed JFreegman closed 3 years ago

JFreegman commented 3 years ago

e.g.

let mut name: Vec<u8> = Vec::with_capacity(size);

tox_try!(err, ll::tox_conference_peer_get_name(
    self.raw,
    conference_number,
    peer_number,
    name.as_mut_ptr(),
    err.as_mut_ptr()
));
Ok(String::from_utf8_unchecked(name))

The resulting string is always empty. I've confirmed that the toxcore functions succeed, but for whatever reason the result never gets copied to the vector.

My proposed fix would be to change all such instances to:

let mut name = MaybeUninit::<[u8; MAX_NAME_LENGTH]>::uninit();

tox_try!(err, ll::tox_conference_peer_get_name(
    self.raw,
    conference_number,
    peer_number,
    name.as_mut_ptr() as *mut u8,
    err.as_mut_ptr()
));

Ok(String::from_utf8_lossy(&mut name.assume_init()[..size].to_string()))

However, this looks ugly to me. I'm not sure if it's safe or the proper solution.

kpp commented 3 years ago

I've confirmed that the toxcore functions succeed, but for whatever reason the result never gets copied to the vector.

You have to call an unsafe https://doc.rust-lang.org/stable/std/vec/struct.Vec.html#method.set_len with the resulting size. I am not quite sure where to get it from the provided code.

I'm not sure if it's safe or the proper solution.

&mut name.assume_init() is UB because not all of the elements would be initialized. You can use https://doc.rust-lang.org/stable/std/slice/fn.from_raw_parts.html or take a look at https://docs.rs/bstr/0.2.13/bstr/

kpp commented 3 years ago

I can smell UB in https://github.com/tox-rs/rstox/blob/2e3dc013ce324dd1d5b01a08c188e7dbee02abed/src/core.rs#L1122 =)

suhr commented 3 years ago

Isn't this an UB only in the case when name is not an UTF-8 string?

kpp commented 3 years ago

Well, name was filled with bytes but its len was not updated. So it will always be an empty string.


However if a name is changed between tox_conference_peer_get_name_size and tox_conference_peer_get_name, the name cannot be placed in the buffer so we will get an UB.

JFreegman commented 3 years ago

&mut name.assume_init() is UB because not all of the elements would be initialized.

Is it UB even when we never access the uninitialized elements after calling assume_init()? If so, how would using from_raw_parts() fix this, since you need to call assume_init() to create the slice?

I don't see the difference between:

let slice = from_raw_parts(name.assume_init().as_mut_ptr(), size);
Ok(String::from_utf8_lossy(slice).to_string())

and

Ok(String::from_utf8_lossy(name.assume_init()[..size]).to_string())

Edit: Upon further research, it seems that it is safe to call assume_init() on a partially uninitialized byte array because a u8 cannot have an invalid state.

kpp commented 3 years ago

I'd rather consult with Ralf Jung because Miri does not catch this kind of UB at the moment.

JFreegman commented 3 years ago

There seems to be an open discussion about this issue: https://github.com/rust-lang/unsafe-code-guidelines/issues/71

I'll leave my PR up, merge or not is up to you. If anyone proposes a more well-defined solution we can go with that.

kpp commented 3 years ago

There is a 100% safe way: let mut name = [0u8; MAX_NAME_LENGTH];

JFreegman commented 3 years ago

There is a 100% safe way: let mut name: [u8; MAX_NAME_LENGTH] = [0; MAX_NAME_LENGTH];

Ok we'll go with that, PR updated

kpp commented 3 years ago

Fixed in https://github.com/tox-rs/rstox/pull/40