ruuda / claxon

A FLAC decoder in Rust
Apache License 2.0
287 stars 28 forks source link

[wishlist] Drop the unsafe ensure_buffer_len() #13

Closed Shnatsel closed 6 years ago

Shnatsel commented 6 years ago

It would be nice to refactor ensure_buffer_len() into safe code so that bugs such as #10 are statically ruled out by the compiler, without sacrificing performance.

This is interesting from the API design and best practices perspective. Writing data into a preallocated buffer is a very common thing to do in systems programming, so the compiler and stdlib should support a safe and idiomatic way to do it. This may be a good case study for unsafe code guidelines WG and/or the emerging security working group.

Some ideas to get the ball rolling:

  1. Use buffer initialization such as vec![0, len] that the stdlib optimizes into as cheap memory zeroing as possible. Sadly this is still going to involve some overhead, unless we convince the optimizer that the entire buffer is always overwritten.
    1. Instead of setting the buffer length in advance, preallocate a vector and use the Write trait or extend_from_slice() to safely write into it without zeroing memory.
ruuda commented 6 years ago

Good points. Some thoughts:

Thinking out loud here: perhaps it would be possible to have a kind of must-write slice that must be written entirely, which can be split up with methods like split_at, and which must be filled with a method that takes a closure to produce the value, so all values elements are filled. The tricky part for Claxon is that after filling it we do want a mutable slice (to add predictions to the residuals). A method fill(self, F: FnMut() -> T) -> &mut [T] would work for one slice, but if these were split from a bigger must-write slice, then we can no longer convert the mutable slices back into one mutable slice ... I suspect it might be possible with more closures, analogous to the scoped threadpool API, but that could become hairy.

Also, while zeroing is a valuable defence against the possibility of exposing uninitialized memory in the presence of bugs, it does not prevent the real bug, which was not overwriting the entire buffer. A solution that can do the latter would be nice.

Shnatsel commented 6 years ago

Is there a particular reason why you went with slicing the buffer and passing the slices to decoding functions instead of passing the buffer to decoding functions and using buffer.extend() or similar? Extend trait for reference

Shnatsel commented 6 years ago

I've rewritten passing around a slice of uninitalized memory into appending to a Vec. This has killed off some fiddly math in the process. The result can be found here: https://github.com/Shnatsel/claxon/tree/safe-and-sliceless

There's just one problem: it fails some decoding tests, and I'm not sure how to isolate the problem because decode_* functions do not have unit tests for them.

ruuda commented 6 years ago

Closed by https://github.com/ruuda/claxon/commit/fec24678c7086aa4b2528bd7542f31d978b81b90.