Make unsafe APIs more narrow in scope

vfx-rs / openexr-rs

BSD 3-Clause "New" or "Revised" License

10 stars 0 forks source link

Make unsafe APIs more narrow in scope #1

Open cessen opened 2 years ago

cessen commented 2 years ago

Hello! First: thanks for all the work you guys have put into building this new version of the Rust bindings to OpenEXR. I maintained the previous set of bindings along with Benjamin Saunders, so I know how much work it can be, and how hard it can be to get right!

Anyway, browsing through the documentation, I was confused that Slice's constructors weren't marked as unsafe, since they take raw pointers which can't be bounds checked.

Of course, it's perfectly fine to create and store raw pointers--doing so isn't unsafe. It isn't until you operate on them that it actually becomes unsafe. So after a bit of thought, it became clear that, indeed, the Slice constructors aren't unsafe in isolation. This crate instead makes the choice to mark e.g. write_pixels() as unsafe, since that's where the Slice is actually operated on.

While this is perfectly fine from a technical standpoint, from an API standpoint it doesn't feel great. If something goes wrong (e.g. a segfault) when calling write_pixels(), then from the standpoint of someone who isn't familiar with the crate's implementation, almost anything done up to that point could be the source of the problem. Any of the things passed to write_pixels() could be bogus in some way. In other words, it makes the possible sources of unsafety very broad and non-specific.

We can make the unsafety in the API more narrow (in some sense) by marking write_pixels() as safe, and instead marking as unsafe any places where (unsafely) bogus data can be constructed. For example, the Slice constructors that take raw pointers.

This is similar to the approach taken by e.g. str slices in Rust. Instead of marking every method that operates on str slices unsafe, they just mark functions that have the potential to construct bogus str slices as unsafe. This is a much narrower surface area in the API, and makes it far more specific where something may have gone wrong.

Granted, Slice will need to be changed a bit to accommodate this. Specifically, it currently lacks a lifetime parameter. But it should probably have a lifetime parameter anyway: it does, after all, represent a reference to data.

This would also open up the possibility of actually safe Slice constructors, that simply take a (Rust) slice to the data rather than a raw pointer, and can properly validate things. This would provide a fully safe API through the entire OpenEXR pipeline, and would just generally make the APIs harder to accidentally misuse.

scott-wilson commented 2 years ago

Hey Nathan,

Thanks for letting us have the OpenEXR crate, and bringing up this issue.

I could be wrong, but the APIs you brought up are something we're looking into marking as unsafe. We're thinking of allowing certain APIs in the bindings we write be marked as unsafe, and to have a safe layer written in top at a later date (discussion is here: https://github.com/vfx-rs/organization/issues/9)

I am not sure which API we are planning on pulling out and mark unsafe instead, but it could be this one. Anders would know more.

On Thu, Oct 14, 2021, 12:35 PM Nathan Vegdahl @.***> wrote:

Hello! First: thanks for all the work you guys have put into building this new version of the Rust bindings to OpenEXR. I maintained the previous set of bindings along with Benjamin Saunders, so I know how much work it can be, and how hard it can be to get right!

Anyway, browsing through the documentation, I was confused that Slice https://docs.rs/openexr/0.10.1/openexr/core/frame_buffer/struct.Slice.html's constructors weren't marked as unsafe, since they take raw pointers which can't be bounds checked.

Of course, it's perfectly fine to create and store raw pointers--doing so isn't unsafe. It isn't until you operate on them that it actually becomes unsafe. So after a bit of thought, it became clear that, indeed, the Slice constructors aren't unsafe in isolation. This crate instead makes the choice to mark e.g. write_pixels() as unsafe, since that's where the Slice is actually operated on.

While this is perfectly fine from a technical standpoint, from an API standpoint it doesn't feel great. If something goes wrong (e.g. a segfault) when calling write_pixels(), then from the standpoint of someone who isn't familiar with the crate's implementation, almost anything done up to that point could be the source of the problem. Any of the things passed to write_pixels() could be bogus in some way. In other words, it makes the possible sources of unsafety very broad and non-specific.

We can make the unsafety in the API more narrow (in some sense) by marking write_pixels() as safe, and instead marking as unsafe any places where (unsafely) bogus data can be constructed. For example, the Slice constructors that take raw pointers.

This is similar to the approach taken by e.g. str slices in Rust. Instead of marking every method that operates on str slices unsafe, they just mark functions that have the potential to construct bogus str slices as unsafe. This is a much narrower surface area in the API, and makes it far more specific where something may have gone wrong.

Granted, Slice will need to be changed a bit to accommodate this. Specifically, it currently lacks a lifetime parameter. But it should probably have a lifetime parameter anyway: it does, after all, represent a reference to data.

This would also open up the possibility of actually safe Slice constructors, that simply take a (Rust) slice to the data rather than a raw pointer, and can properly validate things. This would provide a fully safe API through the entire OpenEXR pipeline, and would just generally make the APIs harder to accidentally misuse.

— You are receiving this because you are subscribed to this thread. Reply to this email directly, view it on GitHub https://github.com/vfx-rs/openexr-rs/issues/1, or unsubscribe https://github.com/notifications/unsubscribe-auth/AAVFQOCV2SC6JPD7UCQWSG3UG4WIZANCNFSM5GAM45IQ . Triage notifications on the go with GitHub Mobile for iOS https://apps.apple.com/app/apple-store/id1477376905?ct=notification-email&mt=8&pt=524675 or Android https://play.google.com/store/apps/details?id=com.github.android&referrer=utm_campaign%3Dnotification-email%26utm_medium%3Demail%26utm_source%3Dgithub.

anderslanglands commented 2 years ago

Hi Nathan, thanks for your feedback! As Scott said, I'm actually doing exactly this right now. If you've got any other thoughts we'd love to hear them given your previous experience.

I'm taking the approach of marking only the functions that can actually cause memory issues as unsafe, which in this case is basically read/writre_pixels and friends.

I did toy around a bit with adding lifetimes to Slice, but in the end decided against it as you quickly end up in borrow checker hell if you actually want to do anything with the memory the Slice is referencing.

I started working on a different safe API called "Frame" which you can see here https://github.com/vfx-rs/openexr-rs/blob/0a0590b99841b9e698eb1faa07df3cc91c34bb98/src/core/frame_buffer.rs#L560

This takes a slightly different approach of taking ownership of the memory that will be read to/written and wrapping up the multi-stage Slice stuff in a typestate so that the API can be completely safe. This works pretty well for flat images, but is a bit more troublesome for deeps because of the dependency between the sampleCount Slice and the others. It's also obviously a little more limited than the general Slice API because you have to transfer ownership of the memory to the FramBuffer and then back again, which depending on what you're doing may or may not be a problem, which is why I've left the original unsafe API there too.

Longer term we're hoping to help upstream rework the C++ FrameBuffer API a bit to be safer for all (it seems everyone involved realizes it's not the best design as it currently stands).

cessen commented 2 years ago

I started working on a different safe API called "Frame" which you can see here

Ah! That's great! I didn't notice that on my first pass through the documentation. That's what I get for skimming, ha ha.

It would still be nice to have a safe API that doesn't require openexr to actually allocate and own the image data itself. But for a lot of use cases that's probably fine.

which is why I've left the original unsafe API there too

Oh yeah, for sure. I'm definitely not advocating to remove the unsafe versions of the APIs--they're really useful. We had a similar distinction in our previous wrapper: the "happy path" completely safe APIs, and the "full flexibility" unsafe APIs. I think that's a great approach.

I did toy around a bit with adding lifetimes to Slice, but in the end decided against it as you quickly end up in borrow checker hell if you actually want to do anything with the memory the Slice is referencing.

Ah, yeah, that's fair. The way we dealt with that in the old bindings was that we didn't expose the slice concept at all, and just directly used FrameBuffer. For example, in the Examples section here you can see this bit of code:

let mut fb = FrameBuffer::new(256, 256);
fb.insert_channels(&["R", "G", "B"], &pixel_data);

The insert_channels method internally builds the slices from the RGB pixel_data, so the client code never has to even care about them. The FrameBuffer then holds a reference to pixel_data, so the borrow checker will block certain things, but in general the FrameBuffer is going to be discarded after reading/writing the file anyway, so it doesn't end up being a problem.

I'm not sure if that would be a good fit for the new bindings, which feel like a thinner layer over the C++ API than the old bindings. But maybe there's some inspiration you can take from that. Here's the old FrameBuffer API docs if you want to take a peek.

anderslanglands commented 2 years ago

Thanks! Honestly I was also thinking to just completely internalize FrameBuffer to the read/write methods. Covers 99% of use cases, is simple and can be completely safe...

cessen commented 2 years ago

Do you mean instead of eliminating Slice you eliminate FrameBuffer from the happy-path API?

If we make Slice a little smarter (rather than just being a thin wrapper over the C++ type) I could see that working. Specifically, I'm thinking of having Slice itself know how many channels it's pointing to in interleaved data. Then perhaps read/write calls could look something like this:

exr_file.write(&[
    (&["R", "G", "B"], Slice::from_interleaved_f32(3, &pixel_data)),
    (&["Z"], Slice::from_f32(&z_data)),
]);

other_exr_file.read(&[
    (&["R", "G", "B"], SliceMut::from_interleaved_f32(3, &mut other_pixel_data)),
    (&["Z"], SliceMut::from_f32(&mut other_z_data)),
]);

Or maybe Slice could be even smarter, and hold references to the channel names as well:

exr_file.write(&[
    Slice::from_interleaved_f32(&["R", "G", "B"], &pixel_data),
    Slice::from_f32("Z", &z_data),
]);

In both cases, we also then have the benefit of eliminating heap allocation due to the FrameBuffer. (At least in theory. In practice we'd still be using the C++ FrameBuffer internally, which would heap-allocate. But at least it wouldn't be mandatory on the Rust side.) And the Slice types themselves are still Copy (they don't own any data, they just contain references).

It would also make it easier to give Slice a lifetime (and thus making the read/write calls fully safe) since in the common case it's constructed just-in-time to pass to the read/write calls. And in cases where you need to work around the borrow checker, unsafe Slice constructors could be used to build Slices with 'static lifetimes when necessary.

My gut feeling is that basically what we want is to somehow flatten the whole FrameBuffer/Slice API into just one type. In the old openexr-rs bindings I tried to do that by getting rid of the Slice type. But I think you're right, that getting rid of the FrameBuffer type instead actually makes a lot more sense. And I don't think(?) we even lose any API flexibility that way. It also seems like the only unsafe escape hatches needed for full flexibility with this approach would be unsafe Slice constructors...?

(I could, of course, be missing something here.)