delta-io / delta-kernel-rs

A native Delta implementation for integration with any query engine
Apache License 2.0
144 stars 41 forks source link

inital take on some miri support #470

Closed nicklan closed 4 days ago

nicklan commented 1 week ago

What changes are proposed in this pull request?

This PR adds a test where we start to use miri to make sure we're not doing anything crazy in our FFI. Initially it just tests a couple of things, which already uncovered an issue (see below). We need to add lots more unit tests for the various ffi functions to get miri to validate what we're doing.

This uncovered that the way we were getting pointers in handles was actually unsound. Previously we would do something like:

 let val = ManuallyDrop::new(Box::new(val));
 val.as_ref().into()

To get the pointer. This is fine, but note that we are converting from &T into a NonNull<T> here, which is supported in the trait, but marks the memory as SharedReadOnly because we don't own the reference.

But to free the memory we use into_inner() which wants to do:

*Box::from_raw(ptr)

This takes Unique ownership of the pointer, which is not valid because we only created it in SharedReadOnly.

This PR switches to use Box::leak to get pointers, which constructs them with the correct memory tag. It's possible there are cases where we DO want the memory tagged as SharedReadOnly, but in that case I think we'd have to keep a reference around in rust to ever be able to free things, since experimenting indicates that all the from_raw calls (Arc, Rc, Box) want the memory to be at least SharedReadWrite.

Box in particular wants Unique. I need to do some more research about if Arc/Rc::into_raw produce a thin pointer or not, as they place less stringent requirements in their from_raw.

For now, to minimize change, I'm sticking with Box.

How was this change tested?

Added some unit tests

codecov[bot] commented 1 week ago

Codecov Report

Attention: Patch coverage is 86.66667% with 6 lines in your changes missing coverage. Please review.

Project coverage is 79.56%. Comparing base (5a114ef) to head (859d57b). Report is 1 commits behind head on main.

Files with missing lines Patch % Lines
ffi/src/lib.rs 84.21% 6 Missing :warning:
Additional details and impacted files ```diff @@ Coverage Diff @@ ## main #470 +/- ## ========================================== + Coverage 78.95% 79.56% +0.61% ========================================== Files 56 56 Lines 12216 12253 +37 Branches 12216 12253 +37 ========================================== + Hits 9645 9749 +104 + Misses 2052 1978 -74 - Partials 519 526 +7 ```

:umbrella: View full report in Codecov by Sentry.
:loudspeaker: Have feedback on the report? Share it here.

zachschuermann commented 1 week ago

woah very awesome! excited for this!!