TheDan64 / inkwell

It's a New Kind of Wrapper for Exposing LLVM (Safely)
https://thedan64.github.io/inkwell/
Apache License 2.0
2.37k stars 229 forks source link

Safe function wrapper #56

Closed RReverser closed 6 years ago

RReverser commented 6 years ago

I know PR #36 has been merged long ago, but I've been looking to solve problem with safety of dynamic functions similar to one described in #5 and decided to see what Inkwell does here.

While the Symbol trick is nice, unfortunately I don't think it actually helps with safety issue of outliving backing storage at all - it's still way too easy to accidentally get and store a raw function instead of a wrapper Symbol by accidentally putting * in front of the call like

let func = {
  // ...
  let execution_engine = /* ... */;
  // ...
  *execution_engine.get_function::<MyFunc>("sum").unwrap()
};

Despite the wrapper and private traits like private::Sealed, this works without any warnings / errors because Rust performs Deref coercion (which is implemented on Symbol), which, in turn returns one of function types , which Rust believes to be safely Copyable, and so lifetime information is immediately lost.

RReverser commented 6 years ago

Instead of allowing Deref, a safer approach might be to provide a call method directly on Symbol which would accept arguments and call the target function. Assuming same calling convention, it will be zero-cost, but will allow to avoid this hazard at the same time.

TheDan64 commented 6 years ago

Thanks for opening an issue & PR! I should hopefully have time to take a look later today. In the mean time, could you please base your PR against master? Thanks!

RReverser commented 6 years ago

In theory I could, but that would mean I need to get and potentially compile LLVM 7, which is likely not going to happen very soon as I'm already running out of disk space with various versions of LLVM, V8 and Rust :D I was hoping it doesn't matter for you which way to backport commits (assuming the approach is acceptable in the first place)? If not, I can try to build LLVM 7 later this week.

TheDan64 commented 6 years ago

Unfortunately I've made non trivial changes on master that require rustc 1.30. Those changes haven't yet been distributed to the llvm branches because 1.30 doesn't release until the end of the week. This may or may not make backporting also non trivial.

Fortunately, there is a simple way for you to test your changes on master without llvm 7: modify Cargo.toml to have llvm-sys version be "40.2", and be sure to supply the llvm4-0 feature flag (just don't commit that change)

If you prefer, feel free to wait until after I review your changes before doing so. I apologize for the inconvenience, unfortunately rust doesn't allow us to support multiple versions in a very straightforward way

TheDan64 commented 6 years ago

So I'm inclined to accept these changes, but I have a few thoughts.

1) Is into_closure a good idea? The fact that the closure is "safe" doesn't feel right. Being able to use the call method should be enough, I think. 2) I'm wondering if while we're touching this code, it make sense to rework how it works a bit. Currently the Symbol keeps an InnerExecutionEngine ref count, which means you can do the func example you pointed out in OP. (Though it's not necessarily unsafe due to the ref counted EE)

But I'm wondering if it makes sense to get rid of the ref count, and instead turn Symbol into a smart pointer with a fake lifetime, like so.

This would prevent the Symbol from going out of scope from the EE instance. (~Though it being Clone might break this guarantee~ edit: apparently not?)

I know it's been a while, but I'd greatly appreciate your thoughts, @Michael-F-Bryan. Symbol was your idea after all. (See the PR for implementation deets)

RReverser commented 6 years ago

Is into_closure a good idea? The fact that the closure is "safe" doesn't feel right. Being able to use the call method should be enough, I think.

I wrote about this in tradeoffs above. I wasn't sure myself, but in the end I think it's okay as it has semantic similarity with Box::from_raw, slice::from_raw_parts and other built-in std methods that allow you to "trust" certain pointers. In our case we do the same for function pointer, but actually even better than most of these methods since we keep the ref-counted part and so guarantee that backing storage of a function is not deallocated too early.

which means you can do the func example you pointed out in OP. (Though it's not necessarily unsafe due to the ref counted EE)

I'm not sure what you mean - do you mean it wasn't unsafe before or it's not unsafe now after the changes? To clarify my POV - it was unsafe before specifically because the ref-counted part could be dropped while function pointer remained; now this is not possible without explicit into_inner.

But I'm wondering if it makes sense to get rid of the ref count, and instead turn Symbol into a smart pointer with a fake lifetime, like so.

It could be possible, but then you'd lose ability to perform any further mutating ops on the ExecutionEngine if you tie lifetime to it. It might or might not be acceptable, just pointing this out.

TheDan64 commented 6 years ago

I wrote about this in tradeoffs above. I wasn't sure myself, but in the end I think it's okay as it has semantic similarity with Box::from_raw, slice::from_raw_parts and other built-in std methods that allow you to "trust" certain pointers. In our case we do the same for function pointer, but actually even better than most of these methods since we keep the ref-counted part and so guarantee that backing storage of a function is not deallocated too early.

I'm not sure, maybe Michael will have some input here.

I'm not sure what you mean - do you mean it wasn't unsafe before or it's not unsafe now after the changes? To clarify my POV - it was unsafe before specifically because the ref-counted part could be dropped while function pointer remained; now this is not possible without explicit into_inner.

It definitely was easy to copy the pointer before. It's just an idea, I'm not sure if making the Symbol a bit more restrictive adds more value or not.

It could be possible, but then you'd lose ability to perform any further mutating ops on the ExecutionEngine if you tie lifetime to it. It might or might not be acceptable, just pointing this out.

I don't think that's true, as we only use immutable references due to interior mutability (provided by raw pointers). So you'd still be able to continue to use the EE as normal

TheDan64 commented 6 years ago

Fixed by #57