TheDan64 / inkwell

It's a New Kind of Wrapper for Exposing LLVM (Safely)
https://thedan64.github.io/inkwell/
Apache License 2.0

Sub-Types (Type Safety v1.1) #8

Open TheDan64 opened 6 years ago

TheDan64 commented 6 years ago

WARNING: Brain dump ahead!

Today, build_int_add looks approximately like build_int_add(&self, left: &IntValue, right: &IntValue) -> IntValue. This is great because it stops you from trying to add a FloatValue and an IntValue, for instance, which would not work in LLVM. But I wonder if we can take type checking a step further! What happens when left is a u8 and right is a u16? This needs to be verified, but I believe it is also an LLVM error because it doesn't make much sense. How would adding two values of different sizes work (without casting)?

Therefore, I wonder if we can add sub-type annotations using generics (from here on known as sub-types). For example, IntType<u32> and IntValue<u32>, so that you could express something like: build_int_add<T>(&self, left: &IntValue<T>, right: &IntValue<T>) -> IntValue<T>, which would ensure that only values of the same sub-type are added together. So an IntValue<u8> could be added to another IntValue<u8>, but adding an IntValue<u8> to an IntValue<u16> would be rejected, requiring a cast first. And in case you do want different sub-types to be valid input, you would just specify separate type variables: do_foo<L, R>(&self, left: &IntType<L>, right: &IntType<R>)

In terms of implementation details, the sub-type should basically just be a marker and not take up any additional space (using PhantomData if needed). Type parameters are cool!
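A minimal sketch of the marker idea, not inkwell's actual code. `RawValue` is a hypothetical stand-in for the underlying LLVM value handle, and this `build_int_add` only adds the stand-in payloads instead of emitting an instruction. The point is that the `T` parameter costs nothing at runtime and that mismatched sub-types fail at compile time.

```rust
use std::marker::PhantomData;

/// Hypothetical stand-in for an LLVM value handle.
#[derive(Debug, Clone, Copy, PartialEq)]
struct RawValue(u64);

/// An integer value tagged with a zero-sized sub-type marker `T`.
struct IntValue<T> {
    raw: RawValue,
    _sub_type: PhantomData<T>,
}

impl<T> IntValue<T> {
    fn new(raw: u64) -> Self {
        IntValue { raw: RawValue(raw), _sub_type: PhantomData }
    }

    fn raw(&self) -> u64 {
        self.raw.0
    }
}

struct Builder;

impl Builder {
    /// Both operands and the result share a single sub-type `T`.
    fn build_int_add<T>(&self, left: &IntValue<T>, right: &IntValue<T>) -> IntValue<T> {
        // A real builder would emit an LLVM `add`; this sketch adds the stand-in payloads.
        IntValue::new(left.raw().wrapping_add(right.raw()))
    }
}

fn add_two_bytes() -> u64 {
    let builder = Builder;
    let a: IntValue<u8> = IntValue::new(1);
    let b: IntValue<u8> = IntValue::new(2);
    // let c: IntValue<u16> = IntValue::new(3);
    // builder.build_int_add(&a, &c); // rejected: expected `IntValue<u8>`, found `IntValue<u16>`
    builder.build_int_add(&a, &b).raw()
}
```

Because `PhantomData<T>` is zero-sized, `IntValue<u8>` here has the same size as `RawValue` alone.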

Outstanding questions:

TheDan64 commented 6 years ago

The num crate doesn't look like it could help, but typenum defines types for a lot of int sizes. So we could probably use those for custom width types: IntValue<U9>, IntType<I30>, along with the builtins: IntType<u32>, etc. Also, I don't think LLVM supports custom width floats (at least not in 3.7), so those would only use builtins, though there's no f16 or f128 yet.
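A rough sketch of how built-in and custom-width markers could sit side by side. To stay dependency-free it does not use typenum: `U9` and `I30` below are hand-written stand-ins, and the `IntSubType` trait and its `BITS` constant are assumptions made for illustration only.

```rust
use std::marker::PhantomData;

/// Hypothetical trait tying a sub-type marker to a bit width.
trait IntSubType {
    const BITS: u32;
}

// Built-in Rust integers double as markers for the common widths.
impl IntSubType for u8 {
    const BITS: u32 = 8;
}
impl IntSubType for u32 {
    const BITS: u32 = 32;
}

// Hand-written stand-ins for typenum-style markers (not the real typenum types).
struct U9;
struct I30;

impl IntSubType for U9 {
    const BITS: u32 = 9;
}
impl IntSubType for I30 {
    const BITS: u32 = 30;
}

/// An integer type whose width is carried entirely by its marker.
struct IntType<T: IntSubType> {
    _sub_type: PhantomData<T>,
}

impl<T: IntSubType> IntType<T> {
    fn new() -> Self {
        IntType { _sub_type: PhantomData }
    }

    /// The width is read from the marker, so no runtime field is needed.
    fn bit_width(&self) -> u32 {
        T::BITS
    }
}
```

With this shape, `IntType::<U9>::new().bit_width()` and `IntType::<u32>::new().bit_width()` go through the same code path, which is the property the custom-width idea relies on.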

TheDan64 commented 6 years ago

Also, it seems like signed types should be explicit, though LLVM doesn't make this distinction.

TheDan64 commented 6 years ago

It's worth noting StructTypes (and probably StructValues) have two interesting properties:

TheDan64 commented 6 years ago

After looking at #32, it seems that build_global_string and the related build_global_string_ptr method segfault for some reason when called without an active function being built (even though the string is global 👀). One possible solution would be to have a subtype for a builder in the context of a function, i.e. Builder<Global> and Builder<Function>, similar to how librustc_trans has two methods for creating builders (global and function scoped): https://github.com/rust-lang/rust/blob/master/src/librustc_trans/builder.rs#L54-L77

Those two methods could be implemented only for the latter, the Builder<Function> subtype.
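One way the scoped-builder idea could look, as a sketch under stated assumptions. The `Global` and `Function` markers follow the names proposed above, while `position_in`, the `current_function` field, and the string returned by `build_global_string` are hypothetical stand-ins, not inkwell's API.

```rust
use std::marker::PhantomData;

/// Marker: the builder is not positioned inside any function.
struct Global;
/// Marker: the builder is positioned inside a function body.
struct Function;

struct Builder<Scope> {
    /// Hypothetical: name of the function being built, if any.
    current_function: Option<String>,
    _scope: PhantomData<Scope>,
}

impl Builder<Global> {
    fn new() -> Self {
        Builder { current_function: None, _scope: PhantomData }
    }

    /// Consumes the global builder and returns a function-scoped one.
    fn position_in(self, function_name: &str) -> Builder<Function> {
        Builder {
            current_function: Some(function_name.to_string()),
            _scope: PhantomData,
        }
    }
}

impl Builder<Function> {
    /// Only exists on `Builder<Function>`, so it cannot be called unpositioned.
    /// Returns a textual description instead of emitting real IR.
    fn build_global_string(&self, value: &str, name: &str) -> String {
        let func = self.current_function.as_deref().unwrap_or("<none>");
        format!("@{} = {:?} (built in {})", name, value, func)
    }
}

fn emit_greeting() -> String {
    let builder = Builder::<Global>::new().position_in("main");
    // Builder::<Global>::new().build_global_string("hello", "greeting");
    // ^ rejected: no such method on `Builder<Global>`
    builder.build_global_string("hello", "greeting")
}
```

Taking `self` by value in `position_in` means the global builder cannot be reused by accident after it has been positioned; whether that is the right ergonomics for inkwell is an open design choice.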

Michael-F-Bryan commented 6 years ago

I know gtk-rs encountered a similar problem when emulating inheritance in Rust. They came up with an IsA trait which lets you use a child class as its parent.

So you might write a function that takes anything which IsA<IntType>, allowing you to accept a u32, i64, etc.
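A simplified illustration of the suggestion, not gtk-rs's actual `IsA` definition. The `IntType` struct, the `upcast` method, and `bit_width_of` are hypothetical names used only to show the shape of the bound.

```rust
/// Hypothetical stand-in for an LLVM integer type, identified by bit width.
#[derive(Debug, PartialEq)]
struct IntType {
    bit_width: u32,
}

/// Simplified take on the `IsA` idea: `Self` can be viewed as a `T`.
trait IsA<T> {
    fn upcast() -> T;
}

impl IsA<IntType> for u32 {
    fn upcast() -> IntType {
        IntType { bit_width: 32 }
    }
}

impl IsA<IntType> for i64 {
    fn upcast() -> IntType {
        IntType { bit_width: 64 }
    }
}

/// Accepts any Rust type that declares itself to be an `IntType`.
fn bit_width_of<T: IsA<IntType>>() -> u32 {
    T::upcast().bit_width
}
```

Note that `upcast` takes no context argument. LLVM types belong to a context, so a real version would have to get one from somewhere.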

TheDan64 commented 6 years ago

Inkwell used to have something like that. However, the main issue was that the trait required you to assume the global context, which may not always be desired and wasn't super intuitive (why does u64 suddenly have the global context even though my other types have been working with my non-global context?)

TheDan64 commented 3 years ago

We're going to want Builder subtypes, as an unpositioned builder can cause segfaults in many scenarios.