gsingh93 / llvm

Safe LLVM bindings for Rust
MIT License

Wrap LLVMTypeRef #8

Closed JustAPerson closed 7 years ago

JustAPerson commented 7 years ago

Perhaps LLVMTypeRef should also be wrapped so that we could offer LLVMPointerType() and LLVMGetUndef() as methods, which would be convenient. Any existing function taking LLVMTypeRef could be changed to take T: Into&lt;TypeWrapper&gt;, so both LLVMTypeRef and the wrapper could be passed without breaking existing code.

I'd be open to making such a change. Just curious about feedback.
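As a rough sketch of that Into-based signature (TypeWrapper and pointer_type here are hypothetical stand-ins for illustration, with usize standing in for the real raw pointer type):

```rust
// Hypothetical stand-in for the real raw pointer type from llvm_sys.
type LLVMTypeRef = usize;

// A thin newtype wrapper around the raw ref.
#[derive(Debug, Clone, Copy, PartialEq)]
struct TypeWrapper(LLVMTypeRef);

impl From<LLVMTypeRef> for TypeWrapper {
    fn from(r: LLVMTypeRef) -> Self {
        TypeWrapper(r)
    }
}

// Because of the blanket `impl From<T> for T`, this accepts either
// a raw LLVMTypeRef or an existing TypeWrapper without breaking callers.
fn pointer_type<T: Into<TypeWrapper>>(t: T, address_space: u32) -> TypeWrapper {
    let wrapped = t.into();
    // A real impl would call llvm::LLVMPointerType(wrapped.0, address_space).
    let _ = address_space;
    wrapped
}
```

Both `pointer_type(raw_ref, 0)` and `pointer_type(wrapper, 0)` compile, which is the backwards-compatibility property described above.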

JustAPerson commented 7 years ago

I realize now that perhaps a trait is better? For example, with LLVMValueRef:

pub trait LLVMValue {
    fn type_of(&self) -> LLVMTypeRef;
}
impl LLVMValue for LLVMValueRef {
    fn type_of(&self) -> LLVMTypeRef {
        unsafe {
            llvm::LLVMTypeOf(*self)
        }
    }
}
afonso360 commented 7 years ago

Hi,

LLVMTypeRef is somewhat wrapped in types.rs, but currently only context types are represented. I was planning on adding a GlobalTypes for types that don't need to be registered in a Context; these could have a From&lt;LLVMTypeRef&gt;.

Now, with the way I'm thinking about doing this, the GlobalTypes would not be able to take any parameters (due to the function definition of From&lt;&gt;), but it appears that LLVMPointerType takes some additional arguments. I'm not too worried about the first argument, since we could just register a From&lt;ContextType&gt; or something similar, but the address space is kind of worrying: where do we get this information? It also means that we need to create a separate trait for registering pointer types, which I'm a bit reluctant to do.

Meanwhile, in types.rs there is already a static function called pointer_type if you would like to use it.

About LLVMGetUndef: it seems that it returns an LLVMValueRef, so we could just put it in the trait definition of Type with a default implementation.

About your idea of taking Into&lt;Type&gt;: that's exactly where I wanted to take this. I just wanted to finish converting a few more types into Type before implementing it.

--

Thinking about this a bit more, it seems unfeasible to have a trait Type that would wrap all the other traits (ContextType, GlobalType, etc.). I think we can only have different impls that take each Type, which would be a lot of duplicate code, especially if we have to create a PointerType. The only solution I can think of is to have a struct Type that implements From for each of those traits, but I would still prefer to have a trait that wraps all of them.

JustAPerson commented 7 years ago

The more I experiment on my own, the less I like wrapping things in structs, at least until we've covered the entire llvm_sys API.

It's not strictly necessary when we could use traits, which are far more convenient for interoperating with the llvm_sys crate, since you're dealing with the raw pointers llvm_sys expects.

This is what I have for LLVMTypeRef currently:

pub trait LLVMType {
    fn get_int_type_width(&self) -> u32;

    fn pointer_type(&self, address_space: u32) -> LLVMTypeRef;
    fn get_undef(&self) -> LLVMValueRef;
    fn array_type(&self, len: u32) -> LLVMTypeRef;
}

impl LLVMType for LLVMTypeRef {
    fn get_int_type_width(&self) -> u32 {
        unsafe {
            llvm::LLVMGetIntTypeWidth(*self)
        }
    }

    fn pointer_type(&self, address_space: u32) -> LLVMTypeRef {
        unsafe {
            llvm::LLVMPointerType(*self, address_space)
        }
    }

    fn get_undef(&self) -> LLVMValueRef {
        unsafe {
            llvm::LLVMGetUndef(*self)
        }
    }

    fn array_type(&self, len: u32) -> LLVMTypeRef {
        unsafe {
            llvm::LLVMArrayType(*self, len)
        }
    }
}
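For illustration, the same extension-trait pattern works on any foreign or alias type. Here is a minimal, self-contained analogue using a plain pointer alias instead of llvm_sys types (RawRef and RawRefExt are hypothetical names, not part of either crate):

```rust
// Stand-in for an opaque C handle like LLVMTypeRef (which is a raw pointer).
type RawRef = *const u32;

// Extension trait mirroring the `impl LLVMType for LLVMTypeRef` shape above:
// methods become available directly on the raw handle, no wrapper struct needed.
trait RawRefExt {
    fn is_null_ref(&self) -> bool;
}

impl RawRefExt for RawRef {
    fn is_null_ref(&self) -> bool {
        self.is_null()
    }
}
```

With this in scope, a user holding a bare `RawRef` from the C API can call `handle.is_null_ref()` directly, which is the interoperability argument being made here.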

Also, regarding constants: it seems LLVMConstStringInContext doesn't really work with the existing value::IntoConstValue trait, since that trait requires ContextType, which isn't relevant here. I guess strings would use a different method than Context::cons()?

I've got a fair number of miscellaneous changes that are worth discussing. Do you use IRC or anything?

afonso360 commented 7 years ago

Interesting, I never thought about implementing the traits directly on the LLVM*Refs. My only reservation is that it seems a bit un-Rust-like: if I, as a user, started using a crate and saw that a type was called LLVMTypeRef, I would immediately be apprehensive. Can I Clone it? Can I Copy it? Are resources managed for me? That is the big advantage of implementing the Drop trait and enforcing clones and copies (I haven't looked into the latter).

The other advantage of the struct method is that you would be able to pass them into llvm_sys functions with an .into() invocation. This is already true for most types in this crate; for example, Target implements Into&lt;LLVMTargetRef&gt;, so you can just do target.into() and get your LLVMTargetRef.

I don't see why LLVMConstStringInContext wouldn't work: it doesn't require the ContextType, it requires the ContextRef. If you mean LLVMConstString, then yeah, I don't think it would be relevant for the Context::cons function. Looking at the function description, it appears to just create the string in the global context, for which I would then propose an Into&lt;LLVMValueRef&gt;, although this might just be more confusing for the end user: would you then get a mutable string ref or a const string ref? We should probably implement the immutable one for &str and the mutable one for String?

I've created a Gitter room; I'll add badges and probably CI to this repository tonight once I have some more time.

Znapi commented 7 years ago

Here's how I wrapped types in my fork. Since types are immutable, and LLVM already only gives us pointers to them, I just transmuted the LLVMTypeRefs into &Types, where Type is just an opaque struct (LLVMType), which ideally would be marked as unsized, and the lifetime of the &Type is tied to the lifetime of the Context that the reference was pulled from. I made the hierarchy work by implementing Deref rather than Into on all the "subclasses" of Type, so it takes advantage of automatic deref coercions.
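The Deref-based hierarchy can be sketched in a self-contained way (Type, Half, and kind_of are illustrative stand-ins here, not the fork's actual code):

```rust
use std::ops::Deref;

// Stand-in for the opaque base type (the real one wraps LLVM-owned data).
struct Type {
    kind: u8,
}

// A "subclass" of Type. Deref, rather than Into, links it to the base.
struct Half(Type);

impl Deref for Half {
    type Target = Type;
    fn deref(&self) -> &Type {
        &self.0
    }
}

// Any function written against &Type...
fn kind_of(t: &Type) -> u8 {
    t.kind
}
```

...can be called as `kind_of(&half)`: the compiler's automatic deref coercion turns `&Half` into `&Type` with no explicit conversion by the user, which is the point being made about the hierarchy.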

This is also a zero-cost abstraction as far as I can tell. I found using something like trait Type and implementing it on all "subclasses" suboptimal, because passing generic types around as trait objects means Rust creates a fat pointer with a vtable and starts using dynamic dispatch.

afonso360 commented 7 years ago

Let me just say that I haven't looked into this in a long time, and I've forgotten the reasons for all of my arguments above, so this is effectively a fresh look at it for me. A lot of what I'm saying just duplicates what you have already done, but I want to make sure I understand the problem fully.

Yes, using traits directly in argument position causes trait objects to be created. Using generics instead just causes the compiler to duplicate code and create one version of the function for each generic argument type. This causes a lot of code bloat (especially with multiple generic args), but I'm not too worried about that. It makes the abstraction zero-cost, at least in run time and memory consumption (RAM, not binary size).

I like the idea of having a raw LLVMType that we then only use references to; I think it has a lot of benefits in that it gets the compiler to check soundness for us. This is probably something we should implement now, ending all public uses of LLVM*Refs. It's great in that it's clear to the user that there is a lifetime and that the type is linked to the context.

Another great point you make is that we can really use Deref to our advantage. Deref-ing &llvm::types::Half into LLVMTypeRef effectively makes our types transparent (automatic deref coercions), and for functions that can take any LLVMTypeRef we can also implement a generic of the sort:

fn LLVMTypeIsSized&lt;T: AsRef&lt;LLVMType&gt;&gt;(ty: T) -&gt; bool;

(This is just a random function that I had open in the browser.) This is really great because it gets us the best of both worlds: automatic deref coercions make this transparent to the user, in that they don't have to call .into(), and it's also zero-cost because of monomorphization.
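The AsRef-bound signature can be illustrated with a minimal, self-contained sketch (Type, Half, and type_is_sized are hypothetical stand-ins for the real LLVM types):

```rust
// Stand-ins for the opaque base type and one "subclass".
struct Type {
    sized: bool,
}
struct Half(Type);

impl AsRef<Type> for Half {
    fn as_ref(&self) -> &Type {
        &self.0
    }
}
impl AsRef<Type> for Type {
    fn as_ref(&self) -> &Type {
        self
    }
}

// Generic counterpart to the LLVMTypeIsSized signature above:
// monomorphized per argument type, so there is no dynamic dispatch.
fn type_is_sized<T: AsRef<Type>>(ty: T) -> bool {
    ty.as_ref().sized
}
```

Calls like `type_is_sized(half)` and `type_is_sized(&half)` both compile (the std blanket impl of AsRef for references covers the latter), which is the extra flexibility being discussed.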

I think we should probably do that, and since you already seem to have an implementation we could start building a PR. However, looking at your implementation, there are some things that I think could be improved.

We don't really need the compiler plugin; from what it looks like it's doing, a macro would suffice.

Having an as_raw seems to follow the api-guidelines, but I think we should also still implement Into&lt;&gt; for convenience.

Finally, this looks great! It looks like we can really have a good user experience with this! Let me know what you guys think: @JustAPerson @Znapi

-- One additional thing: it would probably be important if we could override the &llvm::types::Half -&gt; llvm::types::Half Deref and throw a build error; this would make it safer to handle these types.

Znapi commented 7 years ago

Compiler plugin? It's just a procedural macro to derive custom traits, although, yeah, a normal macro would work. I'm also partly using this as a way to learn more details about Rust, so that's why I went for a procedural macro. I'll make that change though.

Ideally, yeah, the user should not be able to dereference an &Type to get the opaque Type. This is why I would like to mark it as an unsized type somehow, but I don't know how to do that. This FIXME in the source for std::ffi::CStr has me believing that it's not possible yet, because otherwise I think they would have already made that simple fix.

I don't think I follow you about the deref coercions. I implemented Deref<Target=Type> on all the "subclasses", like types::Half, so that, for example, &types::Half can be passed to a function that takes an &Type without the user needing to explicitly make any conversions. Type is opaque, not transparent. Converting between &Types and LLVMTypeRefs requires explicit conversions. My reasoning for this was that, if the crate were complete, the user should never need to work with the raw LLVMTypeRefs. It also seems to be the way the rest of the crate works.

Also, thanks for the link to the api-guidelines! I didn't even know that existed.

-- So, I made a branch with just this change in it (since the rest of my fork is sorta experimental / a learning experience). I got rid of the procedural macro. I also got rid of the pub(crate) from_raw and as_raw in favor of From and Into traits, as per your recommendation, and because it seems to align better with the conventions of the rest of the crate. I also only did a pub use types::{Type, ContextType} instead of pub use types::* like all the other modules, because the glob import causes a name conflict between types::Function and Function.

I still need to consider type comparisons and casting from &Type to more specific types at runtime, though. Also, you talk about global types in LLVM above, but I'm not sure what you mean. Do they belong to the global context? If so, do I even need to bother implementing them? They all seem to have context-specific counterparts, and I saw a mailing list post mentioning that the global context is not thread safe anyway.

afonso360 commented 7 years ago

Did not know that; I thought compiler plugins were procedural macros. I never really went searching.

I thought we could implement something like this, but I no longer think it's a good solution: it wouldn't be typechecked by the compiler, but it would prevent silent failure at runtime. (Either way, it doesn't work.)

The Nomicon's page on coercions doesn't really have any information on forcing a type to be unsized. I would be OK with documenting this for now and leaving it be, but we need a way to prevent this.

I don't think I follow you about the deref coercions. I implemented Deref on all the "subclasses", like types::Half, so that, for example, &types::Half can be passed to a function that takes an &Type without the user needing to explicitly make any conversions

That's exactly what I meant. Just a note from the api-guidelines: we should probably take AsRef&lt;Type&gt;, as it might give us more flexibility in accepting parameters.

Type is opaque, not transparent

I didn't mean it like that; I meant that llvm::types::* would automatically convert into Type. Not the best choice of words from me.

Converting between &Types and LLVMTypeRefs requires explicit conversions. My reasoning for this was that, if the crate were complete, the user should never need to work with the raw LLVMTypeRefs. It also seems to be the way the rest of the crate works.

👍

Also, you talk about global types in LLVM above, but I'm not sure what you mean. Do they belong to the global context?

Yes, I meant the functions that produce types but don't take a context, such as this, and yeah, they are just types registered against the global context.

I did not know they weren't thread safe, so I would just leave them for now. We are probably going to have to mark them as !Sync in some way, but that's probably more for the future.

Znapi commented 7 years ago

So, do you want me to get rid of the Deref impls and use AsRef instead? Or just add AsRef impls on top? Using just AsRef means a lot more monomorphization for no benefit that I can see, and I can't see how adding AsRef impls while keeping the Derefs helps either.

Znapi commented 7 years ago

Also, I quickly added a method for downcasting, but I'm still open to suggestions on how the user should be able to perform downcasts. Right now, downcasting looks like this from the user's end:

let generic_type: &llvm::Type = i16::get_type_in_context(&context); // force upcast
println!("{:?}", generic_type);

if let llvm::types::Kind::Integer(t) = generic_type.downcast() {
    println!("{:?} {}", t, t.width());
}
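The enum-based downcast shown above can be sketched self-containedly (IntegerType, FloatType, and this Kind enum are simplified stand-ins; the fork's actual Kind carries references into LLVM-owned data):

```rust
// Concrete "subclass" types.
struct IntegerType {
    width: u32,
}
struct FloatType;

// The downcast result: one variant per concrete kind, borrowing from the base.
enum Kind<'a> {
    Integer(&'a IntegerType),
    Float(&'a FloatType),
}

// Stand-in for the generic base type; the real one is opaque, with the
// kind discovered by querying LLVM rather than stored in an enum.
enum Type {
    Integer(IntegerType),
    Float(FloatType),
}

impl Type {
    // Runtime downcast: the caller pattern-matches on the returned Kind.
    fn downcast(&self) -> Kind<'_> {
        match self {
            Type::Integer(t) => Kind::Integer(t),
            Type::Float(t) => Kind::Float(t),
        }
    }
}
```

The `if let llvm::types::Kind::Integer(t) = generic_type.downcast()` usage above then recovers the concrete type, and non-matching kinds simply fall through.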

This and the AsRef thing are the last couple of things to iron out, as far as I can tell.

afonso360 commented 7 years ago

So, do you want me to get rid of the Deref impls and use AsRefs instead? Or just add AsRefs on? Using just AsRefs means a lot more monomorphization but for no benefit that I can see. I can't see how adding AsRefs but keeping the Derefs helps at all either.

I meant using AsRef&lt;_&gt; in function signatures (we still need the Derefs for this to work). The benefit, from my point of view, was that it's essentially free and would provide extra flexibility for the end user. However, thinking about it a bit more, any use case I could come up with would require the same effort from the end user. So, yeah, let's just leave it.

On the downcasting thing, the first thing I thought of was whether there is any way of using Any to help us. However, it seems that the semantics of using Any aren't that great. Let me know if you think Any is any better than your current approach; otherwise I would say it's OK as is.

-- On second thought, I'm not sure we can use Any.

Znapi commented 7 years ago

Alright, so I added another option to perform downcasts without pattern matching on enums, a bunch of pretty documentation (because no one's going to want to do it all from scratch later, and it helps document design decisions), and opened a pull request (#10).

Yeah, Any doesn't seem like it will work. At least not any better than using an enum. Just tell me if there's anything else you would like changed before merging.

Znapi commented 7 years ago

Oh, also, I just noticed: should we change the impl_context_type!(char => Integer, LLVMInt8TypeInContext); to use LLVMInt32TypeInContext? A char is 4 bytes in size.
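That size difference is easy to confirm: in Rust, char is a 32-bit Unicode scalar value, not a byte.

```rust
fn main() {
    // char is always 4 bytes; i8/u8 are 1 byte, so Int8 was the wrong mapping.
    assert_eq!(std::mem::size_of::<char>(), 4);
    assert_eq!(std::mem::size_of::<u8>(), 1);
    println!("char: {} bytes", std::mem::size_of::<char>());
}
```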

afonso360 commented 7 years ago

Yes, please, I really wasn't sure when I did that. Also, remove the comment above it: //This might actually not be true, Not sure

I've had a look at it, and it looks good (especially the documentation, great job!), so if you could commit that, I can merge it.

Thank you for doing this!

Znapi commented 7 years ago

Done, and you're welcome! :)

afonso360 commented 7 years ago

Merged in PR #10