TheDan64 / inkwell

It's a New Kind of Wrapper for Exposing LLVM (Safely)
https://thedan64.github.io/inkwell/
Apache License 2.0
2.32k stars 224 forks source link

`get_string_constant` is invalid #510

Open CertainLach opened 3 months ago

CertainLach commented 3 months ago

Describe the Bug A clear and concise description of what the bug is.

get_string_constant calls LLVMGetAsString, and returns its result as CStr

    pub fn get_string_constant(&self) -> Option<&CStr> {
        let mut len = 0;
        let ptr = unsafe { LLVMGetAsString(self.as_value_ref(), &mut len) };

        if ptr.is_null() {
            None
        } else {
            unsafe { Some(CStr::from_ptr(ptr)) }
        }
    }

This is not correct.

LLVM String means array of i8, which is not necessary NUL-terminated.

If, in Rust, we'll write "hello\0", it will result in the following LLVM IR:

@anon...212 = private unnamed_addr global <{ [6 x i8] }> <{ [6 x i8] c"hello\00" }>, align 1

This is a struct, where the only field is constant string (LLVMIsConstantString returns true). Calling LLVMGetAsString for it returns a pointer to start of "hello\0\0" buffer AND the actual length of this string = 6 However, get_string_constant disregards the fact that the returned pointer may point to a string containing intermediate NULs such as "foo\0bar" thus it is invalid to turn in into a CStr (the part after \0 gets truncated).

This function should have the following signature:

    pub fn get_string_constant(&self) -> Option<&[u8]> {
        let mut len = 0;
        let ptr = unsafe { LLVMGetAsString(self.as_value_ref(), &mut len) };

        if ptr.is_null() {
            None
        } else {
            // Returns [u8], because `cx.const_string` receives [u8].
            unsafe { Some(std::slice::from_raw_parts(ptr.cast(), len)) }
        }
    }