rust-lang / rust

Empowering everyone to build reliable and efficient software.
https://www.rust-lang.org
Other
97.07k stars 12.54k forks source link

Segfault in je_rallocx after pushing new value to Vec<u8> causing it to resize #27878

Closed ashleysommer closed 9 years ago

ashleysommer commented 9 years ago

Im getting this very interesting segfault when using with CStrings in a project Im working on.

Program received signal SIGSEGV, Segmentation fault.
0x0000555555574551 in je_rallocx ()
Missing separate debuginfos, use: dnf debuginfo-install fcgi-2.4.0-28.fc23.x86_64
(gdb) bt
#0  0x0000555555574551 in je_rallocx ()
#1  0x000055555555dfc3 in fcgi::heap::reallocate (ptr=0x5555557af9fc "REQUEST_METHOD", old_size=14, size=28, align=1) at ../src/liballoc/heap.rs:71
#2  0x000055555555de5f in fcgi::raw_vec::RawVec<T>::double (self=0x7fffffffd188) at ../src/liballoc/raw_vec.rs:202
#3  0x000055555555dbf1 in fcgi::vec::Vec<T>::push (self=0x7fffffffd188, value=0 '\000') at ../src/libcollections/vec.rs:577
#4  0x000055555555d8dc in fcgi::DefaultRequest.Request::get_param (self=0x7fffffffd868, name=...) at /home/flubba86/workspace/rust-fcgi/src/lib.rs:153
#5  0x0000555555559c60 in testcase:main () at src/main.rs:19
#6  0x000055555556abc5 in rt::unwind::try::try_fn::h5614954417008942540 ()
#7  0x0000555555568829 in __rust_try ()
#8  0x000055555556a8b2 in rt::lang_start::ha172a3ce74bb453aK5w ()
#9  0x000055555555d0f7 in main ()

I am passing a static &str with the contents "REQUEST_METHOD" to get_param() in the fcgi lib. The get_param() method converts that to a CString using CString::new(). To do that, the rust CString new() constructor method first converts the &str into a Vec<u8> using into(). It then does a .push(0) to append a null char, then finally returns a new CString using into_boxed_slice on the Vec.

The error occurs on the .push(0) line here. The Vec<u8> is created with an initial size being the length of the string, and to add the new char it must call double() on the Vec to make it bigger. When it does that, the allocator segfaults during the jemalloc reallocation call.

I isolated the issue further by first converting the &str to a Vec<u8> myself then doing .push(0) on it before passing it to the CString constructor. The push() still crashes when reallocating, as seen in the pasted trace above.

I cannot however, create a minimal reproduction of the error. It only does it in this one library. I have tried multiple basic test cases and they all seem to work fine.

I was using a Nightly from Saturday the 15th of August and I just updated to the Current nightly (18th of August) and it is still the same.

Any suggestions on how I can further debug this? Temporarily, is there any way of creating a CString from a &str without using an intermediate Vec?

bluss commented 9 years ago

If you have a crash like this, I might suspect some prior memory corruption. This is where you examine all your unsafe-marked code blocks to see if you have any bugs there. In well functioning Rust that should be the only place. If you want to share the code, I can help you with this kind of bug search.

bluss commented 9 years ago

Wow, this looks weird. fcgi::heap::reallocate (ptr=0x5555557af9fc "REQUEST_METHOD", I think this is a hint.. somehow the Vec's internal pointer is a pointer to a string literal (?).

bluss commented 9 years ago

Which fcgi code is this? I can't find a recent version (that compiles)

bluss commented 9 years ago

Ok this smells of incorrect use of CString::from_ptr, which is an unsafe-marked function

    /// Retakes ownership of a CString that was transferred to C.
    ///
    /// The only appropriate argument is a pointer obtained by calling
    /// `into_ptr`. The length of the string will be recalculated
    /// using the pointer.
ashleysommer commented 9 years ago

Thanks for your help. I will check other unsafe code in the project.

I have re-worked the old rust-fcgi project to get it to compile on current rust nightly. I am using my local version for now. I will upload the working code if you want to test it out.

I am not using CString::from_ptr, but it may be used internally somewhere. I am creating the CString with CString::new(), as per the code here, it creates a Vec<u8> using into().

I believe the Vec's internal pointer should read as a string literal in the trace output, because it is essentially [u8] internally and in this case that is equivalent to a *c_char so gdb evaluates it as such.

bluss commented 9 years ago

Good point. Yes we need the crashing code to be able to debug this.

ashleysommer commented 9 years ago

This is the rust-fcgi code I am using. https://github.com/ashleysommer/rust-fcgi

This is the test main.rs I am using to run it:

extern crate fcgi;
extern crate libc;
extern crate unix_socket;

use fcgi::{Request, DefaultRequest};
use unix_socket::{UnixListener};
use std::os::unix::io::{AsRawFd};

fn main() { 
    let listener = UnixListener::bind("/run/webapp/webapp.sock").unwrap();  
    println!("isCgi: {}", fcgi::is_cgi());
    let result = fcgi::initialize_fcgi();
    let mut request: DefaultRequest = Request::new_with_fd(listener.as_raw_fd()).unwrap(); 
    while request.accept() {
        println!("request uri    {:?}", request.get_param("REQUEST_URI"));
        println!("document root  {:?}", request.get_param("DOCUMENT_ROOT"));
        println!("script name    {:?}", request.get_param("SCRIPT_NAME"));
        println!("request method {:?}", request.get_param("REQUEST_METHOD"));
        let received = request.readall(); 
        println!("Received (size={})", received.len());
        if received.len() > 0 { 
            println!("8<------------------");
            println!("{}", received);
            println!("8<------------------");
        } 
        let body = "Content-type: text/html\r\n\r\n<header><title>Hello World</title></header>\r\n<body> <h1>Hello World!</h1>  </body>";
        let byte_count = request.write(body);
        request.error("Test error!");
        println!("number of bytes written {}", byte_count);
        request.flush(fcgi::StreamType::OutStream);
        request.finish();
    }
}
ashleysommer commented 9 years ago

It will be complex to set up a test, as you need to set up mod_proxy_fcgi in Apache to send requests to the listening socket. See here for instructions: http://httpd.apache.org/docs/trunk/mod/mod_proxy_fcgi.html

I am using this line in my apache config: ProxyPass "/webapp/" "unix:/run/webapp/webapp.sock|fcgi://localhost/" enablereuse=on

bluss commented 9 years ago

C Strings are a perilous territory in rust. You have to use unsafe to call ffi functions and to inspect returned raw pointers, it desensitizes you a bit. The danger here is that the type system doesn't help you. Raw pointers can represent borrowing or ownership transfer, and it's all in the documentation, not in the type.

bluss commented 9 years ago

@ashleysommer Does it work if you fix this https://github.com/ashleysommer/rust-fcgi/commit/ba1f119de014df77f78a2e7071ecf933364be80d#commitcomment-12760908 ?

ashleysommer commented 9 years ago

@bluss Yes! Thank you. That fixed it. That also explains why it runs fine for one or two calls to get_param() before segfaulting on the third call. It was using unowned memory as owned CStrings.

I found a second call to CString::from_ptr() in the read() function too.

You are right in pointing out that the (reference here)[https://github.com/rust-lang/rust/issues/27769] that the names of the functions could be improved to prevent this kind of confusion.

bluss commented 9 years ago

Yep, great! Let's close this as not a bug then.

Rust did work correctly — memory unsafety originated in an unsafe marked block; but the CString API can be much improved, and that's why I was drawing attention to this bug. It's only helpful, I hope you don't mind. These small traps are something we need to learn from when designing the API.

ashleysommer commented 9 years ago

Yes once again Rust is smarter than me. This was not really a bug as much as it was me not understanding the CString from_ptr semantics and the difference between CString and CStr. Is there any documentation that would have explained this if I took the time to read it before diving in?

bluss commented 9 years ago

Yes, the docs for CString::from_ptr that I quoted in this thread :)