astonbitecode / j4rs

Java for Rust
Apache License 2.0
645 stars 36 forks source link

Unsoundness in to_rust_string may cause UB due to unsafe assumptions about the input pointer #138

Open lwz23 opened 1 week ago

lwz23 commented 1 week ago

Description: The to_rust_string function uses CStr::from_ptr to convert a raw pointer (*const c_char) into a CStr. However, it does not validate that the pointer meets the necessary safety requirements. This can lead to Undefined Behavior (UB) if the pointer is invalid or if it does not point to a valid null-terminated C string. https://github.com/astonbitecode/j4rs/blob/68cdc1e362050bf04972f710a20dbf91aea12ce6/rust/src/utils.rs#L29

pub fn to_rust_string(pointer: *const c_char) -> String {
    let slice = unsafe { CStr::from_ptr(pointer).to_bytes() };
    from_java_cesu8(slice).unwrap().to_string()
}

Problem Description:

  1. Invalid Pointer: The function assumes that the input pointer is valid and non-null. If pointer is null or points to invalid memory, CStr::from_ptr(pointer) will invoke Undefined Behavior.
  2. Missing Null-Termination: The function assumes that the memory referenced by pointer is null-terminated. If the string is not properly null-terminated, CStr::from_ptr may read beyond the allocated memory, causing Undefined Behavior.
  3. Invalid UTF-8 Input: The function uses from_java_cesu8(slice).unwrap(). If the input slice contains invalid CESU-8 data or if from_java_cesu8 panics, the program may crash.
  4. Error Handling: The function calls .unwrap() on the result of from_java_cesu8(slice). If from_java_cesu8 returns an Err, the program will panic instead of gracefully handling the error. Steps to Reproduce: Provide a null pointer as input:
    let null_pointer: *const c_char = std::ptr::null();
    let result = to_rust_string(null_pointer); // UB: Null pointer

    Expected Behavior: The function should validate the input pointer and ensure it points to a valid, null-terminated C string. The function should handle invalid CESU-8 data gracefully instead of panicking. Additional Notes: The current implementation assumes valid input, which makes the function unsafe. Adding proper validation will make it more robust and prevent potential crashes or undefined behavior. If performance is a concern, consider using assertions in debug builds but validating input in release builds.

lwz23 commented 1 week ago

ping?

astonbitecode commented 1 week ago

Thanks for the report. I will have a look.

lwz23 commented 1 week ago

Thanks, please also take a look at https://github.com/astonbitecode/j4rs/issues/140, I think they are samilar : )