eclipse / paho.mqtt.rust

paho.mqtt.rust
Other
516 stars 102 forks source link

Properties memory isn't freed #148

Closed CodeTako closed 2 years ago

CodeTako commented 2 years ago

There are memory issues when using Properties with a message. The memory for the properties doesn't seem to get properly freed when the struct is dropped.

Here's an example that shows the undesired behavior with both paho_mqtt v0.9.1 and v0.10:

use std::thread::sleep;
use procfs;
use std::time::{Duration, Instant};
use paho_mqtt::{properties, Properties};

const MEM_CHECK_INTERVAL: Duration = Duration::from_secs(1);

fn main() {
    let mut last_check = Instant::now();

    loop {
        print_mem(&mut last_check);

        surprise_mem_increase();

        sleep(Duration::from_micros(100));
    }
}

fn print_mem(last_check: &mut Instant) {
    if last_check.elapsed() > MEM_CHECK_INTERVAL {
        *last_check = Instant::now();
        let bytes_per_page = procfs::page_size().unwrap() as u64;

        let proc = match  procfs::process::Process::myself() {
            Err(e) => return,
            Ok(p) => p,
        };

        let mem = match proc.statm() {
            Err(_) => return,
            Ok(stat) => stat,
        };
        let used = (mem.data * bytes_per_page / 1024) as f32;
        let used_mb = used / 1024.0;

        println!("Memory: {:.4} MB", used_mb);
    }
}

fn surprise_mem_increase() {
    {
        let mut properties = Properties::new();
        properties.push_binary(paho_mqtt::PropertyCode::CorrelationData, vec![10]).expect("failed to add property");
    }
    // Should be dropped! Memory isn't freed
}

Example output:

Memory: 0.5273 MB
Memory: 0.7852 MB
Memory: 0.9141 MB
Memory: 1.1719 MB
Memory: 1.3008 MB
Memory: 1.5586 MB
Memory: 1.6875 MB
Memory: 1.9453 MB
Memory: 2.0742 MB

Using: cargo 1.58.0 (f01b232bc 2022-01-19) rustc 1.58.1 (db9d1b20b 2022-01-20) Ubuntu 20.04.4 LTS

Carter12s commented 2 years ago

I can also add that by commenting out the mem::forget call in properties.rs/push I can make the mem-leak in this example stop:

    /// Adds a property to the collection.
    pub fn push(&mut self, prop: Property) -> Result<()> {
        let rc = unsafe { ffi::MQTTProperties_add(&mut self.cprops, &prop.cprop) };
        if rc == 0 {
            // mem::forget(prop);
            Ok(())
        }
        else {
            Err(rc.into())
        }
    }

However, unsure if this is resulting in a potential double free. Reading through the Paho C documentation I don't see an obvious way that Rust implementation is calling the C library incorrectly.

fpagliughi commented 2 years ago

Is that a reliable way to measure memory usage on Linux? (I'm honestly not sure). I know Linux has a lazy recapture mechanism at the OS level, and have been tricked before by it, thinking there was a leak in an app when there really was not. I'm not sure if what it reports for the process does the same thing.

A much better thing to do would be to run the app with valgrind. That would be a more definitive way to detect a memory leak. I can try it as well.

fpagliughi commented 2 years ago

But I will say, that if anything in the library is leaking memory, it would probably be the Properties!

Carter12s commented 2 years ago

Confirmed leak with valgrind:

$ cargo valgrind run --bin paho_bug
warning: /home/carter/open_source/paho.mqtt.rust/paho-mqtt-sys/Cargo.toml: unused manifest key: package.package
   Compiling paho_bug v0.1.0 (/home/carter/open_source/paho_bug)
warning: unused import: `properties`
 --> paho_bug/src/main.rs:1:17
  |
1 | use paho_mqtt::{properties, Properties};
  |                 ^^^^^^^^^^
  |
  = note: `#[warn(unused_imports)]` on by default

warning: unused variable: `i`
  --> paho_bug/src/main.rs:11:9
   |
11 |     for i in 0..1_000 {
   |         ^ help: if this is intentional, prefix it with an underscore: `_i`
   |
   = note: `#[warn(unused_variables)]` on by default

warning: unused variable: `e`
  --> paho_bug/src/main.rs:28:17
   |
28 |             Err(e) => return,
   |                 ^ help: if this is intentional, prefix it with an underscore: `_e`

warning: `paho_bug` (bin "paho_bug") generated 3 warnings
    Finished dev [unoptimized + debuginfo] target(s) in 0.56s
     Running `/home/carter/.cargo/bin/cargo-valgrind target/debug/paho_bug`
Done
       Error leaked 1000 B in 1000 blocks
        Info at malloc
             at alloc::alloc::alloc (alloc.rs:87)
             at alloc::alloc::Global::alloc_impl (status.rs:240)
             at <alloc::alloc::Global as core::alloc::Allocator>::allocate (alloc.rs:229)
             at alloc::alloc::exchange_malloc (alloc.rs:318)
             at paho_bug::surprise_mem_increase (status.rs:240)
             at paho_bug::main (main.rs:14)
             at core::ops::function::FnOnce::call_once (function.rs:227)
             at std::sys_common::backtrace::__rust_begin_short_backtrace (backtrace.rs:122)
             at std::rt::lang_start::{{closure}} (rt.rs:145)
             at call_once<(), (dyn core::ops::function::Fn<(), Output=i32> + core::marker::Sync + core::panic::unwind_safe::RefUnwindSafe)> (function.rs:259)
             at do_call<&(dyn core::ops::function::Fn<(), Output=i32> + core::marker::Sync + core::panic::unwind_safe::RefUnwindSafe), i32> (panicking.rs:492)
             at try<i32, &(dyn core::ops::function::Fn<(), Output=i32> + core::marker::Sync + core::panic::unwind_safe::RefUnwindSafe)> (panicking.rs:456)
             at catch_unwind<&(dyn core::ops::function::Fn<(), Output=i32> + core::marker::Sync + core::panic::unwind_safe::RefUnwindSafe), i32> (panic.rs:137)
             at {closure#2} (rt.rs:128)
             at do_call<std::rt::lang_start_internal::{closure_env#2}, isize> (panicking.rs:492)
             at try<isize, std::rt::lang_start_internal::{closure_env#2}> (panicking.rs:456)
             at catch_unwind<std::rt::lang_start_internal::{closure_env#2}, isize> (panic.rs:137)
             at std::rt::lang_start_internal (rt.rs:128)
             at std::rt::lang_start (status.rs:241)
     Summary Leaked 1000 B total

Changed the above example script to run 1000 times, leaked exactly 1000 bytes.

Carter12s commented 2 years ago

Re-running with the mem::forget line in propertis.rs:push commented out valgrind reports no memory leak:

open_source (master) $ cargo valgrind run --bin paho_bug
warning: /home/carter/open_source/paho.mqtt.rust/paho-mqtt-sys/Cargo.toml: unused manifest key: package.package
   Compiling paho-mqtt v0.10.0 (/home/carter/open_source/paho.mqtt.rust)
   Compiling paho_bug v0.1.0 (/home/carter/open_source/paho_bug)
warning: unused import: `properties`
 --> paho_bug/src/main.rs:1:17
  |
1 | use paho_mqtt::{properties, Properties};
  |                 ^^^^^^^^^^
  |
  = note: `#[warn(unused_imports)]` on by default

warning: unused variable: `i`
  --> paho_bug/src/main.rs:11:9
   |
11 |     for i in 0..1_000 {
   |         ^ help: if this is intentional, prefix it with an underscore: `_i`
   |
   = note: `#[warn(unused_variables)]` on by default

warning: unused variable: `e`
  --> paho_bug/src/main.rs:28:17
   |
28 |             Err(e) => return,
   |                 ^ help: if this is intentional, prefix it with an underscore: `_e`

warning: `paho_bug` (bin "paho_bug") generated 3 warnings
    Finished dev [unoptimized + debuginfo] target(s) in 0.88s
     Running `/home/carter/.cargo/bin/cargo-valgrind target/debug/paho_bug`
Done

Again I'm not sure if this has other negative effects, but that is definitely the memory that isn't being cleaned up by the C lib.

fpagliughi commented 2 years ago

OK. Nice. Thanks. I'll mark it as a bug, and get it fixed, assuming it's in the Rust lib and not the C one. If that's the case, I'll push the bug report upstream. I know there's active work on the C lib right now, and the timing would be good (if it hasn't been fixed already).

fpagliughi commented 2 years ago

As for commenting out the mem::forget(), my initial thought is that's probably not the correct solution. Just like you are guessing, I assume I put it there because the C lib is supposed to clean up the memory.

fpagliughi commented 2 years ago

Hmm... I think I see it....

You're completely correct on both parts:

There are two possible problems with the Rust wrapper implementation:

I think the proper fix is two-fold:

  1. Add a public allocation/copy function to the Paho C library to get the memory from C. The datadup() looks good, but needs to be renamed and made public (not static).
  2. Only forget the heap memory, perhaps by nulling them out in Properties::push(), and making sure Property::drop() ignores NULL pointers.
fpagliughi commented 2 years ago

OK. The Rust CString documentation specifically says not to transfer ownership of it's raw memory to C code to free. It will result in memory leaks. So I just got lucky when I originally wrote and tested it.

The best thing would be to add some helper code to the C lib.

fpagliughi commented 2 years ago

Sorry. I'm going around in circles. Apparently the C lib is not taking ownership of the heap memory. It's making a copy using datadup().

In that case, yes; just getting rid of that mem::forget() would fix the leak.

Carter12s commented 2 years ago

Thank you for diving in, and the super quick response! Honestly, neither I nor @CodeTako know much about C <-> Rust interfacing, but we've been heavily using this lib and deeply appreciate the help!

Would love to help move this issue along? Submit a PR with a test and removal of mem:forget()?

fpagliughi commented 2 years ago

No worries. Thanks to you both. This seems trivial enough to get a quick fix in, and it may be worth pushing a point release sooner than later. I'm partially through fixing some build issues, but it may take some time to get that out. And I'm just about to release some v5 production code and I don't want a memory leak in it!

The whole properties module was an experiment to see if it would be better to manage C memory directly from Rust code. I believe the answer is, "no". The other Rust structs, like connect_options, ssl_options use a pinned cache of heap data consisting of CString values and things like that, with the C structs pointing into that memory. The CString values themselves take care of the memory management, without need for manual drop() and forget() code. But that creates self-referential data structs, and care must be made for moving values. Overall though, doing it that way seems cleaner. I should probably switch it over.

CodeTako commented 2 years ago

Thanks for the responsiveness! The memory interfacing definitely sounds tricky to handle.