lucab / libsystemd-rs

A pure-Rust client library to work with systemd
https://crates.io/crates/libsystemd
Other
105 stars 19 forks source link

Runaway filedescriptors #141

Closed issackelly closed 1 year ago

issackelly commented 1 year ago

I have the following patch I just applied to switch to systemd crate. I was indifferent to the two until I noticed that my file descriptor count was going up on every tick here. I'm sure I have done something foolish in my implementation since nobody else has reported it but I also wanted to share in case it was useful to you or someone else. Happy to provide any further details if it's useful.

I did like that your crate expressed the enums and didn't require libsystemd, it made my deployment a little less complicated and the code easier to read. Thank you for your work here!

@@ -21,10 +21,10 @@ use axum_prometheus::{
-use libsystemd::daemon::{notify, NotifyState};
+use systemd::daemon::{notify, STATE_WATCHDOG, STATE_WATCHDOG_USEC};

+    let wd_usec_after_boot: String = format!("{}", wd_usec_after_boot);
     let wd_tick = Duration::from_secs((args.watchdog_timeout_sec / 2).try_into().unwrap());
     tasks.spawn({
-        notify(false, &[NotifyState::WatchdogUsec(wd_usec_after_boot)]).unwrap();
+        notify(
+            false,
+            [(STATE_WATCHDOG_USEC, wd_usec_after_boot.as_str())].iter(),
+        )
+        .expect("Notify Failed To Set Timer");
         let mut wd_interval = tokio::time::interval(wd_tick);
         tracing::debug!("Systemd Watchdog Configured");

@@ -454,7 +458,7 @@ pub fn app(
                 tokio::select! {
                     _ = wd_interval.tick() => {
                         tracing::debug!("Systemd Watchdog Reset");
-                        notify(false, &[NotifyState::Watchdog]).unwrap();
+                        notify(false, [(STATE_WATCHDOG, "1")].iter()).unwrap();
                     },
                 };
lucab commented 1 year ago

Thanks for the report. Yes we are indeed leaking this FD: https://github.com/lucab/libsystemd-rs/blob/7617bf72840c04d2805f3391aac09e5ca305fda9/src/daemon.rs#L101

The proper fix would be to enhance the nix crate so that the socket() method directly returns an OwnedFd instead of a plain RawFd.

A quicker fix in this crate instead would be to manually call close() on the FD before returning (this will fix your case, but may still leak a descriptor on errors), or to immediately turn the socket FD into a OwnedFd so that it is automatically closed in all cases (better, but need to bump MSRV).

If you want to provide some patch to tackle this, I'll be happy to review.

swsnr commented 1 year ago

Why do we even use socket here? As far as I can see this line creates a regular datagram socket; couldn't we use UnixDatagram from std here, and then only convert to raw FD when required?

swsnr commented 1 year ago

In any case I think we should generally make use of BorrowedFd whereever we're currently using raw FDs, but that'd be an MSRV bump to 1.63 as far as I believe, and that's quite a jump from our current MSRV. But then again 1.56 is about two years old now, and even 1.63 is more than six months ago.

I'd be in favour of this bump and the corresponding change; if you agree @lucab I'd make a corresponding PR.

lucab commented 1 year ago

@swsnr I think I switched some other parts of this function to use nix and also got rid of UnixDatagram at the same time, but in retrospect the latter was clearly a bad decision. I'm fine with both bumping the MSRV and with re-introducing std type (assuming that works with the rest of nix methods).

issackelly commented 1 year ago

Thank you for the quick fix!