Geal / rust-syslog

Send syslog messages from Rust
MIT License
109 stars 55 forks source link

syslog is not prepared to deal with ENOBUFS #79

Open he32 opened 9 months ago

he32 commented 9 months ago

On NetBSD, you can get ENOBUFS if you write "too quickly" to /dev/log (unix domain socket). However, the rust syslog code has no provision for handling that error or for that matter, the need to re-connect (as already noted in issue #21). Looking at the NetBSD syslog(3) implementation, I find code which looks like this (in C, of course):

        /*
         * If the send() failed, there are two likely scenarios:
         *  1) syslogd was restarted
         *  2) /dev/log is out of socket buffer space
         * We attempt to reconnect to /dev/log to take care of
         * case #1 and keep send()ing data to cover case #2
         * to give syslogd a chance to empty its socket buffer.
         */
        for (tries = 0; tries < MAXTRIES; tries++) {
                if (send(data->log_file, tbuf, cnt, 0) != -1)
                        break;
                if (errno != ENOBUFS) {
                        disconnectlog_r(data);
                        connectlog_r(data);
                } else
                        (void)usleep(1);
        }

But there is no corresponding code in this implementation. This causes the ENOBUFS error to be "passed upwards" to the user, which in my case is routinator from NLnetLabs (available at https://github.com/NLnetLabs/routinator), and ... I have experienced that this software will exit "mysteriously" with "Logging to syslog failed: Format. Exiting." printed to stderr, due to the missing handling of ENOBUFS (I have the syscall trace to prove that). Let me suggest that some improvement of the robustness along the lines above would be beneficial. My ability to suggest how this should be done in rust is sadly not there...

he32 commented 9 months ago

Here is an untested un-compiled sketch from a rust novice:

@@ -2,6 +2,7 @@ use std::collections::HashMap;
 use std::fmt::Display;
 use std::io::Write;
 use time;
+use libc::{usleep, ENOBUFS};

 use errors::*;
 use facility::Facility;
@@ -66,6 +67,33 @@ pub struct Formatter3164 {
     pub pid: u32,
 }

+let MAXTRIES = 10;
+
+// Re-try on ENOBUFS error, re-connect & re-try on other errors
+// Ignores errors & drops message if the retry count is exceeded.
+macro_rules! do_write {
+    ($dst:expr, $($arg:tt)*) => {
+        for tries in 0 .. MAXTRIES {
+            match write!($dst, $($arg)*) {
+                Err(e) => match e.raw_os_error() {
+                    libc::ENOBUFS => {
+                        usleep(1); // Wait a bit/re-schedule & re-try
+                    }
+                    _ => {
+                        // Re-connect to backend (unix socket, tcp port, etc.)
+                        // syslogd could be re-started etc.
+                    }
+                }
+                Ok(r) => return Ok(r)
+            }
+        }
+        // The C version has code to print to the console if we exhaust
+        // the number of retries.  Not sure we want to do that here...
+        // Instead, we drop the message, and return success with 0 bytes written.
+        return Ok(0);
+    }
+}
+
 impl<T: Display> LogFormat<T> for Formatter3164 {
     fn format<W: Write>(&self, w: &mut W, severity: Severity, message: T) -> Result<()> {
         let format =
@@ -73,7 +101,7 @@ impl<T: Display> LogFormat<T> for Format
                 .unwrap();

         if let Some(ref hostname) = self.hostname {
-            write!(
+            do_write!(
                 w,
                 "<{}>{} {} {}[{}]: {}",
                 encode_priority(severity, self.facility),
@@ -87,7 +115,7 @@ impl<T: Display> LogFormat<T> for Format
             )
             .chain_err(|| ErrorKind::Format)
         } else {
-            write!(
+            do_write!(
                 w,
                 "<{}>{} {}[{}]: {}",
                 encode_priority(severity, self.facility),

I could not immediately figure out what the "re-connect" code should look like, how to get at the parameters that were given in the initial initialization invocation...