mbround18 / valheim-docker

Valheim Docker powered by Odin. The Valheim dedicated gameserver manager which is designed with resiliency in mind by providing automatic updates, world backup support, and a user friendly cli interface.
https://hub.docker.com/r/mbround18/valheim
BSD 3-Clause "New" or "Revised" License
762 stars 81 forks source link

[Bug] shutdown.rs blocks infinitely if process can't be stopped #337

Open anonymus19941 opened 3 years ago

anonymus19941 commented 3 years ago

I accidentially ran the valheim server as root and didn't realize it. The auto update script, run by cron as user steam, tries to stop the server with odin stop. This calls the function blocking_shutdown in shutdown.rs (I think). This function tries to send an interrupt signal to the server process and then blocks until it doesn't exist anymore. In my case, odin run as steam user couldn't stop the server run as root, so it blocked infinitely and created new processes each time the cron job ran.

This shouldn't happen normaly, as the server usually doesn't run as root, but it's still not really nice and should be fixed, in my opinion.

mbround18 commented 3 years ago

Ohh good find! I will take a look into this, Ill see about adding a timeout or looking to see if I can get the processes user to throw an error situation on.

mbround18 commented 3 years ago

Alright, I think I understand how I want to tackle this, from an acceptance criteria:

mbround18 commented 3 years ago

I have a temp fix for this I will publish image

mbround18 commented 3 years ago

The error should be available now :) been a bit busy with work/school to tackle the underlying issue and will leave this ticket open but it will halt execution if you are running as root.

anonymus19941 commented 3 years ago

Kinda forgot about this ... Thanks for the fix. This is definitively helpful.

To the underlying issue: I'm not sure if this is helpful, because I don't know how you would do this, but I tried to fix it (or at least I have started to) according to your message (I don't really know which process you'd want to return, so I left it empty for now). Feel free to use or extend or discard it (I didn't want to create a PR, so here's a patch instead).

Index: src/odin/server/update.rs
IDEA additional info:
Subsystem: com.intellij.openapi.diff.impl.patch.CharsetEP
<+>UTF-8
===================================================================
diff --git a/src/odin/server/update.rs b/src/odin/server/update.rs
--- a/src/odin/server/update.rs (revision 22ac603b10ef3da8dd025c8d85279cbcd90ff061)
+++ b/src/odin/server/update.rs (date 1621729044090)
@@ -65,7 +65,10 @@
   // Shutdown the server if it's running
   let server_was_running = server::is_running();
   if server_was_running {
-    server::blocking_shutdown();
+    if let Err(e) = server::blocking_shutdown() {
+      error!("Failed to stop server: {}", e);
+      exit(1);
+    }
   }

   // Update the installation
Index: src/odin/server/shutdown.rs
IDEA additional info:
Subsystem: com.intellij.openapi.diff.impl.patch.CharsetEP
<+>UTF-8
===================================================================
diff --git a/src/odin/server/shutdown.rs b/src/odin/server/shutdown.rs
--- a/src/odin/server/shutdown.rs   (revision 22ac603b10ef3da8dd025c8d85279cbcd90ff061)
+++ b/src/odin/server/shutdown.rs   (date 1621729085757)
@@ -4,10 +4,12 @@
 use std::{thread, time::Duration};

 use crate::constants;
+use crate::errors::TimeoutError;

-pub fn blocking_shutdown() {
+pub fn blocking_shutdown() -> Result<(), TimeoutError> {
   send_shutdown_signal();
-  wait_for_exit();
+
+  wait_for_exit()
 }

 pub fn send_shutdown_signal() {
@@ -32,9 +34,10 @@
   }
 }

-fn wait_for_exit() {
+fn wait_for_exit() -> Result<(), TimeoutError> {
   info!("Waiting for server to completely shutdown...");
   let mut system = System::new();
+  let mut waiting_for_seconds = 0;
   loop {
     system.refresh_all();
     let processes = system.get_process_by_name(constants::VALHEIM_EXECUTABLE_NAME);
@@ -43,7 +46,14 @@
     } else {
       // Delay to keep down CPU usage
       thread::sleep(Duration::from_secs(1));
+
+      waiting_for_seconds += 1;
+      if waiting_for_seconds > 300 {
+        return Err(TimeoutError {});
+      }
     }
   }
-  info!("Server has been shutdown successfully!")
+  info!("Server has been shutdown successfully!");
+
+  Ok(())
 }
Index: src/odin/commands/stop.rs
IDEA additional info:
Subsystem: com.intellij.openapi.diff.impl.patch.CharsetEP
<+>UTF-8
===================================================================
diff --git a/src/odin/commands/stop.rs b/src/odin/commands/stop.rs
--- a/src/odin/commands/stop.rs (revision 22ac603b10ef3da8dd025c8d85279cbcd90ff061)
+++ b/src/odin/commands/stop.rs (date 1621729044103)
@@ -15,6 +15,9 @@
       error!("Failed to find server executable!");
       exit(1);
     }
-    server::blocking_shutdown();
+    if let Err(e) = server::blocking_shutdown() {
+      error!("Failed to stop server: {}", e);
+      exit(1);
+    }
   }
 }
Index: src/odin/errors/mod.rs
IDEA additional info:
Subsystem: com.intellij.openapi.diff.impl.patch.CharsetEP
<+>UTF-8
===================================================================
diff --git a/src/odin/errors/mod.rs b/src/odin/errors/mod.rs
--- a/src/odin/errors/mod.rs    (revision 22ac603b10ef3da8dd025c8d85279cbcd90ff061)
+++ b/src/odin/errors/mod.rs    (date 1621729044109)
@@ -13,3 +13,14 @@
     write!(f, "VariantNotFound: {}", &self.v)
   }
 }
+
+#[derive(Debug)]
+pub struct TimeoutError {}
+
+impl error::Error for TimeoutError {}
+
+impl Display for TimeoutError {
+  fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
+    write!(f, "Timeout")
+  }
+}
mbround18 commented 3 years ago

Hey @anonymus19941 thanks for your work on this, I am looking into some shutdown work atm and trying to find a good way to handle it. My goal is to survey the running programs -> pass to a function to process them and check permissions -> if has permissions then shut down the process and ensure its shutdown with a timeout trap similar to how you have outlined :)

fclante commented 7 months ago

Is this still an active issue?

mbround18 commented 5 months ago

@fclante not really since its forced to run as steam in the dockerfile additionally there have been many improvements since it was an issue. I think its safe to close but i probably should add a root check to the binary just in case its ran elsewhere.