CulverLab / sparcd

https://wildcatresearch.arizona.edu/
GNU General Public License v3.0

Problem: Minio service is unresponsive after server reboot #70

Open smalusa opened 2 years ago

smalusa commented 2 years ago

Describe the bug: After the department's server update, MinIO did not reconnect.

To Reproduce: Occurs after the monthly server update at SNRE.

  1. Open the SPARCd login page and enter the login information
  2. Log in
  3. The error "Invalid URL, Username or Password" is produced
  4. This was alleviated once Julian SSH'd into the server
  5. Not sure whether it will recur with each server update


Chris-Schnaufer commented 1 year ago

The cause appears to be that a tmux session has to be created on the server and left running.

Paraphrased from a Slack conversation with @julianpistorius:

The workaround is to ssh into the server as the minio user, then create a tmux session, disconnect from the tmux session, and then log out. That automatically starts the containers and keeps them running.

Note that you have to be on campus or on the VPN to SSH in.
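
The workaround described above could look roughly like the following. This is only a sketch under assumptions: the hostname `sparcd.example.edu` and the session name `keepalive` are placeholders that do not appear in this thread, and `tmux new-session -d` is used to create the session already detached instead of attaching and detaching by hand. The script only prints the command so it can be reviewed before being run from campus or the VPN.

```shell
#!/bin/sh
# Sketch of the manual workaround. The hostname and session name are
# placeholders (assumptions); the thread does not state them.
SPARCD_HOST="${SPARCD_HOST:-sparcd.example.edu}"
TMUX_SESSION="${TMUX_SESSION:-keepalive}"

# Build the ssh command line. -t asks ssh for a terminal; the remote
# command creates a detached tmux session, so ssh returns right away
# and the tmux session should remain after logout.
keepalive_command() {
    printf "ssh -t minio@%s 'tmux new-session -d -s %s'\n" \
        "$SPARCD_HOST" "$TMUX_SESSION"
}

# Print the command for review instead of running it.
keepalive_command
```

Printing rather than executing keeps the sketch harmless on a machine that cannot reach the server; to use it, copy the printed line into a terminal.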

Chris-Schnaufer commented 1 year ago

@julianpistorius Is there a way to automate this?

julianpistorius commented 1 year ago

@Chris-Schnaufer I think we need to do this:

loginctl enable-linger minio

From: https://wiki.archlinux.org/title/Systemd/User#Automatic_start-up_of_systemd_user_instances
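
One way to confirm that lingering is actually enabled for the `minio` user is to ask logind. The following is a minimal sketch, not something taken from this thread: the helper names `is_linger_enabled` and `check_linger` are hypothetical, and it assumes `loginctl show-user` prints a `Linger=yes` or `Linger=no` line for the requested property.

```shell
#!/bin/sh
# Succeeds only if the given property line reports lingering as enabled.
is_linger_enabled() {
    [ "$1" = "Linger=yes" ]
}

# Query logind for a user's Linger property.
# Returns 0 if lingering is on, 1 if it is off, and 2 if loginctl itself
# fails (for example when the user is unknown, or is neither logged in
# nor lingering).
check_linger() {
    linger_line=$(loginctl show-user "$1" --property=Linger 2>/dev/null) || return 2
    is_linger_enabled "$linger_line"
}

# Example, to be run on the server:
#   check_linger minio && echo "linger is enabled for minio"
```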

julianpistorius commented 1 year ago

I've enabled this. We'll know whether it solves the problem if the services come back online after Thursday's maintenance.

julianpistorius commented 1 year ago

It seems to be up, but I'm not sure whether the server was actually restarted. @Chris-Schnaufer, can you log in and check the uptime?
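
Checking whether the server was restarted during the maintenance window could be done along these lines. This is a sketch under assumptions: it relies on `uptime -s` from procps-ng and on GNU `date -d`, the helper names are hypothetical, and the maintenance timestamp is left as a placeholder because the thread does not give the date.

```shell
#!/bin/sh
# Succeeds if the boot time (epoch seconds, $1) is at or after the start
# of the maintenance window (epoch seconds, $2), meaning the server was
# restarted during or after the maintenance.
rebooted_since() {
    [ "$1" -ge "$2" ]
}

# Boot time as epoch seconds.
# Assumes procps-ng `uptime -s` and GNU `date -d` are available.
boot_epoch() {
    date -d "$(uptime -s)" +%s
}

# Example, to be run on the server (the timestamp is a placeholder):
#   rebooted_since "$(boot_epoch)" "$(date -d 'YYYY-MM-DD HH:MM' +%s)" \
#       && echo "server was restarted" \
#       || echo "server was not restarted"
```

If the server was not restarted, the services being up says nothing yet about whether `enable-linger` fixed the problem.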