sagemathinc / cocalc-docker

DEPRECATED (was -- Docker setup for running CoCalc as downloadable software on your own computer)
https://cocalc.com

local_hub starts multiple times and breaks project response #106

Closed HighwayStar closed 2 years ago

HighwayStar commented 3 years ago

We run a cocalc-docker instance and hit a strange issue on some projects: the local_hub process gets started many times and the project becomes inaccessible until it is restarted (every notebook just shows Loading...). After a project restart, the same thing happens again within a couple of minutes.

Here is the forever list output from a terminal in one such project:

~$ forever list
info:    Forever processes running
data:        uid  command script                                   forever pid   id logfile                                                                     uptime        
data:    [0] T1VV coffee  /cocalc/src/smc-project/local_hub.coffee 11046   11068    /projects/28943c77-d1bc-405c-9e40-cf2b82b9a645/.smc/local_hub/local_hub.log 0:0:20:4.871  
data:    [1] n8Kj coffee  /cocalc/src/smc-project/local_hub.coffee 11142   11153    /projects/28943c77-d1bc-405c-9e40-cf2b82b9a645/.smc/local_hub/local_hub.log 0:0:19:39.155 
data:    [2] H0AI coffee  /cocalc/src/smc-project/local_hub.coffee 11196   11207    /projects/28943c77-d1bc-405c-9e40-cf2b82b9a645/.smc/local_hub/local_hub.log 0:0:19:8.101  
data:    [3] 1UHy coffee  /cocalc/src/smc-project/local_hub.coffee 11265   11276    /projects/28943c77-d1bc-405c-9e40-cf2b82b9a645/.smc/local_hub/local_hub.log 0:0:18:38.428

It shows that local_hub was started 4 times.

HighwayStar commented 3 years ago

I've added some debug output here https://github.com/sagemathinc/cocalc/blob/master/src/smc-project/bin/smc-local-hub#L24

It looks like smc-local-hub is called multiple times, and the forever command does not check whether the process is already running.

williamstein commented 3 years ago

Can you try to fix this in your environment? You can directly edit

/usr/bin/smc-local-hub

as root (sudo docker exec -it cocalc bash). One idea might be to check if there is a very recent pid file in ~/.smc/local_hub, and if so exit (thus letting the previous attempt finish starting up properly).
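A minimal sketch of the "very recent pid file" idea, not the actual CoCalc code. The helper name `pidfile_is_recent` and the 30-second threshold are assumptions made for illustration; the thread does not say what should count as "very recent".

```python
import os
import time

# Hypothetical threshold: the thread does not define "very recent".
RECENT_SECONDS = 30


def pidfile_is_recent(pidfile, max_age=RECENT_SECONDS, now=None):
    """Return True if pidfile exists and was modified less than max_age seconds ago."""
    try:
        mtime = os.path.getmtime(pidfile)
    except OSError:
        # Missing or unreadable pid file: nothing is known to be starting up.
        return False
    if now is None:
        now = time.time()
    return (now - mtime) < max_age
```

In a start script this could be used as `if pidfile_is_recent(pidfile): sys.exit(0)`, so that a second start request arriving while the first is still coming up exits quietly, while an old pid file does not block startup.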

HighwayStar commented 3 years ago

I've added a local workaround like this:

diff --git a/src/smc-project/bin/smc-local-hub b/src/smc-project/bin/smc-local-hub
index cd16f3950..2c38792fa 100755
--- a/src/smc-project/bin/smc-local-hub
+++ b/src/smc-project/bin/smc-local-hub
@@ -5,6 +5,10 @@ import os, sys
 if not 'SMC' in os.environ:
     os.environ['SMC'] = os.path.join(os.environ['HOME'], '.smc')

+pidfile = os.path.join(os.environ['SMC'], 'local_hub', 'local_hub.pid' )
+if sys.argv[1] == 'start' and os.path.isfile(pidfile):
+    sys.exit(0)
+
 data = os.path.join(os.environ['SMC'], 'local_hub')
 if not os.path.exists(data):
     os.makedirs(data)

Now testing whether there are any side effects.

williamstein commented 3 years ago

Thanks! PR welcome. I would check the timestamp on the pid file in addition to just checking existence, or better, check that there is a process with the claimed pid.

What you wrote above might (?) in some edge case make the project unstartable, since maybe it crashed before and left a pid file around.
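One way to avoid the stale pid file case is to check that the pid in the file belongs to a live process. The following is only a sketch under assumptions: the helper names `pid_is_running` and `live_pid_from_file` are hypothetical, the pid file is assumed to contain a single integer, and the check is POSIX-only (it relies on signal 0).

```python
import errno
import os


def pid_is_running(pid):
    """Return True if a process with this pid currently exists (POSIX only)."""
    if pid <= 0:
        return False
    try:
        # Signal 0 performs the existence/permission check without sending a signal.
        os.kill(pid, 0)
    except OSError as err:
        # EPERM means the process exists but belongs to another user.
        return err.errno == errno.EPERM
    return True


def live_pid_from_file(pidfile):
    """Return the pid stored in pidfile if that process is alive, else None."""
    try:
        with open(pidfile) as f:
            pid = int(f.read().strip())
    except (OSError, ValueError):
        # Missing, unreadable, or malformed pid file: treat it as stale.
        return None
    return pid if pid_is_running(pid) else None
```

With this, the guard would become `if sys.argv[1] == 'start' and live_pid_from_file(pidfile) is not None: sys.exit(0)`. A pid left behind by a crash then no longer blocks startup, although pid reuse by an unrelated process is still possible, which is where an additional timestamp check could help.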

williamstein commented 2 years ago

The latest version of cocalc completely deletes all forever and start-stop-daemon code.