Open ericfranz opened 5 years ago
This is also a problem client side: https://github.com/OSC/ood-shell/blob/2a1af635dbcfb64221a013f4500c0d1d6cd9301e/public/javascripts/ood_shell.1.js#L15
Instead, when the websocket is closed, a warning message should appear above the shell app - not printed to the terminal - that says "the websocket connection failed, trying to reconnect"
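A minimal sketch of what that client-side behaviour could look like. This is not the existing ood-shell code: `connectWithRetry`, `reconnectDelay`, the `createSocket` factory and the `showBanner` callback are all hypothetical names, and the backoff values are only illustrative.

```javascript
// Exponential backoff with a cap; base and cap values are illustrative assumptions.
function reconnectDelay(attempt, baseMs = 500, maxMs = 10000) {
  return Math.min(maxMs, baseMs * 2 ** attempt);
}

// createSocket: hypothetical factory returning a WebSocket-like object.
// showBanner:   hypothetical callback; a string shows the warning above the
//               terminal, null hides it.
// schedule:     timer function, injectable so the logic can be tested.
function connectWithRetry(createSocket, showBanner, schedule = setTimeout) {
  let attempt = 0;

  function open() {
    const ws = createSocket();

    ws.onopen = () => {
      attempt = 0;       // a successful connection resets the backoff
      showBanner(null);  // hide the warning
    };

    ws.onclose = () => {
      // Warn outside the terminal instead of writing into it.
      showBanner('The websocket connection failed, trying to reconnect');
      schedule(open, reconnectDelay(attempt++));
    };
  }

  open();
}
```

In the browser, `createSocket` would presumably be something like `() => new WebSocket(url)`. Reconnecting only helps once the server can hand the same terminal session back to the new connection.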
And for the shell app only we could consider setting passenger_min_instances to 1:
Please note that this option does not pre-start application processes during Nginx startup. It just makes sure that when the application is first accessed:
- at least the given number of processes will be spawned.
- the given number of processes will be kept around even when processes are being idle cleaned.
Would this ensure that, for any user running a shell app, the PUN and the shell app stick around even after the user goes away from the app? Forever?
We might want to consider when we would want to ensure shell apps actually get shut down.
Reviewed: this is a good idea, but it needs to be looked at again and updated.
Goal: If you temporarily lose internet connection, close your laptop to walk to a meeting, or switch wifi network, the shell app browser client will attempt to reconnect to the server, instead of your work being lost. https://trello.com/c/wUXyxqFb/60-shell-reconnect
There are several possible issues that can result in the ssh session dying after a minute or two disconnected from the network.
The problem with the app as it is written is that the ssh session https://github.com/OSC/ood-shell/blob/2a1af635dbcfb64221a013f4500c0d1d6cd9301e/app.js#L67-L73 is coupled with the ws connection object https://github.com/OSC/ood-shell/blob/2a1af635dbcfb64221a013f4500c0d1d6cd9301e/app.js#L47 or "client". So when a connection is closed due to temporary loss of network access (close laptop, switch networks, etc.) the terminal session is also closed https://github.com/OSC/ood-shell/blob/2a1af635dbcfb64221a013f4500c0d1d6cd9301e/app.js#L97-L100
Instead, terminal sessions should be stored in a hash-like object, keyed by a unique identifier that is shared with the client, so that the client can use that identifier when it tries to initiate a new ws connection. Then the new ws connection would have three possibilities:
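As a rough illustration of the hash-like store described above (not the actual ood-shell implementation), the sketch below keeps terminal sessions in a `Map` keyed by a random id. `SessionStore`, `attach`, `remove` and the `createTerminal` factory are hypothetical names. It assumes an unknown or missing id simply results in a new session.

```javascript
const crypto = require('crypto');

// Hypothetical in-memory store: session id -> terminal session.
class SessionStore {
  // createTerminal: assumed factory that spawns the terminal (e.g. the ssh pty).
  constructor(createTerminal) {
    this.createTerminal = createTerminal;
    this.sessions = new Map();
  }

  // Return the existing session for `id`, or create a new one under a fresh id.
  // The caller sends the returned id to the client so it can reconnect later.
  attach(id) {
    if (id && this.sessions.has(id)) {
      return { id, term: this.sessions.get(id), resumed: true };
    }
    const newId = crypto.randomBytes(16).toString('hex');
    const term = this.createTerminal();
    this.sessions.set(newId, term);
    return { id: newId, term, resumed: false };
  }

  // Meant to be called when the terminal process exits,
  // not when the websocket closes.
  remove(id) {
    return this.sessions.delete(id);
  }
}
```

With this shape, closing the ws connection no longer has to kill the terminal. The session stays in the store until the terminal exits or some cleanup policy removes it, which is where the question of when shell apps should actually be shut down comes back in.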
With this work we still have the problem of the web server itself being killed by Passenger after the passenger_pool_idle_time is reached (5 minutes). Enabling sites to increase this would address the shell problem but introduce other problems, as it is an option that affects all apps.

In the PUN configs, it seems that we are using default values for these:

- passenger_max_pool_size - default is 6. I haven't found documentation on what happens when you try to launch an app past the passenger_max_pool_size - does it kill others or does it fail to launch? We should investigate this - maybe a separate issue?
- passenger_max_instances_per_app (is global and applies to all instances) - default is 0. It would seem we would want 1 for this; and if this is 1 then we don't need sticky sessions enabled above.