kasmtech / workspaces-issues

[Bug] - Containers not starting after system update #385

Open mciantar opened 1 year ago

mciantar commented 1 year ago

Describe the bug After running apt upgrade and rebooting the system, containers won't start and fail with this error:

An Unexpected Error occurred creating the Kasm. Please contact an Administrator

My understanding is that for some reason, kasm_guac is not starting. I can see the following logs:

kasm_guac     |
kasm_guac     | > GClient@1.1.2 start
kasm_guac     | > NODE_ENV=production node app.js
kasm_guac     |
kasm_guac     | /gclient/logger.js:61
kasm_guac     |   errorEventName: config.kasmguac.logging.errorEventName,
kasm_guac     |                          ^
kasm_guac     |
kasm_guac     | TypeError: Cannot read properties of undefined (reading 'kasmguac')
kasm_guac     |     at Object.<anonymous> (/gclient/logger.js:61:26)
kasm_guac     |     at Module._compile (node:internal/modules/cjs/loader:1191:14)
kasm_guac     |     at Object.Module._extensions..js (node:internal/modules/cjs/loader:1245:10)
kasm_guac     |     at Module.load (node:internal/modules/cjs/loader:1069:32)
kasm_guac     |     at Function.Module._load (node:internal/modules/cjs/loader:904:12)
kasm_guac     |     at Module.require (node:internal/modules/cjs/loader:1093:19)
kasm_guac     |     at require (node:internal/modules/cjs/helpers:108:18)
kasm_guac     |     at Object.<anonymous> (/gclient/app.js:12:16)
kasm_guac     |     at Module._compile (node:internal/modules/cjs/loader:1191:14)
kasm_guac     |     at Object.Module._extensions..js (node:internal/modules/cjs/loader:1245:10)
kasm_guac exited with code 1
2023-05-14 22:20:23,978 [DEBUG] client_api_server: Requesting Hello for Server(fd5432bd-c5b0-4815-aea2-95168cce25cc) via URL: (https://proxy:443/agent/api/v1/create_container/)
2023-05-14 22:20:36,253 [ERROR] client_api_server: Error during Create request for Server(fd5432bd-c5b0-4815-aea2-95168cce25cc) : (Exception creating Kasm: Traceback (most recent call last):
  File "__init__.py", line 538, in post
  File "provision.py", line 1199, in provision
  File "provision.py", line 1351, in generate_nginx_config
Exception: Nginx failed to reload after generating config for container (67a668aa3636f5e07ad473624f5f644f64f2e1f5dea070ce264ee370b35639bd)
)
2023-05-14 22:20:36,254 [DEBUG] client_api_server: Function (provider_manager.create_kasm_from_slot) executed in (12.282997131347656) seconds
2023-05-14 22:20:36,254 [DEBUG] client_api_server: Function (provider_manager.get_container) executed in (12.308872938156128) seconds
2023-05-14 22:20:36,255 [ERROR] client_api_server: An Unexpected Error occurred creating the Kasm. Please contact an Administrator : Error during Create request for Server(fd5432bd-c5b0-4815-aea2-95168cce25cc) : (Exception creating Kasm: Traceback (most recent call last):
  File "__init__.py", line 538, in post
  File "provision.py", line 1199, in provision
  File "provision.py", line 1351, in generate_nginx_config
Exception: Nginx failed to reload after generating config for container (67a668aa3636f5e07ad473624f5f644f64f2e1f5dea070ce264ee370b35639bd)
)
2023-05-14 22:20:36,256 [DEBUG] client_api_server: Function (client_api.request_kasm) executed in (12.330260515213013) seconds

To Reproduce In my case, the only changes were creating a new workspace for vs-code, running apt update/upgrade, and rebooting.

Expected behavior Workspaces should start

Workspaces Version Version 1.13

Workspaces Installation Method Single Server

JackyF737 commented 1 year ago

same problem here

silmarine commented 2 months ago

I am also having this issue on Kasm 1.15.0 installed on Debian 12. It started right after I upgraded the host packages, but restoring to a point before the upgrade doesn't fix it. Kasm is currently unusable for me.

mmcclaskey commented 2 months ago

Based on @mciantar's logs, the kasm_guac container is not starting. Because of that, nginx in kasm_proxy can't start either, since it can't resolve the container's hostname. The logs also suggest that the kasm_guac config file may be corrupt or missing. Can you please provide the contents of this file?

/opt/kasm/current/conf/app/kasmguac.app.config.yaml

silmarine commented 2 months ago

For me it's empty. I used the cat command and it had no output at all.
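
For anyone else checking the same thing, here is a minimal Python sketch that reports whether the config file is missing, empty, or has content. The path is the one quoted above; the function name `config_state` is just an illustrative choice, and the script does not validate the YAML itself. Reading files under /opt/kasm may require root.

```python
from pathlib import Path

# Path quoted earlier in this thread.
CONFIG_PATH = Path("/opt/kasm/current/conf/app/kasmguac.app.config.yaml")


def config_state(path: Path) -> str:
    """Return 'missing', 'empty', or 'present' for the given file."""
    if not path.is_file():
        return "missing"
    # A file holding only whitespace is as unusable as a zero-byte one.
    if not path.read_text(encoding="utf-8", errors="replace").strip():
        return "empty"
    return "present"


if __name__ == "__main__":
    print(f"{CONFIG_PATH}: {config_state(CONFIG_PATH)}")
```

An "empty" result matches the symptom described here: `cat` prints nothing and kasm_guac fails on startup.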

mmcclaskey commented 2 months ago

We have a KB article that covers getting things restored if this happens to you. The update and reboot were likely not the cause. The most likely cause, though not the only possible one, is that the system disk filled up at some point while Kasm was running. Kasm continually writes to this file, and if the disk is full the write fails and you end up with a blank file. Kasm continues to work normally because everything in that file is held in memory, but on a reboot it attempts to read the values from the file, and that's when it fails. So the reboot did not cause the issue; it merely exposed a problem that was already there.

https://kasmweb.atlassian.net/servicedesk/customer/portal/3/article/8126468
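
Since a full disk is named above as the most likely trigger, a quick way to see how much room is left on the system disk is sketched below in Python. This is only an illustration: the function name `disk_report` and the 5% threshold are arbitrary assumptions, not values from Kasm, and the check shows the current state only, so it cannot tell you whether the disk was full at some point in the past.

```python
import shutil


def disk_report(path="/", min_free_fraction=0.05):
    """Return (free_fraction, is_low) for the filesystem containing `path`.

    min_free_fraction is an arbitrary illustrative threshold, not a Kasm value.
    """
    usage = shutil.disk_usage(path)  # named tuple: total, used, free (bytes)
    free_fraction = usage.free / usage.total
    return free_fraction, free_fraction < min_free_fraction


if __name__ == "__main__":
    fraction, is_low = disk_report("/")
    print(f"Free space on /: {fraction:.1%}")
    if is_low:
        print("Disk is nearly full; writes to config files may fail.")
```

Running something like this periodically, or from existing monitoring, gives a warning before the disk fills, instead of discovering the blank file on the next reboot.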