CodeTogether-Inc / CodeTogether-Live

Repository for issues, documentation and more for codetogether.com and associated VS Code extension and Eclipse plugins.

Read-only file system problems running codeTogether container (in kubernetes) #387

Closed nbeDPDHL closed 2 years ago

nbeDPDHL commented 2 years ago

Describe the bug The image hub.edge.codetogether.com/releases/codetogether:latest (the container reports its version as "CodeTogether v2022.2.0-01333") does not start. Instead it logs many errors that appear to be caused by /opt/codetogether being read-only.

In our case the container runs on Kubernetes, and we do not appear to mount anything special or read-only there (the deployment description is included below).

To Reproduce

Expected behavior The pod should start healthily.

Log

kubectl describe deployment codetogether
`Name:                   codetogether
Namespace:              sandsdp
CreationTimestamp:      Wed, 22 Jun 2022 12:47:23 +0000
Labels:                 app.kubernetes.io/instance=codetogether
                        app.kubernetes.io/managed-by=Helm
                        app.kubernetes.io/name=codetogether
                        app.kubernetes.io/version=2022.1.5
                        helm.sh/chart=codetogether-1.4.3
Annotations:            deployment.kubernetes.io/revision: 1
                        meta.helm.sh/release-name: codetogether
                        meta.helm.sh/release-namespace: sandsdp
Selector:               app.kubernetes.io/instance=codetogether,app.kubernetes.io/name=codetogether
Replicas:               1 desired | 1 updated | 1 total | 0 available | 1 unavailable
StrategyType:           RollingUpdate
MinReadySeconds:        0
RollingUpdateStrategy:  25% max unavailable, 25% max surge
Pod Template:
  Labels:           app.kubernetes.io/instance=codetogether
                    app.kubernetes.io/name=codetogether
  Service Account:  codetogether
  Containers:
   codetogether:
    Image:      aksacr9099.azurecr.io/sandpit/codetogether:latest
    Port:       1080/TCP
    Host Port:  0/TCP
    Limits:
      cpu:     2
      memory:  4Gi
    Requests:
      cpu:     2
      memory:  4Gi
    Liveness:   http-get http://:http/clients/ delay=60s timeout=15s period=60s #success=1 #failure=1
    Readiness:  http-get http://:http/clients/ delay=60s timeout=15s period=60s #success=1 #failure=1
    Environment:
      CT_SERVER_URL:          https://codetogether.local
      CT_TRUST_ALL_CERTS:     true
      CT_LOCATOR:             none
      CT_PROMETHEUS_ENABLED:  false
      CT_AV_ENABLED:          false
      CT_TIME_ZONE:           Europe/Berlin
      CT_LICENSEE:            <set to the key 'licensee' in secret 'codetogether-license'>         Optional: false
      CT_MAXCONNECTIONS:      <set to the key 'max_connections' in secret 'codetogether-license'>  Optional: false
      CT_EXPIRATION:          <set to the key 'expiration' in secret 'codetogether-license'>       Optional: false
      CT_SIGNATURE:           <set to the key 'signature' in secret 'codetogether-license'>        Optional: false
    Mounts:
      /var/cache/nginx from nginx-cache (rw)
      /var/log/codetogether from var-log-codetogether (rw)
      /var/log/nginx from var-log-nginx (rw)
      /var/run from pid (rw)
  Volumes:
   nginx-cache:
    Type:       EmptyDir (a temporary directory that shares a pod's lifetime)
    Medium:
    SizeLimit:  <unset>
   pid:
    Type:       EmptyDir (a temporary directory that shares a pod's lifetime)
    Medium:
    SizeLimit:  <unset>
   var-log-codetogether:
    Type:       EmptyDir (a temporary directory that shares a pod's lifetime)
    Medium:
    SizeLimit:  <unset>
   var-log-nginx:
    Type:       EmptyDir (a temporary directory that shares a pod's lifetime)
    Medium:
    SizeLimit:  <unset>
Conditions:
  Type           Status  Reason
  ----           ------  ------
  Progressing    True    NewReplicaSetAvailable
  Available      False   MinimumReplicasUnavailable
OldReplicaSets:  <none>
NewReplicaSet:   codetogether-5584856fd9 (1/1 replicas created)
Events:          <none>`

kubectl describe pod codetogether-5584856fd9-qp8r9
`Name:         codetogether-5584856fd9-qp8r9
Namespace:    sandsdp
Priority:     0
Node:         aks-sandsdp-37874616-vmss000000/10.156.146.133
Start Time:   Sat, 02 Jul 2022 07:18:21 +0000
Labels:       app.kubernetes.io/instance=codetogether
              app.kubernetes.io/name=codetogether
              pod-template-hash=5584856fd9
Annotations:  cni.projectcalico.org/containerID: 4718f8aecf97f3c984b154d29febbbcd3ba356bc68fe17924b48150800135a53
              cni.projectcalico.org/podIP: 100.64.6.79/32
              cni.projectcalico.org/podIPs: 100.64.6.79/32
Status:       Running
IP:           100.64.6.79
IPs:
  IP:           100.64.6.79
Controlled By:  ReplicaSet/codetogether-5584856fd9
Containers:
  codetogether:
    Container ID:   containerd://4363713ec0d251ae227436834d06e6030c7a25d4740b61a6e9072a31f9b83041
    Image:          aksacr9099.azurecr.io/sandpit/codetogether:latest
    Image ID:       aksacr9099.azurecr.io/sandpit/codetogether@sha256:ed70e5d0ebc8c940373e85de0d44272ddf72b38ad40f486e026e7fafb73a2912
    Port:           1080/TCP
    Host Port:      0/TCP
    State:          Waiting
      Reason:       CrashLoopBackOff
    Last State:     Terminated
      Reason:       Error
      Exit Code:    1
      Started:      Wed, 06 Jul 2022 13:14:25 +0000
      Finished:     Wed, 06 Jul 2022 13:16:56 +0000
    Ready:          False
    Restart Count:  806
    Limits:
      cpu:     2
      memory:  4Gi
    Requests:
      cpu:     2
      memory:  4Gi
    Liveness:   http-get http://:http/clients/ delay=60s timeout=15s period=60s #success=1 #failure=1
    Readiness:  http-get http://:http/clients/ delay=60s timeout=15s period=60s #success=1 #failure=1
    Environment:
      CT_SERVER_URL:          https://codetogether.local
      CT_TRUST_ALL_CERTS:     true
      CT_LOCATOR:             none
      CT_PROMETHEUS_ENABLED:  false
      CT_AV_ENABLED:          false
      CT_TIME_ZONE:           Europe/Berlin
      CT_LICENSEE:            <set to the key 'licensee' in secret 'codetogether-license'>         Optional: false
      CT_MAXCONNECTIONS:      <set to the key 'max_connections' in secret 'codetogether-license'>  Optional: false
      CT_EXPIRATION:          <set to the key 'expiration' in secret 'codetogether-license'>       Optional: false
      CT_SIGNATURE:           <set to the key 'signature' in secret 'codetogether-license'>        Optional: false
    Mounts:
      /var/cache/nginx from nginx-cache (rw)
      /var/log/codetogether from var-log-codetogether (rw)
      /var/log/nginx from var-log-nginx (rw)
      /var/run from pid (rw)
Conditions:
  Type              Status
  Initialized       True
  Ready             False
  ContainersReady   False
  PodScheduled      True
Volumes:
  nginx-cache:
    Type:       EmptyDir (a temporary directory that shares a pod's lifetime)
    Medium:
    SizeLimit:  <unset>
  pid:
    Type:       EmptyDir (a temporary directory that shares a pod's lifetime)
    Medium:
    SizeLimit:  <unset>
  var-log-codetogether:
    Type:       EmptyDir (a temporary directory that shares a pod's lifetime)
    Medium:
    SizeLimit:  <unset>
  var-log-nginx:
    Type:       EmptyDir (a temporary directory that shares a pod's lifetime)
    Medium:
    SizeLimit:  <unset>
QoS Class:       Guaranteed
Node-Selectors:  agentpool=sandsdp
Tolerations:     node.kubernetes.io/memory-pressure:NoSchedule op=Exists
                 node.kubernetes.io/not-ready:NoExecute op=Exists for 300s
                 node.kubernetes.io/unreachable:NoExecute op=Exists for 300s
Events:
  Type     Reason   Age                        From     Message
  ----     ------   ----                       ----     -------
  Warning  BackOff  3m23s (x19961 over 4d5h)   kubelet  Back-off restarting failed container`

kubectl logs codetogether-5584856fd9-qp8r9
`rm: cannot remove '/opt/codetogether/localtime': Read-only file system
ln: failed to create symbolic link '/opt/codetogether/localtime': File exists

Setting customized time zone >> Europe/Berlin
2022-07-06 13:22 [INFO] CodeTogether v2022.2.0-01333
2022-07-06 13:22 [INFO] Use of this software is governed under the CodeTogether End User License Agreement.

2022-07-06 13:22 [INFO] Disabling A/V communication channels in CodeTogether sessions.

/opt/codetogether/start-codetogether: line 149: /opt/codetogether/runtime.log: Read-only file system
2022-07-06 13:22 [INFO] Starting CodeTogether Web server ...
2022-07-06 13:22 [INFO] This Edge server's metrics dashboard can be accessed at https://codetogether.local/dashboard with user: ctuser using a temporary password: rKXQmiH9 Use CT_DASHBOARD_USER and CT_DASHBOARD_PASSWORD to set explicit values for this deployment.

┌───────────────────────────────────────────────────────────────┐
│                    npm update check failed                    │
│              Try running with sudo or get access              │
│             to the local update config store via              │
│ sudo chown -R $USER:$(id -gn $USER) /opt/codetogether/.config │
└───────────────────────────────────────────────────────────────┘
/opt/codetogether/start-codetogether: line 433: /opt/codetogether/licensing.properties: Read-only file system
/opt/codetogether/.nvm/versions/node/v12.22.12/lib/node_modules/forever/node_modules/configstore/index.js:65
    throw error;
    ^

Error: EROFS: read-only file system, mkdir '/opt/codetogether/.forever'
    at Object.mkdirSync (fs.js:921:3)
    at make (/opt/codetogether/.nvm/versions/node/v12.22.12/lib/node_modules/forever/node_modules/make-dir/index.js:61:12)
    at Function.module.exports.sync (/opt/codetogether/.nvm/versions/node/v12.22.12/lib/node_modules/forever/node_modules/make-dir/index.js:84:9)
    at Configstore.set all [as all] (/opt/codetogether/.nvm/versions/node/v12.22.12/lib/node_modules/forever/node_modules/configstore/index.js:56:12)
    at Configstore.set (/opt/codetogether/.nvm/versions/node/v12.22.12/lib/node_modules/forever/node_modules/configstore/index.js:88:12)
    at Object.forever.load (/opt/codetogether/.nvm/versions/node/v12.22.12/lib/node_modules/forever/lib/forever.js:327:18)
    at Object.<anonymous> (/opt/codetogether/.nvm/versions/node/v12.22.12/lib/node_modules/forever/lib/forever.js:374:9)
    at Module._compile (internal/modules/cjs/loader.js:999:30)
    at Object.Module._extensions..js (internal/modules/cjs/loader.js:1027:10)
    at Module.load (internal/modules/cjs/loader.js:863:32) {
  errno: -30,
  syscall: 'mkdir',
  code: 'EROFS',
  path: '/opt/codetogether/.forever'
}
checkdir error: cannot create codetogether-onpremises
                Read-only file system
                unable to process codetogether-onpremises/lib/codetogether-onpremises-2022.2.0-01333.jar.
/opt/codetogether/start-codetogether: line 470: cd: codetogether-onpremises/lib: No such file or directory
/opt/codetogether/start-codetogether: line 471: onpremises.txt: Read-only file system
/opt/codetogether/start-codetogether: line 474: onpremises-trustall.txt: Read-only file system
rm: cannot remove 'onpremises-trustall.txt': No such file or directory
rm: cannot remove 'onpremises.txt': No such file or directory
/opt/codetogether/start-codetogether: line 490: /var/www/html/clients/updatePlugins.xml: Read-only file system
rm: cannot remove '/opt/codetogether/updatePlugins.tpl': Read-only file system
mkdir: cannot create directory 'extension': Read-only file system
/opt/codetogether/start-codetogether: line 497: extension/onpremises.txt: No such file or directory
/opt/codetogether/start-codetogether: line 500: extension/onpremises-trustall.txt: No such file or directory
error: cannot create content.xml
       Read-only file system
rm: cannot remove 'content.jar': Read-only file system
sed: can't read content.xml: No such file or directory
sed: can't read content.xml: No such file or directory
rm: cannot remove 'content.xml.xz': Read-only file system
xz: content.xml: No such file or directory

2022-07-06 13:22 [INFO] Web Server configuration validated.
nginx: the configuration file /etc/nginx/nginx.conf syntax is ok
nginx: configuration file /etc/nginx/nginx.conf test is successful
2022-07-06 13:22 [INFO] Web Server started successfully!

2022-07-06 13:22 [INFO] Starting CodeTogether...

Trusting ALL Certs=true (TLS Reject Unauthorized=0)

Clients URL is present: https://codetogether.local

Updating IntelliJ client...
Archive: codetogether-onpremises-2022.2.0-01333.zip
zip warning: codetogether-onpremises-.jar not found or empty
zip warning: name not matched: onpremises.txt
zip warning: name not matched: onpremises-trustall.txt
zip warning: codetogether-onpremises-.zip not found or empty
zip warning: name not matched: codetogether-onpremises/lib/codetogether-onpremises-*.jar

Updating VS-Code client...
zip warning: name not matched: extension/onpremises.txt
zip warning: name not matched: extension/onpremises-trustall.txt

Updating Eclipse client...
Archive: content.jar

zip error: Nothing to do! (content.jar)`
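
For anyone triaging a similar log, the read-only failures above can be grouped by path with a short script. This is only a sketch: the function name `read_only_paths` and the sample text are illustrative, and the three regular expressions cover just the message shapes visible in this log (shell redirection errors, coreutils errors with a quoted path, and the Node.js `EROFS` error), assuming one message per line.

```python
import re

# Message shapes seen in the log above (illustrative, not exhaustive):
#   "<script>: line N: <path>: Read-only file system"
#   "<tool>: cannot remove '<path>': Read-only file system"
#   "Error: EROFS: read-only file system, mkdir '<path>'"
PATTERNS = [
    re.compile(r"line \d+: (\S+): Read-only file system"),
    re.compile(r"'([^']+)': Read-only file system"),
    re.compile(r"EROFS: read-only file system, \w+ '([^']+)'"),
]

# A few lines copied from the log in this issue.
SAMPLE_LOG = """\
rm: cannot remove '/opt/codetogether/localtime': Read-only file system
/opt/codetogether/start-codetogether: line 149: /opt/codetogether/runtime.log: Read-only file system
Error: EROFS: read-only file system, mkdir '/opt/codetogether/.forever'
"""


def read_only_paths(log_text):
    """Return the sorted, de-duplicated paths that failed with a read-only error."""
    found = set()
    for pattern in PATTERNS:
        found.update(pattern.findall(log_text))
    return sorted(found)


if __name__ == "__main__":
    for path in read_only_paths(SAMPLE_LOG):
        print(path)
```

Run against the full log, every reported absolute path except `/var/www/html/clients/updatePlugins.xml` sits under `/opt/codetogether`, which is consistent with the image's own directory being read-only rather than one of the mounted volumes.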

brianvfernandes commented 2 years ago

Support for a read-only file system was added in build 2022.2.1-1345; you are using build 2022.2.0-1333, which does not contain the fix. Additionally, please make sure you are using the latest version of our Helm chart (version 1.4.6). The updated chart creates the mount points the container needs to work with a read-only file system, so this is essential. Given the changes to the CodeTogether image structure to support read-only mode, we also recommend removing the custom temporary volumes you have created. Please let us know if you're able to get CodeTogether up and running with these changes.
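
A quick way to check a running container against that threshold is to parse the version banner it logs at startup. The sketch below is illustrative only: `parse_build`, `supports_read_only_fs` and `REQUIRED_BUILD` are hypothetical names, and it assumes build identifiers compare numerically component by component, which the thread does not confirm.

```python
import re

# First build with read-only file system support, per the reply above.
REQUIRED_BUILD = (2022, 2, 1, 1345)


def parse_build(banner):
    """Extract (year, major, minor, build) from text like 'CodeTogether v2022.2.0-01333'."""
    match = re.search(r"(\d+)\.(\d+)\.(\d+)-(\d+)", banner)
    if match is None:
        raise ValueError("no build identifier found in: %r" % banner)
    # int() drops the leading zero in build numbers such as '01333'.
    return tuple(int(part) for part in match.groups())


def supports_read_only_fs(banner):
    """Assumption: builds are ordered numerically, component by component."""
    return parse_build(banner) >= REQUIRED_BUILD


if __name__ == "__main__":
    banner = "2022-07-06 13:22 [INFO] CodeTogether v2022.2.0-01333"
    print(parse_build(banner), supports_read_only_fs(banner))
```

The date prefix in the log line does not interfere, because the pattern requires the dotted `x.y.z-build` shape.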

nbeDPDHL commented 2 years ago

Thanks a lot. The new version solved our problem and we no longer have any issues with read-only filesystems in the container.