Closed: RobinL closed this issue 6 years ago.
Same issue for the user's RStudio:
```
$ k describe po hbutchermoj-rstudio-rstu-79fbcc995f-rfgzx -n user-hbutchermoj
Name:           hbutchermoj-rstudio-rstu-79fbcc995f-rfgzx
Namespace:      user-hbutchermoj
Node:           ip-192-168-10-66.eu-west-1.compute.internal/192.168.10.66
Start Time:     Wed, 12 Sep 2018 10:27:13 +0100
Labels:         app=rstudio
                pod-template-hash=3596775519
Annotations:    iam.amazonaws.com/role=alpha_user_hbutchermoj
Status:         Pending
IP:
Controlled By:  ReplicaSet/hbutchermoj-rstudio-rstu-79fbcc995f
Containers:
  rstudio-auth-proxy:
    Container ID:
    Image:          quay.io/mojanalytics/rstudio-auth-proxy:v1.4.3
    Image ID:
    Port:           3000/TCP
    Host Port:      0/TCP
    State:          Waiting
      Reason:       ContainerCreating
    Ready:          False
    Restart Count:  0
    Limits:
      cpu:     100m
      memory:  128Mi
    Requests:
      cpu:     25m
      memory:  64Mi
    Readiness:  http-get http://:http/healthz delay=5s timeout=1s period=5s #success=1 #failure=3
    Environment:
      USER:                 hbutchermoj
      APP_PROTOCOL:         https
      APP_HOST:             <set to the key 'app_host' in secret 'hbutchermoj-rstudio-rstu'>           Optional: false
      AUTH0_CLIENT_SECRET:  <set to the key 'client_secret' in secret 'hbutchermoj-rstudio-rstu'>      Optional: false
      AUTH0_CLIENT_ID:      <set to the key 'client_id' in secret 'hbutchermoj-rstudio-rstu'>          Optional: false
      AUTH0_DOMAIN:         <set to the key 'domain' in secret 'hbutchermoj-rstudio-rstu'>             Optional: false
      AUTH0_CALLBACK_URL:   <set to the key 'callback_url' in secret 'hbutchermoj-rstudio-rstu'>       Optional: false
      COOKIE_SECRET:        <set to the key 'cookie_secret' in secret 'hbutchermoj-rstudio-rstu'>      Optional: false
      SECURE_COOKIE_KEY:    <set to the key 'secure_cookie_key' in secret 'hbutchermoj-rstudio-rstu'>  Optional: false
      COOKIE_MAXAGE:        28800
      PROXY_TARGET_HOST:    localhost
      PROXY_TARGET_PORT:    8787
      EXPRESS_HOST:         0.0.0.0
      EXPRESS_PORT:         3000
    Mounts:
      /var/run/secrets/kubernetes.io/serviceaccount from hbutchermoj-rstudio-token-525pp (ro)
  r-studio-server:
    Container ID:
    Image:          quay.io/mojanalytics/rstudio:3.4.2-5
    Image ID:
    Port:           8787/TCP
    Host Port:      0/TCP
    State:          Waiting
      Reason:       ContainerCreating
    Ready:          False
    Restart Count:  0
    Limits:
      cpu:     1500m
      memory:  20Gi
    Requests:
      cpu:     200m
      memory:  5Gi
    Readiness:  http-get http://:http/ delay=5s timeout=1s period=5s #success=1 #failure=3
    Environment:
      USER:                hbutchermoj
      AWS_DEFAULT_REGION:  <set to the key 'aws_default_region' in secret 'hbutchermoj-rstudio-rstu'>  Optional: false
      SECURE_COOKIE_KEY:   <set to the key 'secure_cookie_key' in secret 'hbutchermoj-rstudio-rstu'>   Optional: false
      TOOLS_DOMAIN:        tools.alpha.mojanalytics.xyz
    Mounts:
      /home/hbutchermoj from home (rw)
      /var/run/secrets/kubernetes.io/serviceaccount from hbutchermoj-rstudio-token-525pp (ro)
Conditions:
  Type           Status
  Initialized    True
  Ready          False
  PodScheduled   True
Volumes:
  home:
    Type:       PersistentVolumeClaim (a reference to a PersistentVolumeClaim in the same namespace)
    ClaimName:  nfs-home
    ReadOnly:   false
  hbutchermoj-rstudio-token-525pp:
    Type:        Secret (a volume populated by a Secret)
    SecretName:  hbutchermoj-rstudio-token-525pp
    Optional:    false
QoS Class:       Burstable
Node-Selectors:  <none>
Tolerations:     node.kubernetes.io/not-ready:NoExecute for 300s
                 node.kubernetes.io/unreachable:NoExecute for 300s
Events:
  Type     Reason                  Age               From                                                  Message
  ----     ------                  ----              ----                                                  -------
  Normal   Scheduled               4m                default-scheduler                                     Successfully assigned hbutchermoj-rstudio-rstu-79fbcc995f-rfgzx to ip-192-168-10-66.eu-west-1.compute.internal
  Normal   SuccessfulMountVolume   4m                kubelet, ip-192-168-10-66.eu-west-1.compute.internal  MountVolume.SetUp succeeded for volume "nfs-home-hbutchermoj"
  Normal   SuccessfulMountVolume   4m                kubelet, ip-192-168-10-66.eu-west-1.compute.internal  MountVolume.SetUp succeeded for volume "hbutchermoj-rstudio-token-525pp"
  Normal   SandboxChanged          4m (x11 over 4m)  kubelet, ip-192-168-10-66.eu-west-1.compute.internal  Pod sandbox changed, it will be killed and re-created.
  Warning  FailedCreatePodSandBox  4m (x12 over 4m)  kubelet, ip-192-168-10-66.eu-west-1.compute.internal  Failed create pod sandbox.
```
We have seen that `FailedCreatePodSandBox` can be caused by a missing unit in a pod's memory resource definition (https://github.com/Azure/AKS/issues/496), but that isn't the case on our cluster: all pods have valid units.
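For reference, this is what a resource block with explicit units looks like (a minimal sketch; the field values are the ones from the auth-proxy container above):

```yaml
# Sketch of a container resources block with valid units.
# A bare number for memory (e.g. `memory: 128`) is interpreted as bytes,
# which is the misconfiguration the AKS issue above describes.
resources:
  requests:
    cpu: 25m       # millicores
    memory: 64Mi   # mebibytes
  limits:
    cpu: 100m
    memory: 128Mi
```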
This has been resolved. I think the cause was a problem unidling/restarting RStudio - see https://github.com/ministryofjustice/analytics-platform/issues/60#issuecomment-420610740
It still seemed unhappy, even after restarting the pods:
```
$ kubectl get pods --all-namespaces | grep coroner-stat-tool-ext
apps-prod   coroner-stat-tool-ext-webapp-7d5965bd9-6fcw4   3/3   Running   0   2h

$ kubectl describe pods coroner-stat-tool-ext-webapp-7d5965bd9-6fcw4 -n apps-prod
...
  Warning  Unhealthy  3m (x137 over 26m)  kubelet, ip-192-168-14-178.eu-west-1.compute.internal  Liveness probe failed: HTTP probe failed with statuscode: 500
```
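For anyone retracing this later: the Shiny error quoted next came from the webapp container's logs. Commands along these lines retrieve them (the container name `webapp` is a guess here — list the pod's containers first to confirm):

```
# List the containers in the pod, then tail the logs of the suspect one.
$ kubectl get pod coroner-stat-tool-ext-webapp-7d5965bd9-6fcw4 -n apps-prod \
    -o jsonpath='{.spec.containers[*].name}'
$ kubectl logs coroner-stat-tool-ext-webapp-7d5965bd9-6fcw4 -n apps-prod -c webapp --tail=50
```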
The error in the shiny logs is this:
```
Warning: Error in library: there is no package called ‘leaflet’
Stack trace (innermost first):
    41: library
     1: runApp
Error : An error has occurred. Check your logs or contact the app author for clarification.
```
So I think the platform is fine now. The error is in the user's Shiny app.
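For the app author: the usual fix is to install `leaflet` in the image the webapp runs from, e.g. with a line like this in the app's Dockerfile (a sketch — the base image shown is illustrative, not necessarily the one this app uses):

```dockerfile
# Sketch: install the missing R package at image build time.
FROM rocker/shiny:latest
RUN R -e "install.packages('leaflet', repos = 'https://cloud.r-project.org')"
```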
The deployment for `coroner-stat-tool-ext` completed successfully, but the webapp pod is stuck in `ContainerCreating`.