Open kimlaberinto opened 2 years ago
Copying relevant logs here as GHA logs don't persist:
[ Info: Waiting for test-multi-addprocs job. This could take up to 4 minutes...
Error from server (NotFound): pods "test-multi-addprocs-st7tc" not found
test-multi-addprocs: Error During Test at /home/runner/work/K8sClusterManagers.jl/K8sClusterManagers.jl/test/cluster.jl:279
Test threw exception
Expression: pod_phase(manager_pod) == "Succeeded"
failed process: Process(setenv(`/home/runner/.julia/artifacts/e549ab3a763d3b31e726aa6336c6dbb75ee90a05/bin/kubectl get pod/test-multi-addprocs-st7tc -o 'jsonpath={.status.phase}'`,["PATH=/home/runner/.julia/artifacts/e549ab3a763d3b31e726aa6336c6dbb75ee90a05/bin:/home/runner/work/_temp:/opt/hostedtoolcache/julia/1.7.1/x64/bin:/home/linuxbrew/.linuxbrew/bin:/home/linuxbrew/.linuxbrew/sbin:/home/runner/.local/bin:/opt/pipx_bin:/home/runner/.cargo/bin:/home/runner/.config/composer/vendor/bin:/usr/local/.ghcup/bin:/home/runner/.dotnet/tools:/snap/bin:/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin:/usr/games:/usr/local/games:/snap/bin", "DOTNET_SKIP_FIRST_TIME_EXPERIENCE=1", "GITHUB_RUN_NUMBER=208", "GITHUB_REF_NAME=88/merge", "RUNNER_ARCH=X64", "PERFLOG_LOCATION_SETTING=RUNNER_PERFLOG", "LD_LIBRARY_PATH=/opt/hostedtoolcache/julia/1.7.1/x64/bin/../lib/julia:/opt/hostedtoolcache/julia/1.7.1/x64/bin/../lib", "K8S_CLUSTER_TESTS=true", "ACCEPT_EULA=Y", "ANT_HOME=/usr/share/ant", "RUNNER_USER=runner", "LEIN_HOME=/usr/local/lib/lein", "GITHUB_ACTOR=kimlaberinto", "ANDROID_NDK_LATEST_HOME=/usr/local/lib/android/sdk/ndk/23.1.7779620", "USER=runner", "CONDA=/usr/share/miniconda", "GITHUB_REF_PROTECTED=false", "GITHUB_SHA=b39d201f2b4c3780982f770e755e1c6c91503709", "JAVA_HOME=/usr/lib/jvm/temurin-11-jdk-amd64", "GITHUB_API_URL=https://api.github.com", "GITHUB_RUN_ATTEMPT=1", "GITHUB_ACTIONS=true", "VCPKG_INSTALLATION_ROOT=/usr/local/share/vcpkg", "MINIKUBE_HOME=/home/runner/work/_temp", "ANDROID_SDK_ROOT=/usr/local/lib/android/sdk", "SWIFT_PATH=/usr/share/swift/usr/bin", "GOROOT_1_17_X64=/opt/hostedtoolcache/go/1.17.6/x64", "GITHUB_ENV=/home/runner/work/_temp/_runner_file_commands/set_env_52151825-7529-4f10-9231-f2029174696c", "JAVA_HOME_17_X64=/usr/lib/jvm/temurin-17-jdk-amd64", "GITHUB_ACTION_PATH=/home/runner/work/_actions/julia-actions/julia-runtest/v1", "RUNNER_PERFLOG=/home/runner/perflog", "RUNNER_NAME=GitHub Actions 9", "GITHUB_RUN_ID=1780539670", 
"HOMEBREW_CELLAR=/home/linuxbrew/.linuxbrew/Cellar", "ImageOS=ubuntu20", "NVM_DIR=/home/runner/.nvm", "GITHUB_HEAD_REF=kpl/update-codecov", "GITHUB_RETENTION_DAYS=90", "GITHUB_SERVER_URL=https://github.com", "GITHUB_JOB=cluster-test", "DEBIAN_FRONTEND=noninteractive", "RUNNER_TRACKING_ID=github_ee352480-6154-44ca-8750-7f7c692fd5f1", "RUNNER_TOOL_CACHE=/opt/hostedtoolcache", "HOMEBREW_CLEANUP_PERIODIC_FULL_DAYS=3650", "AZURE_EXTENSION_DIR=/opt/az/azcliextensions", "HOMEBREW_NO_AUTO_UPDATE=1", "CHROMEWEBDRIVER=/usr/local/share/chrome_driver", "GITHUB_ACTION_REPOSITORY=", "GITHUB_WORKFLOW=CI", "GITHUB_ACTION=__julia-actions_julia-runtest", "HOME=/home/runner", "JAVA_HOME_8_X64=/usr/lib/jvm/temurin-8-jdk-amd64", "GITHUB_EVENT_PATH=/home/runner/work/_temp/_github_workflow/event.json", "K8S_CLUSTER_MANAGERS_TEST_IMAGE=k8s-cluster-managers:b39d201", "HOMEBREW_PREFIX=/home/linuxbrew/.linuxbrew", "SGX_AESM_ADDR=1", "GITHUB_REF=refs/pull/88/merge", "GITHUB_REPOSITORY=beacon-biosignals/K8sClusterManagers.jl", "INVOCATION_ID=3990f835d3004e2b87571c73a406a265", "ImageVersion=20220123.1", "LANG=C.UTF-8", "GITHUB_GRAPHQL_URL=https://api.github.com/graphql", "SHLVL=1", "DOTNET_MULTILEVEL_LOOKUP=0", "RUNNER_WORKSPACE=/home/runner/work/K8sClusterManagers.jl", "GITHUB_BASE_REF=main", "STATS_KEEPALIVE=false", "_=/opt/hostedtoolcache/julia/1.7.1/x64/bin/julia", "HOMEBREW_REPOSITORY=/home/linuxbrew/.linuxbrew/Homebrew", "GRADLE_HOME=/usr/share/gradle-7.3.3", "GITHUB_ACTION_REF=", "DEPLOYMENT_BASEPATH=/opt/runner", "PIPX_HOME=/opt/pipx", "ANDROID_NDK_ROOT=/usr/local/lib/android/sdk/ndk-bundle", "***", "GITHUB_WORKSPACE=/home/runner/work/K8sClusterManagers.jl/K8sClusterManagers.jl", "GRAALVM_11_ROOT=/usr/local/graalvm/graalvm-ce-java11-21.3.0", "XDG_CONFIG_HOME=/home/runner/.config", "ANDROID_HOME=/usr/local/lib/android/sdk", "CHROME_BIN=/usr/bin/google-chrome", "CI=true", "POWERSHELL_DISTRIBUTION_CHANNEL=GitHub-Actions-ubuntu20", "GECKOWEBDRIVER=/usr/local/share/gecko_driver", 
"GITHUB_PATH=/home/runner/work/_temp/_runner_file_commands/add_path_52151825-7529-4f10-9231-f2029174696c", "RUNNER_OS=Linux", "JOURNAL_STREAM=8:20833", "GITHUB_REF_TYPE=branch", "LEIN_JAR=/usr/local/lib/lein/self-installs/leiningen-2.9.8-standalone.jar", "JULIA_LOAD_PATH=@:/tmp/jl_RrxcF6", "BOOTSTRAP_HASKELL_NONINTERACTIVE=1", "PIPX_BIN_DIR=/opt/pipx_bin", "SELENIUM_JAR_PATH=/usr/share/java/selenium-server.jar", "JAVA_HOME_11_X64=/usr/lib/jvm/temurin-11-jdk-amd64", "RUNNER_TEMP=/home/runner/work/_temp", "GOROOT_1_16_X64=/opt/hostedtoolcache/go/1.16.13/x64", "GITHUB_REPOSITORY_OWNER=beacon-biosignals", "GITHUB_EVENT_NAME=pull_request", "DOTNET_NOLOGO=1", "GOROOT_1_15_X64=/opt/hostedtoolcache/go/1.15.15/x64", "OPENBLAS_MAIN_FREE=1", "ANDROID_NDK_HOME=/usr/local/lib/android/sdk/ndk-bundle", "AGENT_TOOLSDIRECTORY=/opt/hostedtoolcache"]), ProcessExited(1)) [1]
Stacktrace:
[1] pipeline_error
@ ./process.jl:531 [inlined]
[2] read(cmd::Cmd)
@ Base ./process.jl:418
[3] read(cmd::Cmd, #unused#::Type{String})
@ Base ./process.jl:427
[4] pod_phase(pod_name::SubString{String})
@ Main ~/work/K8sClusterManagers.jl/K8sClusterManagers.jl/test/utils.jl:36
[5] macro expansion
@ /opt/hostedtoolcache/julia/1.7.1/x64/share/julia/stdlib/v1.7/Test/src/Test.jl:445 [inlined]
[6] macro expansion
@ ~/work/K8sClusterManagers.jl/K8sClusterManagers.jl/test/cluster.jl:271 [inlined]
[7] macro expansion
@ /opt/hostedtoolcache/julia/1.7.1/x64/share/julia/stdlib/v1.7/Test/src/Test.jl:1283 [inlined]
[8] top-level scope
@ ~/work/K8sClusterManagers.jl/K8sClusterManagers.jl/test/cluster.jl:235
Error from server (NotFound): jobs.batch "test-multi-addprocs" not found
[ Info: Describe job:
┌ Info: List pods for job test-multi-addprocs:
│ NAME READY STATUS RESTARTS AGE JOB-NAME=TEST-MULTI-ADDPROCS
│ test-multi-addprocs-st7tc-worker-fnzvf 0/1 Completed 0 43s
│ test-multi-addprocs-st7tc-worker-jwksk 0/1 Completed 0 28s
└ test-success-slxr5-worker-jqvm7 0/1 Completed 0 95s
[ Info: Manager pod "test-multi-addprocs-st7tc" not found
┌ Info: Describe worker 1/2 pod:
│ Name: test-multi-addprocs-st7tc-worker-fnzvf
│ Namespace: default
│ Priority: 0
│ Node: minikube-m02/192.168.49.3
│ Start Time: Tue, 01 Feb 2022 20:33:00 +0000
│ Labels: manager=test-multi-addprocs-st7tc
│ worker-id=2
│ Annotations: <none>
│ Status: Succeeded
│ IP: 10.244.1.8
│ IPs:
│ IP: 10.244.1.8
│ Containers:
│ worker:
│ Container ID: docker://25369b89560204270b609c7129b3c36111f5e124b09c10609d946879ce9c52c7
│ Image: k8s-cluster-managers:b39d201
│ Image ID: docker://sha256:a3f7dfa9c373b41e28bf6527e7b8801720aa125fa31db5b9cacb7d069eada486
│ Port: <none>
│ Host Port: <none>
│ Command:
│ /usr/local/julia/bin/julia
│ --worker=RL03XtNp463y3yuY
│ State: Terminated
│ Reason: Completed
│ Exit Code: 0
│ Started: Tue, 01 Feb 2022 20:33:00 +0000
│ Finished: Tue, 01 Feb 2022 20:33:29 +0000
│ Ready: False
│ Restart Count: 0
│ Limits:
│ cpu: 500m
│ memory: 300Mi
│ Requests:
│ cpu: 500m
│ memory: 300Mi
│ Environment: <none>
│ Mounts:
│ /var/run/secrets/kubernetes.io/serviceaccount from kube-api-access-w6rvq (ro)
│ Conditions:
│ Type Status
│ Initialized True
│ Ready False
│ ContainersReady False
│ PodScheduled True
│ Volumes:
│ kube-api-access-w6rvq:
│ Type: Projected (a volume that contains injected data from multiple sources)
│ TokenExpirationSeconds: 3607
│ ConfigMapName: kube-root-ca.crt
│ ConfigMapOptional: <nil>
│ DownwardAPI: true
│ QoS Class: Guaranteed
│ Node-Selectors: <none>
│ Tolerations: node.kubernetes.io/not-ready:NoExecute op=Exists for 300s
│ node.kubernetes.io/unreachable:NoExecute op=Exists for 300s
│ Events:
│ Type Reason Age From Message
│ ---- ------ ---- ---- -------
│ Normal Scheduled 43s default-scheduler Successfully assigned default/test-multi-addprocs-st7tc-worker-fnzvf to minikube-m02
│ Normal Pulled 43s kubelet Container image "k8s-cluster-managers:b39d201" already present on machine
│ Normal Created 43s kubelet Created container worker
└ Normal Started 43s kubelet Started container worker
┌ Info: Describe worker 2/2 pod:
│ Name: test-multi-addprocs-st7tc-worker-jwksk
│ Namespace: default
│ Priority: 0
│ Node: minikube-m02/192.168.49.3
│ Start Time: Tue, 01 Feb 2022 20:33:15 +0000
│ Labels: manager=test-multi-addprocs-st7tc
│ worker-id=3
│ Annotations: <none>
│ Status: Succeeded
│ IP: 10.244.1.9
│ IPs:
│ IP: 10.244.1.9
│ Containers:
│ worker:
│ Container ID: docker://ddbfb6efe6eaf5aac7fe6a2885e49d145248f02be166e3c83f78ce15936a72e5
│ Image: k8s-cluster-managers:b39d201
│ Image ID: docker://sha256:a3f7dfa9c373b41e28bf6527e7b8801720aa125fa31db5b9cacb7d069eada486
│ Port: <none>
│ Host Port: <none>
│ Command:
│ /usr/local/julia/bin/julia
│ --worker=RL03XtNp463y3yuY
│ State: Terminated
│ Reason: Completed
│ Exit Code: 0
│ Started: Tue, 01 Feb 2022 20:33:16 +0000
│ Finished: Tue, 01 Feb 2022 20:33:29 +0000
│ Ready: False
│ Restart Count: 0
│ Limits:
│ cpu: 500m
│ memory: 300Mi
│ Requests:
│ cpu: 500m
│ memory: 300Mi
│ Environment: <none>
│ Mounts:
│ /var/run/secrets/kubernetes.io/serviceaccount from kube-api-access-nlpwr (ro)
│ Conditions:
│ Type Status
│ Initialized True
│ Ready False
│ ContainersReady False
│ PodScheduled True
│ Volumes:
│ kube-api-access-nlpwr:
│ Type: Projected (a volume that contains injected data from multiple sources)
│ TokenExpirationSeconds: 3607
│ ConfigMapName: kube-root-ca.crt
│ ConfigMapOptional: <nil>
│ DownwardAPI: true
│ QoS Class: Guaranteed
│ Node-Selectors: <none>
│ Tolerations: node.kubernetes.io/not-ready:NoExecute op=Exists for 300s
│ node.kubernetes.io/unreachable:NoExecute op=Exists for 300s
│ Events:
│ Type Reason Age From Message
│ ---- ------ ---- ---- -------
│ Normal Scheduled 28s default-scheduler Successfully assigned default/test-multi-addprocs-st7tc-worker-jwksk to minikube-m02
│ Normal Pulled 27s kubelet Container image "k8s-cluster-managers:b39d201" already present on machine
│ Normal Created 27s kubelet Created container worker
└ Normal Started 27s kubelet Started container worker
[ Info: No logs for manager (test-multi-addprocs-st7tc)
┌ Info: Logs for worker 1/2 (test-multi-addprocs-st7tc-worker-fnzvf):
└ julia_worker:9001#10.244.1.8
┌ Info: Logs for worker 2/2 (test-multi-addprocs-st7tc-worker-jwksk):
└ julia_worker:9001#10.244.1.9
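It appears the manager job was terminated and removed before the debugging information could be collected: both the manager pod and the job itself were reported as NotFound, while the worker pods were still available to describe. We probably want to adjust the job's TTL settings so that failures like this can be debugged further.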
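As a rough sketch of the kind of TTL adjustment meant here: Kubernetes Jobs accept a `ttlSecondsAfterFinished` field that controls how long a finished Job and its pods are kept before they are cleaned up. The manifest below is hypothetical. Only the job name and image tag are taken from the logs above; the container name, the command, and the value of 600 seconds are assumptions, and I have not confirmed how the test suite actually defines its jobs.

```yaml
# Hypothetical Job manifest; not copied from the test suite.
apiVersion: batch/v1
kind: Job
metadata:
  name: test-multi-addprocs          # job name as it appears in the logs
spec:
  # Keep the finished Job and its pods around for 10 minutes so that
  # `kubectl describe` and `kubectl logs` still work after a failure.
  # The value 600 is an assumption, not a tested recommendation.
  ttlSecondsAfterFinished: 600
  backoffLimit: 0
  template:
    spec:
      restartPolicy: Never
      containers:
        - name: manager              # assumed container name
          image: k8s-cluster-managers:b39d201   # image tag from the logs
          # Placeholder command; the real test command is not shown in the logs.
          command: ["julia", "-e", "println(\"placeholder\")"]
```

If the job is currently created with a very short TTL (or `0`), raising it along these lines should leave enough time for the test to query the manager pod's phase and logs before cleanup.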
Not sure why this CI cluster test failed:
https://github.com/beacon-biosignals/K8sClusterManagers.jl/runs/5027583871?check_suite_focus=true#step:9:140