Closed onprem closed 2 years ago
It seems like the issue might be with github actions not having enough resources to run the whole stack:
Warning FailedScheduling 6s (x5 over 67s) default-scheduler 0/1 nodes are available: 1 Insufficient cpu. preemption: 0/1 nodes are available: 1 No preemption victims found for incoming pod.
Could you test this theory by temporarily removing resource requests from prometheus, promscale, and timescaledb pods (done in chart/values.yaml)?
If that is the case, we can create values.yaml patches and place them in chart/ci/XYZ-values.yaml files as per ct install docs:
Charts may have multiple custom values files matching the glob pattern '*-values.yaml' in a directory named 'ci' in the root of the chart's directory. The chart is installed and tested for each of these files. If no custom values file is present, the chart is installed and tested with defaults.
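A values patch of the kind suggested above could look roughly like the sketch below. The key paths are assumptions based on the component names mentioned in this thread and have not been checked against chart/values.yaml; the request sizes are arbitrary small values. It lowers the requests rather than removing them, which should have the same effect on scheduling for a single-node CI runner.

```yaml
# chart/ci/XYZ-values.yaml (XYZ is the placeholder used in the comment above)
# Assumed key paths and illustrative sizes; verify against chart/values.yaml.
timescaledb-single:
  resources:
    requests:
      cpu: 50m
      memory: 128Mi
promscale:
  resources:
    requests:
      cpu: 50m
      memory: 128Mi
kube-prometheus-stack:
  prometheus:
    prometheusSpec:
      resources:
        requests:
          cpu: 50m
          memory: 128Mi
```

Because ct installs and tests the chart once per matching file in the ci directory, a patch like this only affects CI runs and leaves the chart defaults untouched.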
Waiting on #493 as that will fix the timeout issue with ct install.
Looks like the case to me as well. We already have a values.yaml there in this PR to disable telemetry. I'll remove the resource requests and then test.
Now the CI is failing because ct deletes the namespace at the end, with a hardcoded timeout of 180s. But with PVCs and finalisers, tobs almost always takes longer than 180s to delete the namespace, causing ct install to fail.
Upstream issue: helm/chart-testing#227
The workaround for the namespace deletion timeout issue is to run ct install in an existing namespace. This way, ct doesn't try to remove the namespace, bypassing the namespace deletion timeout.
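As a rough sketch of that workaround, the ct configuration file can name the namespace to install into. The namespace name here is hypothetical, and the sketch assumes the namespace is created by an earlier CI step before ct runs.

```yaml
# ct.yaml (sketch, not the configuration used in this PR)
# With namespace set, ct installs into it instead of a generated namespace,
# so it does not delete the namespace afterwards.
namespace: tobs-ci   # hypothetical name; assumed to exist before ct install runs
```

The same setting can also be passed on the command line as ct install --namespace.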
There is a new problem though: ct runs helm test after an install, and since all helm commands have common extraArgs in configuration, and we need --wait for install, helm test fails as it doesn't have a flag called --wait.
Update: since the --wait flag is implicitly added by ct here, I just removed the flag, getting helm test to work.
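For illustration, a shared extra-args entry that works for both commands might look like the following. The timeout value is an assumption and is not taken from this PR; the point is only that the entry contains no --wait.

```yaml
# ct.yaml (sketch)
# No --wait here: ct adds it to helm install on its own, and helm test
# has no such flag.
helm-extra-args: --timeout 600s   # assumed value; accepted by helm install and helm test
```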
Description
Fixes #396
Type of change
What type of changes does your code introduce to tobs? Put an x in the boxes that apply.
CHANGE (fix or feature that would cause existing functionality to not work as expected)
FEATURE (non-breaking change which adds functionality)
BUGFIX (non-breaking change which fixes an issue)
ENHANCEMENT (non-breaking change which improves existing functionality)
NONE (if none of the other choices apply. Example: tooling, build system, CI, docs, etc.)