Closed rafalbigaj closed 2 years ago
Thank you for the bug report @rafalbigaj - it sounds like we don't have test coverage for this. Perhaps we should consider installing a simple custom run controller in our e2e tests so we may test these scenarios. /cc @abayer @lbernick
@rafalbigaj Does the cancel happen right away? Both runs are in the status, which is why the controller attempts to cancel both and fails. It could be that that is the source of the issue. Are you using minimal embedded status in your setup? Any other alpha flag enabled?
> Perhaps we should consider installing a simple custom run controller in our e2e tests so we may test these scenarios
👍 Agreed!
> Perhaps we should consider installing a simple custom run controller in our e2e tests so we may test these scenarios
Would this require changing plumbing to deploy the wait task controller, or can it be done via e2e-common (similar to the pipeline controller)? Should we bring the wait task controller into the pipeline repo to decouple from what's available in the experimental repo?
I wrote a reproducer unit test, so we can test this even without e2e tests. Nonetheless, it would be good to install the wait controller.
/*
Copyright 2022 The Tekton Authors

Licensed under the Apache License, Version 2.0 (the "License");
you may not use this file except in compliance with the License.
You may obtain a copy of the License at

    http://www.apache.org/licenses/LICENSE-2.0

Unless required by applicable law or agreed to in writing, software
distributed under the License is distributed on an "AS IS" BASIS,
WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
See the License for the specific language governing permissions and
limitations under the License.
*/

package pipelinerun

import (
	"fmt"
	"testing"

	"github.com/tektoncd/pipeline/pkg/apis/pipeline/v1beta1"
	"github.com/tektoncd/pipeline/test"
	"github.com/tektoncd/pipeline/test/parse"
	corev1 "k8s.io/api/core/v1"
	logtesting "knative.dev/pkg/logging/testing"
	_ "knative.dev/pkg/system/testing" // Setup system.Namespace()
)

func TestReconcileTwoCustomTasks(t *testing.T) {
	pipelineRunName := "cancel-test-run"
	prs := []*v1beta1.PipelineRun{parse.MustParsePipelineRun(t, `metadata:
  name: cancel-test-run
  namespace: foo
spec:
  pipelineSpec:
    tasks:
    - name: wait-1
      taskSpec:
        apiVersion: example.dev/v0
        kind: Wait
      params:
      - name: duration
        value: 1h
    - name: wait-2
      runAfter:
      - wait-1
      taskSpec:
        apiVersion: example.dev/v0
        kind: Wait
      params:
      - name: duration
        value: 10s
`)}
	cms := []*corev1.ConfigMap{withCustomTasks(newFeatureFlagsConfigMap())}
	d := test.Data{
		PipelineRuns: prs,
		ConfigMaps:   cms,
	}
	prt := newPipelineRunTest(d, t)
	defer prt.Cancel()

	pr, clients := prt.reconcileRun("foo", pipelineRunName, []string{}, false)
	fmt.Printf("%v", pr)

	err := cancelPipelineRun(prt.TestAssets.Ctx, logtesting.TestLogger(t), pr, clients.Pipeline)
	if err != nil {
		t.Fatalf("Error found: %v", err)
	}
}
I used git bisect to find the first failing commit, 9d60d0adf41e74f78029a0eaab73140fa4f7206a:
$ git bisect start v0.37.0 v0.36.0 --
$ git bisect run go test -v github.com/tektoncd/pipeline/pkg/reconciler/pipelinerun -run TestReconcileTwoCustomTasks
(...)
9d60d0adf41e74f78029a0eaab73140fa4f7206a is the first bad commit
On the surface the commit looks unrelated; I need to dig in to understand what happened.
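For readers less familiar with the technique: git bisect narrows the range between a known-good and a known-bad revision by binary search, and `git bisect run` uses the exit code of the given command (here `go test`) to classify each step. The sketch below shows the idea on a plain slice. `firstBad`, the `isBad` callback, and the commit labels are hypothetical, and real git bisect walks a commit graph rather than a linear list.

```go
package main

import "fmt"

// firstBad returns the index of the first commit for which isBad reports true.
// It assumes commits[0] is good, the last commit is bad, and that every commit
// after the first bad one is also bad.
func firstBad(commits []string, isBad func(string) bool) int {
	good, bad := 0, len(commits)-1
	for bad-good > 1 {
		mid := good + (bad-good)/2
		if isBad(commits[mid]) {
			bad = mid // the regression is at mid or earlier
		} else {
			good = mid // the regression is after mid
		}
	}
	return bad
}

func main() {
	// Hypothetical linear history between a good and a bad release.
	commits := []string{"c0", "c1", "c2", "c3", "c4", "c5", "c6", "c7"}
	culprit := 5 // hypothetical index at which the regression was introduced

	tested := 0
	isBad := func(c string) bool {
		tested++ // count how many commits had to be tested
		for i, name := range commits {
			if name == c {
				return i >= culprit
			}
		}
		return false
	}

	idx := firstBad(commits, isBad)
	fmt.Printf("first bad commit: %s (tested %d commits)\n", commits[idx], tested)
}
```

In the real workflow, the role of `isBad` is played by running the reproducer test at each candidate commit.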
The issue happens only with full embedded status; with minimal embedded status it works fine, i.e. in the test, using:
cms := []*corev1.ConfigMap{withCustomTasks(withEmbeddedStatus(newFeatureFlagsConfigMap(), config.MinimalEmbeddedStatus))}
Works on main.
Expected Behavior
A pipeline run with an unscheduled custom task should be cancellable.
Regression: This scenario works correctly on v0.33.1.
Actual Behavior
When cancelling a pipeline run that contains an unscheduled custom task, the error PipelineRunCouldntCancel occurs.
Steps to Reproduce the Problem
Set spec.status to "Cancelled". The final PipelineRun status is:
Additional Info
Kubernetes version:
Output of kubectl version: