openshift-pipelines / pipelines-as-code

Pipelines-as-Code for Tekton
https://pipelinesascode.com
Apache License 2.0
124 stars 81 forks source link

SRVKP-4499: watcher panic with pending runs, concurrency not set #1679

Closed gabemontero closed 2 months ago

gabemontero commented 2 months ago

Changes

During Konflux startup an existing pipelinrun in a PAC managed namespace was set to Pending (that is a Tekton wide API that can be manipulated by other components besides PAC).

The repo did not have the concurrency limit checked.

This panic ensued and PAC watcher ended up in a crash back loop until @savitaashture and I deleted the pending pipeline runs:

panic: runtime error: invalid memory address or nil pointer dereference
[signal SIGSEGV: segmentation violation code=0x1 addr=0x0 pc=0x1b1905b]
goroutine 41 [running]:
github.com/openshift-pipelines/pipelines-as-code/pkg/sync.(*QueueManager).getSemaphore(0xc0005c53c8, 0xc000a3e0a8)
/go/src/github.com/openshift-pipelines/pipelines-as-code/pkg/sync/queue_manager.go:55 +0x15b
github.com/openshift-pipelines/pipelines-as-code/pkg/sync.(*QueueManager).AddListToQueue(0xc0005c53c8, 0xc000a3e0a8, {0xc000fb42e0, 0x1, 0x0?})
/go/src/github.com/openshift-pipelines/pipelines-as-code/pkg/sync/queue_manager.go:83 +0xf6
github.com/openshift-pipelines/pipelines-as-code/pkg/reconciler.(*Reconciler).queuePipelineRun(0xc00062b680, {0x28aed30, 0xc000626a20}, 0xc000fb42c0?, 0xc00039cb40)
/go/src/github.com/openshift-pipelines/pipelines-as-code/pkg/reconciler/queue_pipelineruns.go:48 +0x245
github.com/openshift-pipelines/pipelines-as-code/pkg/reconciler.(*Reconciler).ReconcileKind(0xc00062b680, {0x28aed30, 0xc0016d5800}, 0xc00039cb40)
/go/src/github.com/openshift-pipelines/pipelines-as-code/pkg/reconciler/reconciler.go:93 +0x5bf
github.com/tektoncd/pipeline/pkg/client/injection/reconciler/pipeline/v1/pipelinerun.(*reconcilerImpl).Reconcile(0xc0006999a0, {0x28aed30, 0xc0016d57d0}, {0xc0018ae7e0, 0x2d})
/go/src/github.com/openshift-pipelines/pipelines-as-code/vendor/github.com/tektoncd/pipeline/pkg/client/injection/reconciler/pipeline/v1/pipelinerun/reconciler.go:236 +0x542
knative.dev/pkg/controller.(*Impl).processNextWorkItem(0xc0000b14a0)
/go/src/github.com/openshift-pipelines/pipelines-as-code/vendor/knative.dev/pkg/controller/controller.go:542 +0x4cd
knative.dev/pkg/controller.(*Impl).RunContext.func3()
/go/src/github.com/openshift-pipelines/pipelines-as-code/vendor/knative.dev/pkg/controller/controller.go:491 +0x68
created by knative.dev/pkg/controller.(*Impl).RunContext
/go/src/github.com/openshift-pipelines/pipelines-as-code/vendor/knative.dev/pkg/controller/controller.go:489 +0x354

I could not get the linters to work on my system. Will reach out to devs separately for that. Apologies.

@savitaashture @chmouel @enarha @piyush-garg FYI / PTAL

Submitter Checklist

gabemontero commented 2 months ago

sorry I don't understand the go-testing failure

+ ./codecov -P 1679 -C 957b764db9a4614b6bf39c1e3b60830e17a6c8fc
./codecov: line 1: syntax error near unexpected token `<'
./codecov: line 1: `NoSuchKeyThe specified key does not exist.
No such object: codecov-uploader/listing//aarch64/codecov

and I get a 404 at https://console-openshift-console.apps.paac.openshift-pipelines.devcluster.openshift.com/k8s/ns/pipelines-as-code-ci/tekton.dev~v1~PipelineRun/go-testing-6tj8x/logs/go

the unit tests worked for me locally

chmouel commented 2 months ago

@gabemontero ah we are in process of moving our dogfooding cluster to arm64 so never mind that error for now, will fix it...

chmouel commented 2 months ago

cc @sm43

codecov[bot] commented 2 months ago

Codecov Report

All modified and coverable lines are covered by tests :white_check_mark:

Project coverage is 64.56%. Comparing base (e8a2251) to head (5d05e23). Report is 2 commits behind head on main.

Additional details and impacted files ```diff @@ Coverage Diff @@ ## main #1679 +/- ## ========================================== + Coverage 64.53% 64.56% +0.02% ========================================== Files 143 143 Lines 11023 11030 +7 ========================================== + Hits 7114 7121 +7 Misses 3388 3388 Partials 521 521 ```

:umbrella: View full report in Codecov by Sentry.
:loudspeaker: Have feedback on the report? Share it here.

chmouel commented 2 months ago

okay let's wait e2e tests now and this should be good to go for me,

gabemontero commented 2 months ago

all clean @chmouel

gabemontero commented 2 months ago

oh and thanks for the updates for CI @chmouel

gabemontero commented 1 month ago

@chmouel @savitaashture do you all cherrypick fixes to earlier version of PAC?

if so, I think this is a candidate.

chmouel commented 1 month ago

@gabemontero I am doing this right know in https://github.com/openshift-pipelines/pipelines-as-code/pull/1680#issuecomment-2095772963