tektoncd / results

Long term storage of execution results.
Apache License 2.0
78 stars 74 forks source link

[WIP] Logs API apporach to fix race condition due to pruning in results watcher #713

Closed ramessesii2 closed 2 months ago

ramessesii2 commented 9 months ago

Changes

Fixes #514 /kind bug

Submitter Checklist

These are the criteria that every PR should meet, please check them off as you review them:

Release Notes

Free up resources (PipelineRun/TaskRun) potentially without any race conditions in pruning w.r.t to streaming logs even with no Grace Period. 
tekton-robot commented 9 months ago

[APPROVALNOTIFIER] This PR is NOT APPROVED

This pull-request has been approved by: To complete the pull request process, please assign dibyom after the PR has been reviewed. You can assign the PR to them by writing /assign @dibyom in a comment when ready.

The full list of commands accepted by this bot can be found here.

Needs approval from an approver in each of these files: - **[OWNERS](https://github.com/tektoncd/results/blob/main/OWNERS)** Approvers can indicate their approval by writing `/approve` in a comment Approvers can cancel approval by writing `/approve cancel` in a comment
tekton-robot commented 9 months ago

The following is the coverage report on the affected files. Say /test pull-tekton-results-go-coverage to re-run this coverage report

File Old Coverage New Coverage Delta
pkg/watcher/reconciler/dynamic/dynamic.go 69.3% 66.0% -3.3
pkg/watcher/results/logs.go 50.0% 40.9% -9.1
ramessesii2 commented 9 months ago

/hold For #704

tekton-robot commented 9 months ago

@ramessesii2: The following test failed, say /retest to rerun all failed tests or /retest-required to rerun all mandatory failed tests:

Test name Commit Details Required Rerun command
pull-tekton-results-build-tests 52942275a18350262c5db3a1af78c1dd33e0505c link true /test pull-tekton-results-build-tests

Full PR test history. Your PR dashboard.

Instructions for interacting with me using PR comments are available [here](https://git.k8s.io/community/contributors/guide/pull-requests.md). If you have questions or suggestions related to my behavior, please file an issue against the [kubernetes/test-infra](https://github.com/kubernetes/test-infra/issues/new?title=Prow%20issue:) repository. I understand the commands that are listed [here](https://go.k8s.io/bot-commands).
tekton-robot commented 7 months ago

@ramessesii2: PR needs rebase.

Instructions for interacting with me using PR comments are available [here](https://git.k8s.io/community/contributors/guide/pull-requests.md). If you have questions or suggestions related to my behavior, please file an issue against the [kubernetes/test-infra](https://github.com/kubernetes/test-infra/issues/new?title=Prow%20issue:) repository.
gabemontero commented 4 months ago

FYI @khrm now that #760 has merged and this WIP can be rebased and updated to use it, I'll also cross reference with the other thread we have been working, namely the rewrite of log storage, https://github.com/tektoncd/results/issues/763 , to access S3 directly from the watcher.

With that change, presumably the fix for the race condition becomes much simpler / moot, as the watcher will learn directly of error writing the logs to external storage.

I suppose it is possible that leveraging the #760 changes could still provide value as part of coordination, and of course there is the question of staging and when #763 is done (though I hope to get it prioritized on our team's end when we can next talk to Koustav.

@sayan-biswas @ramessesii2 @avinal @enarha @vdemeester FYI

khrm commented 2 months ago

We can close this because we are changing the Logging approach.

khrm commented 2 months ago

/close

khrm commented 2 months ago

And to fix the racing condition we will have a finalizer now. https://github.com/tektoncd/results/pull/797