Description
the worker has a log streaming timeout setting (defaults to 5 minutes) - https://github.com/go-vela/worker/blob/b45d0ce710ef208ca1330fc6904c15a38e6d08c7/executor/flags.go#L33-L38. this setting is intended to give containers some padded time after the build has finished to wrap up streaming logs.
the intention is for this timeout to be used only if there is actual log activity, i.e. log streaming should exit as soon as there isn't any. however, in the current implementation, the worker cannot pick up new workloads for the full log streaming timeout, even when all containers are done producing logs.
this issue is limited to pipelines that use services. for a reproducible example you can use https://go-vela.github.io/docs/usage/examples/postgres/. the pipeline executes the 15s sleep, performs an action, and finishes. the build is marked completed appropriately, but the service keeps running and waits for the log streaming timeout to expire before the worker is able to accept a new workload.
this can result in unnecessary queue buildup (or unnecessary worker pool scaling), since workers could relieve that pressure if they exited log streaming as soon as the logs were done.
Workaround
lowering the timeout
Value
more efficient workers
Useful Information
vela --version
?0.22.0
example pipeline to test: https://go-vela.github.io/docs/usage/examples/postgres/