Open ashrafguitoni opened 10 months ago
This issue is stale because it has been open for 90 days with no activity. It will automatically close after 30 more days of inactivity. Reopen the issue with /reopen. Mark the issue as fresh by adding the comment /remove-lifecycle stale.
This probably means that requests are timing out and getting cancelled; max-revision-timeout-seconds is 10m by default.
@skonto Well, the defaults are set to this:
revision-timeout-seconds: "7200"
max-revision-timeout-seconds: "7201"
revision-response-start-timeout-seconds: "7200"
And the per-revision timeoutSeconds and responseStartTimeoutSeconds are also set to a high value, much higher than 10 minutes. Are there any other reasons why this could be happening?
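For readers comparing their own settings, here is a minimal Python sketch that checks whether the three values quoted above are ordered consistently. The two rules it encodes (the revision timeout must not exceed the maximum, and the response-start timeout must not exceed the revision timeout) are an assumption about how these settings relate, not something taken from the Knative source. `check_timeouts` is a hypothetical helper, not part of any Knative tooling.

```python
def check_timeouts(revision_timeout, max_revision_timeout, response_start_timeout):
    """Return a list of ordering problems between the three timeout values (seconds).

    Assumed rules (not taken from the Knative source):
      - revision-timeout-seconds <= max-revision-timeout-seconds
      - revision-response-start-timeout-seconds <= revision-timeout-seconds
    """
    problems = []
    if revision_timeout > max_revision_timeout:
        problems.append(
            "revision-timeout-seconds exceeds max-revision-timeout-seconds"
        )
    if response_start_timeout > revision_timeout:
        problems.append(
            "revision-response-start-timeout-seconds exceeds revision-timeout-seconds"
        )
    return problems


# Values quoted in the comment above.
settings = {
    "revision-timeout-seconds": 7200,
    "max-revision-timeout-seconds": 7201,
    "revision-response-start-timeout-seconds": 7200,
}

issues = check_timeouts(
    settings["revision-timeout-seconds"],
    settings["max-revision-timeout-seconds"],
    settings["revision-response-start-timeout-seconds"],
)
print(issues or "timeouts are ordered consistently")
```

Under these assumed rules the quoted values pass, which is consistent with the point being made: the 502s are unlikely to come from a 10-minute cap on these particular settings.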
/area networking
What version of Knative?
Expected Behavior
Requests complete successfully.
Actual Behavior
Requests fail with a 502 error code even though the user container returns a 200 response. See the Jaeger tracing visualization:
Looking at the queue proxy logs, I see the following error message:
Steps to Reproduce the Problem
Since we're hitting this issue in a fairly complex system, I don't have a fully reproducible example yet, but our team is working on one.
What we have noticed for sure, though, is that this issue seems to happen only when traffic exceeds the revisions' current capacity (so autoscaling is triggered). We use target-burst-capacity: -1 for our services.
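To make the effect of that setting concrete, here is a rough Python sketch of when the activator would be expected to sit in the request path for a given target-burst-capacity. It is a simplified model based on the documented meaning of the setting (-1 keeps the activator in the path at all times, 0 only when the revision is scaled to zero). The arithmetic for positive values, and the names `activator_in_path`, `ready_pods`, `capacity_per_pod`, and `observed_concurrency`, are assumptions made for illustration.

```python
import math


def activator_in_path(target_burst_capacity, ready_pods, capacity_per_pod,
                      observed_concurrency):
    """Simplified, assumed model of whether requests are routed via the activator."""
    if target_burst_capacity < 0:
        # -1: the activator always stays in the request path.
        return True
    if target_burst_capacity == 0:
        # 0: the activator is only needed while the revision is scaled to zero.
        return ready_pods == 0
    # Positive values: the activator stays in the path while spare capacity
    # is below the requested burst headroom (assumed formula).
    excess = math.floor(
        ready_pods * capacity_per_pod - target_burst_capacity - observed_concurrency
    )
    return excess < 0


# With -1, as used here, load and pod count make no difference.
print(activator_in_path(-1, ready_pods=10, capacity_per_pod=100, observed_concurrency=0))
```

If this model holds, every request in the setup described above passes through the activator before reaching the queue proxy, both before and during a scale-up.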