nigoroll / varnish_timeouts

collaboration on a VIP for varnish timeout reform (VTR)

Idle and total timeouts #1

Open dridi opened 3 years ago

dridi commented 3 years ago

After reading the draft several times I have come to the conclusion that we were very wrong, and that stems from an incorrect understanding of SO_SNDTIMEO: it really is an in-kernel implementation of between_bytes_timeout but for delivery.

Today an idle_send_timeout of 1m doesn't mean that writev() will return after at most one minute plus epsilon, it means that it will return a short write if it observes one minute of idleness. So while the divide-and-conquer idea would work well with our prior misunderstanding of SO_SNDTIMEO, it would in fact make it even harder to respect today's send_timeout.
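To make this concrete, here is a minimal sketch (hypothetical helper names, not Varnish code) of the SO_SNDTIMEO semantics described above: the timeout bounds idleness between progress, not total delivery time, so a peer that drains slowly but steadily never trips it.

```c
/* Sketch of SO_SNDTIMEO behavior: an in-kernel between_bytes_timeout
 * for delivery, not a cap on total delivery time. Helper names are
 * illustrative, not actual Varnish internals. */
#include <errno.h>
#include <stdio.h>
#include <sys/socket.h>
#include <sys/time.h>
#include <sys/uio.h>

/* Arm an idle send timeout: writev() blocks only while the kernel sees
 * no progress; after `seconds` of idleness it gives up. */
static int
set_idle_send_timeout(int fd, time_t seconds)
{
	struct timeval tv = { .tv_sec = seconds, .tv_usec = 0 };

	return (setsockopt(fd, SOL_SOCKET, SO_SNDTIMEO, &tv, sizeof tv));
}

/* On timeout with partial progress, writev() returns a short write;
 * with zero progress it fails with EAGAIN/EWOULDBLOCK. As long as the
 * peer keeps making progress, total time in writev() is unbounded, so
 * this mechanism cannot implement a total send_timeout. */
static ssize_t
send_some(int fd, struct iovec *iov, int iovcnt)
{
	ssize_t n = writev(fd, iov, iovcnt);

	if (n < 0 && (errno == EAGAIN || errno == EWOULDBLOCK))
		fprintf(stderr, "idle for the full SO_SNDTIMEO interval\n");
	return (n);
}
```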

So I'm now of the opinion that we should use idle to mean between_bytes for resp, beresp and pipe alike.

That also means backend_idle_timeout becomes a misnomer with this assumption and my suggestion would be to pick something closer to its role: backend_pool_timeout. As in how long backend connections may stay in the pool, and that looks consistent with thread_pool_timeout in terms of semantics.

You also wanted a "total" timeout that seems to be missing from the draft (or was replaced by a VCL implementation) and after giving the matter more thought I have found something I think is appropriate:

* `req_task_timeout`
* `bereq_task_timeout`
* `pipe_task_timeout`

It's better than pipe_sess_timeout because sess might already be a subject [1] (unless we pick client as planned today) and since we have more than one type of client task, pinning this to [be]req is a good indicator of the scope. It also means that a sub-request could access req_top.task_timeout.

Of course if we can't guarantee send_timeout today, we won't be able to do better for resp_send_timeout, let alone req_task_timeout. We could depart from blocking sockets with SO_SNDTIMEO and poll using resp_idle_timeout, but that would significantly increase the number of context switches. We could keep resp_send_timeout and req_task_timeout as soft limits that would allow us to cap the other request timeouts.

Any objections or questions before I submit a pull request with the following changes?

* rename `backend_idle_timeout` to `backend_pool_timeout`
* introduce `req_task_timeout`, `bereq_task_timeout` and `pipe_task_timeout`
* remove the divide-and-conquer suggestion
* remove the section on pipe vs backend idle timeouts [2]

[1] of course cli_resp_timeout also has the type resp that is otherwise the subject of other timeouts...

[2] it seems we confused backend_idle_timeout with beresp_idle_timeout here

nigoroll commented 3 years ago

Thank you for the excellent insight and writeup!

So I'm now of the opinion that we should use idle to mean between_bytes for resp, beresp and pipe alike.

I think between_bytes really is a very good name and I would like to keep it. If even the two of us got the SO_*TIMEO semantics wrong, users will not know that idle in that context means the transfer of data must not stay idle for longer than the timeout. In my experience, between_bytes is something users get right intuitively.

backend_pool_timeout makes sense to me

* `req_task_timeout`
* `bereq_task_timeout`
* `pipe_task_timeout`

:+1:

* remove the divide-and-conquer suggestion
* remove the section on pipe vs backend idle timeouts [2]

:+1:

dridi commented 3 years ago

I will submit a pull request for backend_pool_timeout, *_task_timeout and the removal of some of the suggestions from the VIP.

I'm leaving the idle vs between_bytes question aside for now, as well as the "soft" nature of task and send timeouts.

dridi commented 3 years ago

With #2 merged, that leaves the following open questions in this thread:

I'm in favor of idle: to me it is appropriate, we can try to clear up any ambiguity in the documentation, and as a single word it fits better in the ${subject}_${type}_timeout nomenclature.

I'm also in favor of keeping the soft timeouts soft to ease the transition. Nothing would prevent us from hardening them later and evaluating the costs and benefits of moving away from the current blocking delivery. Status quo, not closing the door.

nigoroll commented 3 years ago

While I feel bad about not moving in your direction, I still have a strong preference for between_bytes as it is more descriptive: it is more likely to be understood as "the maximum time between bytes arriving". As an example, I think idle is a good name for the timeout before we close an unused connection, and a misleading one for a timeout on an active transaction. Regarding the soft limits, I agree that we should allow ourselves to define timeouts which are not yet fully enforced.