varnishcache / varnish-cache

Varnish Cache source code repository
https://www.varnish-cache.org
Other
3.64k stars 374 forks source link

VIP30, plumbing a better vcl_pipe #3471

Open bsdphk opened 3 years ago

bsdphk commented 3 years ago

See: https://github.com/varnishcache/varnish-cache/wiki/VIP30%3A-Plumbing%3A-vcl_raw%28%29-and-vcl_pipe%28%29

This ticket for discussion.

nigoroll commented 3 years ago

I do fully agree that we should make improvements in this area and I think the general approach of adding vcl_raw {} and a new "pipe-fsm" is the right one.

vcl_raw {} not only lays the foundation for the new pipe handling, but also allows moving C-code handling to VCL, as with the X-F-F case in #3453

Low overhead pipe handling in a single (or a low number of) thread(s) will substantially improve a relevant practical use case.

So, in short, I am very much :+1: overall.

Suggested clarifications

separate fsms == separate tasks

My understanding is that the tunnel fsm would be separate from both the client and the backend fsm. This implies that it would also represent a separate task wrt PRIV_TASK, as vcl_raw {} would.

So we should make clear that the scope of PRIV_TASKs created within vcl_raw {} and vcl_tunnel_* {} will be limited that (set of) sub(s).

separate fsms == separate subset of vcl subs

the VIP specifies return(synth) from vcl_tunnel{} and vcl_raw{}. These would require separate incarnation of the synth sub, possibly called vcl_tunnel_synth {} and vcl_raw_synth {}.

vcl_raw {}, no rollbacks/restarts and req0

We should clarify that vcl_raw {} does neither support rollbacks nor restarts.

As whatever modification done in vcl_raw {} will be the state that the client fsm would roll back to, we should use req0 (or a another name different from req) in vcl_raw {}.

return(pipe) from vcl_tunnel_* {}

We should use a different name to disambiguate from the existing return(pipe). For example, return(stream) or return(connect).

Suggested changes

Keep vcl_pipe {} for transition

As much as we want to replace vcl_pipe {} with a better design, I think we should keep it in a VCL/vmod compatible way for 1-2 releases to allow for a smooth transition to the new concept on the side of VCL authors. While we could reimplement vcl_pipe {} to make use of the new new plumbing, as long as we want to keep it in a VCL compatible way, we also need to find a way to live with its issues (see the vcl_pipe saga #3408 / #3470).

On the proposed "tunnel" VCL syntax / concept.

I am not sure the new concept of a single sub to handle both req.* and beresp.* will help VCL authors.

To start with, which values would beresp.http.* return for the initial call to the tunnel sub? Also, every VCL author would need to learn a new concept.

return(tunnel(backend)) looks neat for simple cases, but we need to consider cases where backend selection is a more elaborate process, and making it happen in vcl_raw {} for the initial attempt but in vcl_tunnel {} for retries seems cumbersome.

So I think we should mirror the existing request/response sub concept from vcl_backend_* {} (for example, as vcl_tunnel_{request,response} {}) and mirror existing facilities:

Related comments

return(reset)

This could also be used by a VSL parser to implement external blacklisting (similar to fail2ban), so should we add a reason to only go to the log?

return(vcl(name))

In my mind, vcl_raw {} would be the right place for VCL switching, and also the only place.

H2

Even if vcl_tunnel {} would only work with HTTP1, I think it would be good to see how VCL for H1 in H2 tunneling would look like before we even implement the proposed new HTTP1 tunneling.

nigoroll commented 3 years ago

notes from bugwash from sejerman wis se bullat points:

nigoroll commented 3 years ago

Suggestion for 100-continue handling from pow-wow:

builtin.vcl mockup:

vcl_raw {
    if (req0.http.Expect == "100-continue") {
        std.send_100_continue();
        unset req0.http.Expect;
    }
}
dridi commented 3 years ago

Suggestion related to this topic:

The suffixes "begin" and "end" match VSL tags delimiting transactions.

In that spirit, the role of vcl_begin is to decide whether we formally receive a request, tunnel it to a backend or stop right there.

nigoroll commented 3 years ago

I think @Dridi 's suggestion makes sense. The only drawback I can see is lack of symmetry due to vcl_backend_begin and vcl_tunnel_begin missing.

dridi commented 3 years ago

We can add those later if we ever have a case for them.

nigoroll commented 3 years ago

yes, that was not meant as an argument against your suggestion.