varnishcache / varnish-cache

Varnish Cache source code repository
https://www.varnish-cache.org
Other
3.56k stars 365 forks source link

http: Skip req and beresp trailers #4125

Open dridi opened 2 weeks ago

dridi commented 2 weeks ago

This is the first step of the following plan:

The first step, this pull request, should allow applications producing trailers to work with Varnish when the trailers are not strictly required.

The second step should allow protocols like gRPC to pass through Varnish, with the caveats that body filters could break the meaning of trailers (for example checksums), hence the experimental flag.

It will require internal changes, and from the look of it we may not be able to directly use the existing data structures and we may need to a dedicated OA_TRAILERS object attribute.

The last step will likely require changes to the VCL state machines in addition to new symbols.

Most of the work on trailers was done by @walid-git under my supervision.

This pull request includes work from other pull requests too:

This is a lot of commits, but they should all have a reasonable size for reviewers. I already reviewed Walid's work, and already addressed my own review items while Walid is away. I added Walid or myself as a co-author when I made significant changes (at the scale of individual commits).

dridi commented 2 weeks ago

All failing platforms appear to overflow the workspace used in the h2 test case, except the sanitizers job that fails on something else.

I will have a closer look later.

dridi commented 1 week ago

As it turned out, the failing tests were exposing a race revolving around the request body status. Transitioning to BS_TAKEN would compete against the new transition to BS_TRAILERS (happening in different threads in the h2 case).

I took care of that and added a couple patches next to the ones picked from #3798 at the beginning of this patch series. I'm no longer able to stress test cases and cause failures and I believe the race is gone.

I have not yet studied the remaining CI failure, triggering a panic from the workspace emulator. I suspect either a logic error in the workspace pipelining (expanding from req keep-alive only to req and beresp trailers) or an error in how it translates in the workspace emulator.

I also started an effort to improve feature parity between HTTP/1 and h2 txreq in varnishtest but this is not ready.

dridi commented 1 week ago

As I suspected, the emulated WS_Pipeline() was incorrect. There was a missing workspace release too.

dridi commented 3 days ago

bugwash:

The following commits can be merged: https://github.com/varnishcache/varnish-cache/compare/23ec1fc3fb1198486350e1aedceb5b363d6e429c...98752585d7e086a685816a2934860c72ac0a92c6

The workspace refactoring commits can be submitted independently.

bsdphk commented 3 days ago

I'm fine with you commiting those first six commits while we wait for @nigoroll

dridi commented 2 days ago

As per bugwash: