varnishcache / varnish-cache

Varnish Cache source code repository
https://www.varnish-cache.org
Other
3.68k stars 377 forks source link

net::ERR_CONNECTION_CLOSED on Chrome 69 #2775

Closed CarloCannas closed 6 years ago

CarloCannas commented 6 years ago

Some resources requested on h2 may fail to load with error net::ERR_CONNECTION_CLOSED using Chrome 69.

I received yesterday the latest Chrome upgrade (Chrome 69) and I started noticing some missing images on a website that uses Varnish with HTTP/2 enabled. The problem is not always reproducible: it may load everything fine without errors, but on some pages the error shows up after a few reloads. I was able to reproduce it only on Chrome 69, previous Chrome versions or Firefox seems unable to trigger the bug. I dumped the HTTP/2 traffic between the browser and Varnish: it showed a PRIORITY frame from Chrome, followed immediately by a GOAWAY from Varnish and the closing of the TCP stream. The PRIORITY frame was sent to a stream just closed. That's exactly the problem: Varnish doesn't accept PRIORITY frames on a closed stream and treats them as a protocol violation. This is however not a protocol violation: rfc7540,l,1130,1131 states that

PRIORITY can be sent and received in any stream state

bin/varnishd/http2/cache_http2_proto.c:L928 is the culprit: that return ultimately triggers the GOAWAY, this codepath is reached whenever a PRIORITY frame is received and the corresponding stream doesn't appear in h2->streams.

A minimal testcase which shows the issue:

varnishtest "PRIORITY on closed stream"

server s1 {
    rxreq
    txresp
}

varnish v1 -vcl+backend {} -start

varnish v1 -cliok "param.set feature +http2"
varnish v1 -cliok "param.set debug +syncvsl"

client c1 {
    stream 1 {
        txreq
        rxresp

        delay .1

        txprio
    } -run
    stream 3 {
        txreq
        rxresp
    } -run
} -run
dugwood commented 6 years ago

Seems that 6.0.1 and 6.1.0 are both affected. I've tried to find out if other «big» websites are impacted, for example https://www.fastly.com/customers and can't reproduce.

Perhaps they're still using 5.2 release? Hum, no, just tested 5.2.1 and it's buggy too.

Any plans to release this shortly?

Thanks for the patch!

CarloCannas commented 6 years ago

I read that they use a custom version of Varnish 2.1, so their HTTP2 is probably not provided by Varnish. I too tried to find a big website affected, but it looks like they use something else to do the HTTP2 termination. My patch is still work in progress, but a pull request should arrive soon.

fgsch commented 6 years ago

Fastly's http/2 is based on h2o.

dugwood commented 6 years ago

@CarloCannas thanks for all your work on this bug. Do you think it will be available as a bugfix release when all checks are passed? Or will we need to wait until March for this to be fixed (2 releases a year schedule)?

So far disabling H2 was the only viable option (Chrome 69 is used by more than 25% of all my clients' visitors).

EDIT: disabled H2 by disabling it Hitch alpn-protos, kept Varnish's configuration as is.

CarloCannas commented 6 years ago

@dugwood I'm sorry, but I have no idea. I was planning to ask the maintainers the same thing once the code was merged.

@bsdphk, @Dridi, @daghf: are you going to do a new release in the near future which includes this fix?

We did the same thing on our Varnish instances: we disabled HTTP2 and currently waiting to have an idea of when the release which fixes this will be available.

dridi commented 6 years ago

Yes, 6.0 is the new LTS release so you should expect at the very least a 6.0.2 ideally after this patch is merged and back-ported. Getting the latest h2 fixes in 6.0.2 is our goal.

ingvarha commented 6 years ago

I tried backporting this to 6.0.1 with a bit strange results. My attempt on the backport here: https://paste.fedoraproject.org/paste/1ZOgLI4bAvY4s~0GrMT4-Q

When I build with this, make check fails on tests/r02387.vtc and tests/t02003.vtc :

r02387.log: https://paste.fedoraproject.org/paste/afgnxYpsjnmUv~oRp44ZgQ t02003.log: https://paste.fedoraproject.org/paste/JfqRfKgzkbEZnPyPU2SE2w

Does anybody have a working version of the patch for varnish-6.0.1?

Ingvar

CarloCannas commented 6 years ago

@ingvarha it looks like you only applied the last commit of the pull request, you should apply all three. Please note however that a new 6.0.2 release is going to be released, which includes this fix.

dridi commented 6 years ago

@ingvarha, please give a try to #2802, it is meant to include any bug fix or new feature that doesn't touch the ABIs we care about (VMODs and libvarnishapi consumers).