tower-rs / tower-http

HTTP specific Tower utilities.

CompressionLayer breaks range requests, violating RFC 7233 #416

Closed and-reas-se closed 7 months ago

and-reas-se commented 11 months ago

Bug Report

Version

tower-http v0.4.3

Platform

Linux 6.5.2 x86_64

Description

Say you have a large file served by a ServeDir with a CompressionLayer on top. ServeDir supports range requests. If the client requests bytes x to y of a file and the response is compressed, the range should be interpreted as bytes x to y of the compressed representation. But these two in combination interpret it as uncompressed bytes x to y, so the wrong set of bytes is returned.

Furthermore the content-range header is set incorrectly in the response. It's also based on uncompressed bytes rather than compressed.
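To make the mismatch concrete, here is a minimal sketch that uses a toy run-length encoder as a stand-in for gzip. All names in it (`toy_encode`, `byte_range`, `range_of_encoded`, `encoded_range_of_identity`) are hypothetical and are not part of tower-http; the only point is that slicing the encoded bytes and encoding a slice of the original bytes give different results.

```rust
/// Toy run-length encoder standing in for a real content coding such as gzip.
/// Each run of identical bytes becomes a count byte followed by the byte itself.
fn toy_encode(input: &[u8]) -> Vec<u8> {
    let mut out = Vec::new();
    let mut i = 0;
    while i < input.len() {
        let byte = input[i];
        let mut run = 1usize;
        // Extend the run while the next byte matches, capped at 255 so the count fits in a u8.
        while i + run < input.len() && input[i + run] == byte && run < 255 {
            run += 1;
        }
        out.push(run as u8);
        out.push(byte);
        i += run;
    }
    out
}

/// Selects the inclusive range `first..=last`, as `Range: bytes=first-last` would.
/// Panics if the range is out of bounds; a real server would answer 416 instead.
fn byte_range(body: &[u8], first: usize, last: usize) -> &[u8] {
    &body[first..=last]
}

/// What the client expects: the range applied to the encoded representation.
fn range_of_encoded(file: &[u8], first: usize, last: usize) -> Vec<u8> {
    byte_range(&toy_encode(file), first, last).to_vec()
}

/// What the layered setup described above produces: the range is applied to
/// the uncompressed bytes first, and that slice is then encoded.
fn encoded_range_of_identity(file: &[u8], first: usize, last: usize) -> Vec<u8> {
    toy_encode(byte_range(file, first, last))
}
```

For a file of six `a` bytes followed by six `b` bytes, requesting bytes 0 to 1 yields `[6, b'a']` from `range_of_encoded` but `[2, b'a']` from `encoded_range_of_identity`, so a client that splices the second result into the encoded stream ends up with corrupt data.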

I ran into this problem in a real-life scenario. It intermittently breaks Microsoft's Azure CDN (Front Door), which will sometimes use range requests when requesting files from the origin.

The workaround that solved the problem for me was to write a small axum middleware that strips the range header from requests and the accept-ranges header from responses (reproduced below).

use axum::{http::Request, middleware::Next, response::Response};

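pub async fn remove_range<B: std::fmt::Debug>(mut req: Request<B>, next: Next<B>) -> Response {
    // Drop the Range header so the inner service always produces the full body.
    req.headers_mut().remove("range");
    let mut response = next.run(req).await;
    // Stop advertising range support, since ranges are no longer honored.
    response.headers_mut().remove("accept-ranges");
    response
}

jplatte commented 11 months ago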

Right, so the (de)compression middlewares need to either strip or, if possible, adjust these headers. Makes sense.

jplatte commented 10 months ago

Actually maybe the better solution is for the middlewares to disable themselves on such requests. Otherwise requesting the end of a huge file could lead to a full re-transmission for no reason.
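The proposal in this comment can be sketched as a plain decision function, independent of any framework. This is only an illustration under stated assumptions: `has_header` and `should_compress` are hypothetical helpers, headers are modeled as simple name/value pairs rather than a real `HeaderMap`, and also checking the response for `content-range` is an assumption added here, not something the comment specifies.

```rust
/// Hypothetical helper: true if `headers` contains `name` (case-insensitive).
fn has_header(headers: &[(&str, &str)], name: &str) -> bool {
    headers.iter().any(|(key, _)| key.eq_ignore_ascii_case(name))
}

/// Hypothetical policy: leave the response uncompressed whenever a range is
/// involved, so the byte offsets the client asked for stay meaningful.
fn should_compress(
    request_headers: &[(&str, &str)],
    response_headers: &[(&str, &str)],
) -> bool {
    // The client asked for a byte range.
    let range_requested = has_header(request_headers, "range");
    // The inner service answered with a partial body (assumption: checked as a safeguard).
    let partial_response = has_header(response_headers, "content-range");
    !range_requested && !partial_response
}
```

With this policy, a request carrying `Range: bytes=0-99` passes through uncompressed, while an ordinary request is still eligible for compression.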

jplatte commented 10 months ago

Given David's approval of my previous comment, a PR implementing it would be welcome.

and-reas-se commented 10 months ago

What if a client starts downloading a large file using a non-range request, the connection drops partway through, and then the client tries to resume the download using a range request? Wouldn't you get a corrupted file if the compression middlewares are enabled for the first request and disabled for the second?
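The resume scenario can be sketched without any real compression. In the hypothetical `resumed_download` function below, `encoded` stands for the compressed body sent in response to the first request and `identity` for the uncompressed file; both are supplied by the caller, and the function assumes a client that simply appends the resumed bytes to what it already has.

```rust
/// Hypothetical model of a resumed download: the client keeps the first
/// `received` bytes of the compressed body, then asks for
/// `Range: bytes=received-` and gets the tail of the uncompressed file
/// because compression was skipped for the range request.
/// Panics if `received` exceeds the length of either slice.
fn resumed_download(encoded: &[u8], identity: &[u8], received: usize) -> Vec<u8> {
    // Bytes already on disk from the interrupted, compressed transfer.
    let mut out = encoded[..received].to_vec();
    // Bytes returned for the range request, taken from the uncompressed file.
    out.extend_from_slice(&identity[received..]);
    out
}
```

If `identity` is six `a` bytes followed by six `b` bytes and `encoded` is a toy encoding `[6, b'a', 6, b'b']`, resuming after 2 bytes produces a 12-byte result that matches neither the encoded body nor the original file.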

jplatte commented 10 months ago

Hm, I see what you mean. Really, the whole range-requests-for-compressed-data thing critically relies on the server having the full compressed content cached. That is annoying. Maybe the only solution is really what you wrote then. But there's another problem: if ServeDir supported compression itself, wrapping it in a compression layer should be a no-op, but as written it will also filter out those headers 😕

seanmonstar commented 10 months ago

I doubt this is the first time this has been figured out. Perhaps other server frameworks in other languages could show what to do.