actix / actix-web

Actix Web is a powerful, pragmatic, and extremely fast web framework for Rust.
https://actix.rs
Apache License 2.0
21.6k stars 1.67k forks source link

Streaming response - inconsistent behaviour #1267

Closed bekh6ex closed 4 years ago

bekh6ex commented 4 years ago

I've ran into the following issue recently while returning chunked encoded response:

I stream some response by chunk and some chunks are empty. What happens is that if I request the endpoint requiring compression I'm getting the full response, but if I request it uncompressed I get only the part of data (the one before the first empty chunk).

It aligns with HTTP standard in a way: chunked response is terminated by a chunk with 0 length, but makes the usage of the framework a bit confusing (it took me a day to figure out the reason of this wired behavior).

My current suggestion would be that actix-web should just drop empty chunks from the given stream (this is what I did to solve the issue), but there might be another view on that.

Code to reproduce

Cargo.toml

[package]
name = "actix-web-streaming-bug"
version = "0.1.0"
edition = "2018"

[dependencies]
actix-web="2.0.0"
actix-rt = "1.0.0"
bytes = "0.5.3"
futures = "0.3.1"

main.rs

use actix_web::{middleware, web, App, HttpRequest, HttpResponse, HttpServer, Responder};
use bytes::Bytes;

use futures::stream::iter;

use futures::{self, StreamExt};

async fn handle_request(_req: HttpRequest) -> impl Responder {
    let response = vec![
        Bytes::from("1\n"),
        Bytes::from(""),
        Bytes::from("2\n")
    ];
    HttpResponse::Ok().streaming(iter(response).map(|b| Ok(b) as Result<Bytes, ()>))
}

#[actix_rt::main]
async fn main() -> std::io::Result<()> {
    HttpServer::new(move || {
        App::new()
            .wrap(middleware::Compress::default())
            .service(web::resource("/").to(handle_request))
    })
    .bind("127.0.0.1:8080")
    .unwrap()
    .workers(1)
    .run()
    .await
}
$ curl 127.0.0.1:8080
1

$ curl --compressed 127.0.0.1:8080
1
2
tyranron commented 4 years ago

@JohnTitor I'll try to dig and propose a fix.

tyranron commented 4 years ago

@JohnTitor ok, in the nutshell @bekh6ex has give a problem description and cause:

It aligns with HTTP standard in a way: chunked response is terminated by a chunk with 0 length, but makes the usage of the framework a bit confusing (it took me a day to figure out the reason of this wired behavior).

From Wiki about Chunked transfer encoding:

Each chunk is preceded by its size in bytes. The transmission ends when a zero-length chunk is received. The chunked keyword in the Transfer-Encoding header is used to indicate chunked transfer.

Currently, any Stream provided into Response just polled "as is":

impl<S, E> MessageBody for BodyStream<S, E>
where
    S: Stream<Item = Result<Bytes, E>>,
    E: Into<Error>,
{
    fn size(&self) -> BodySize {
        BodySize::Stream
    }

    fn poll_next(&mut self, cx: &mut Context<'_>) -> Poll<Option<Result<Bytes, Error>>> {
        unsafe { Pin::new_unchecked(self) }
            .project()
            .stream
            .poll_next(cx)
            .map(|res| res.map(|res| res.map_err(std::convert::Into::into)))
    }
}

So, I assume there are following ways to deal with this situation:

  1. Do not do anything. Consider this behavior as intentional to align with HTTP standard and document this explicitly in API docs.

    • 👍 no performance impact at all
    • 👎 possible source of future confusions despite the docs
  2. Filter out empty chunks from the given Stream, so finish streaming when the Stream ends, but not on the first empty chunk. Still document this in API docs.

    • 👍 obvious and natural behavior for the user code, no gotchas
    • 👎 minor performance impact as every chunk should be examined for emptiness
  3. Provide an alternative API method (like .streaming_skip_empty()) to ResponseBuilder with the behavior described in point 2 above. Document both properly.

    • 👍 obvious and natural behavior for the user code, no gotchas
    • 👍 no performance impact, as API user chooses the desired behavior for his situation
    • 👎 complicates API

Regarding my point of view, I'm in favor of point 2, because:

ack @robjtede @Dowwie @mitsuhiko
It would be nice to have your opinions as well. Thanks!

Once we decide the way, I'll submit an appropriate PR for this.

robjtede commented 4 years ago

+1 for action 2. Keeping the behavior doesnt do anyone any good.

Consider this a bug ?

mitsuhiko commented 4 years ago

I think the tiny performance impact seems reasonable. This is an unnecessary gotcha.

godofdream commented 4 years ago

As a User I would assume "2." is default. I wouldn't have read the api docs, because my ide autofills the function.

cdbattags commented 4 years ago

I got bit by this at the very beginning of my actix-web experience as well! Thanks for a concise write-up, benchmark and fix!