tower-rs / tower-http

HTTP specific Tower utilities.

ServeFile is slower than simply reading a file #480

Open ConsoleC137 opened 5 months ago

ConsoleC137 commented 5 months ago

Bug Report

Version

tower-http v0.5.2

Platform

Linux [..] 6.7.8-arch1-1 #1 SMP PREEMPT_DYNAMIC Sun, 03 Mar 2024 00:30:36 +0000 x86_64 GNU/Linux

Crates

[dependencies]
axum = "0.7.4"
tokio = { version = "1.36.0", features = ["full"] }
tower-http = { version = "0.5.2", features = ["fs"]}

Description

I ran the tests with the wrk utility. A friend also ran them with the rewrk utility and saw similar results.

I expected to see this happen: ServeFile and reading the file directly would show identical results.

Instead, this happened: ServeFile proved to be slower.

I tried this code:

use axum::{routing::get, Router};
use tower_http::services::ServeFile;

mod api;

#[tokio::main]
async fn main() {
    let html = std::fs::read_to_string("static/html/index.html").unwrap();

    let app = Router::new()
        .route("/1", get(api::index).with_state(html.leak()))
        .route("/2", get(api::index2))
        .nest_service("/3", ServeFile::new("static/html/index.html"));

    let listener = tokio::net::TcpListener::bind("127.0.0.1:3000")
        .await
        .unwrap();
    println!("listening on {}", listener.local_addr().unwrap());
    axum::serve(listener, app).await.unwrap();
}

api.rs:

use axum::{extract::State, response::{Html, IntoResponse}};

pub async fn index(State(state): State<&'static str>) -> impl IntoResponse {
    Html(state.to_string())
}

pub async fn index2() -> impl IntoResponse {
    let html = tokio::fs::read_to_string("static/html/index.html").await.unwrap();
    Html(html)
}

"/1" - the file is read into memory in advance
"/2" - the file is read on every request and then sent
"/3" - ServeFile

Test results:

In this test, the file is 10 KB.

Running 30s test @ http://127.0.0.1:3000/1
  4 threads and 1000 connections
  Thread Stats   Avg      Stdev     Max   +/- Stdev
    Latency     7.11ms    3.30ms  38.70ms   77.42%
    Req/Sec    23.39k     1.14k   26.86k    73.14%
  2791777 requests in 30.10s, 28.63GB read
Requests/sec:  92735.21
Transfer/sec:      0.95GB

Running 30s test @ http://127.0.0.1:3000/2
  4 threads and 1000 connections
  Thread Stats   Avg      Stdev     Max   +/- Stdev
    Latency    22.32ms   10.53ms  87.24ms   65.98%
    Req/Sec    11.21k   570.81    13.04k    71.82%
  1334820 requests in 30.04s, 13.69GB read
Requests/sec:  44435.60
Transfer/sec:    466.70MB

Running 30s test @ http://127.0.0.1:3000/3
  4 threads and 1000 connections
  Thread Stats   Avg      Stdev     Max   +/- Stdev
    Latency    62.08ms   16.15ms 156.63ms   79.43%
    Req/Sec     4.03k   217.68     4.70k    69.31%
  479628 requests in 30.02s, 4.94GB read
Requests/sec:  15976.21
Transfer/sec:    168.61MB

In this test, the file is 100 KB.

Running 30s test @ http://127.0.0.1:3000/1
  4 threads and 1000 connections
  Thread Stats   Avg      Stdev     Max   +/- Stdev
    Latency    20.03ms    8.55ms 111.49ms   76.94%
    Req/Sec     7.46k   539.25     8.72k    75.13%
  888612 requests in 30.10s, 87.79GB read
Requests/sec:  29526.30
Transfer/sec:      2.92GB

Running 30s test @ http://127.0.0.1:3000/2
  4 threads and 1000 connections
  Thread Stats   Avg      Stdev     Max   +/- Stdev
    Latency    63.14ms   28.77ms 225.17ms   66.77%
    Req/Sec     3.95k   321.27     5.55k    73.68%
  470250 requests in 30.10s, 46.46GB read
Requests/sec:  15621.18
Transfer/sec:      1.54GB

Running 30s test @ http://127.0.0.1:3000/3
  4 threads and 1000 connections
  Thread Stats   Avg      Stdev     Max   +/- Stdev
    Latency    89.73ms   25.94ms 264.17ms   70.83%
    Req/Sec     2.79k   154.59     3.18k    77.09%
  331839 requests in 30.04s, 32.83GB read
Requests/sec:  11044.78
Transfer/sec:      1.09GB

I am attaching an archive with my example so that you can try the tests yourself. There are 2 HTML files in the static/html folder: index.html and index-100.html. The first one is 10 KB, the second one is a little over 100 KB.

benchmark.zip

jplatte commented 5 months ago

Yeah, ServeFile is not particularly optimized. However, note that tokio::fs::read_to_string buffers the entire file contents in memory, which can be problematic for large files. Additionally, I would expect larger files to show less of a difference in performance.
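The wrk output above already seems to point that way. As a rough check, this small sketch divides the reported Requests/sec of "/1" and "/2" by that of "/3" for each file size. The numbers are copied from the results in this issue; the `speedup` helper and variable names are only illustrative. By this arithmetic, reading the file per request is roughly 2.8x faster than ServeFile at 10 KB but only about 1.4x faster at 100 KB.

```rust
/// Ratio of another endpoint's throughput to a baseline throughput
/// (both in requests per second).
fn speedup(baseline_rps: f64, other_rps: f64) -> f64 {
    other_rps / baseline_rps
}

fn main() {
    // Requests/sec from the wrk runs in this issue:
    // (file size, "/1" preloaded, "/2" read per request, "/3" ServeFile)
    let runs = [
        ("10 KB", 92735.21, 44435.60, 15976.21),
        ("100 KB", 29526.30, 15621.18, 11044.78),
    ];

    for &(size, preloaded, read_per_request, serve_file) in runs.iter() {
        println!(
            "{}: /1 is {:.2}x ServeFile, /2 is {:.2}x ServeFile",
            size,
            speedup(serve_file, preloaded),
            speedup(serve_file, read_per_request),
        );
    }
}
```

This is a single benchmark on one machine, so the ratios are only indicative.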

In any case, optimizing that service would be nice. If I recall correctly, it's currently a wrapper around ServeDir, so it will even do filetype detection and potentially other things once per request, rather than once on construction.
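To illustrate what "once on construction" could look like, here is a minimal std-only sketch. It is not tower-http's code: `PreparedFile` and `guess_mime` are hypothetical names, and the extension table is a stand-in for a real MIME lookup. The idea is simply that the content type is resolved when the service is built, so per-request work only reads a cached value.

```rust
use std::path::{Path, PathBuf};

/// Hypothetical stand-in for a MIME lookup based on the file extension.
fn guess_mime(path: &Path) -> &'static str {
    match path.extension().and_then(|ext| ext.to_str()) {
        Some("html") | Some("htm") => "text/html",
        Some("css") => "text/css",
        Some("js") => "text/javascript",
        _ => "application/octet-stream",
    }
}

/// Hypothetical single-file service that resolves its metadata up front.
struct PreparedFile {
    path: PathBuf,
    content_type: &'static str,
}

impl PreparedFile {
    /// The file type detection runs here, once, at construction.
    fn new(path: impl Into<PathBuf>) -> Self {
        let path = path.into();
        let content_type = guess_mime(&path);
        Self { path, content_type }
    }

    /// Per-request code only returns the cached value.
    fn content_type(&self) -> &'static str {
        self.content_type
    }
}

fn main() {
    let file = PreparedFile::new("static/html/index.html");
    println!("{} -> {}", file.path.display(), file.content_type());
}
```

Whether this accounts for a meaningful share of the gap measured above would need profiling; the sketch only shows the shape of the change.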