jmcnamara / rust_xlsxwriter

A Rust library for creating Excel XLSX files.
https://crates.io/crates/rust_xlsxwriter
Apache License 2.0
250 stars, 23 forks

feature request: add optional support for the ryu crate to speed up writing numeric worksheet data #93

Closed: jmcnamara closed this issue 1 month ago

jmcnamara commented 1 month ago

Add a Cargo.toml feature to optionally use the ryu crate to speed up writing worksheet numeric data.

This helps when writing large numeric data sets. In the benchmark below the break-even point is around 300,000 numeric cells: below that the ryu build is slightly slower, and at 5,000,000 numeric cells it is about 30% faster. See the table below.

Number of cells    Ryu speed relative to standard
100,000             95%
300,000            100%
500,000            105%
750,000            110%
1,000,000          114%
2,000,000          124%
3,000,000          127%
4,000,000          129%
5,000,000          131%

Excel chart

The benchmark was run using the following program:


use rust_xlsxwriter::{Workbook, XlsxError};
use std::env;

fn main() -> Result<(), XlsxError> {
    let args: Vec<String> = env::args().collect();

    let mut workbook = Workbook::new();

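    // Worksheet width: the requested cells are spread across this many columns.
    let col_max: u16 = 50;

    // The first argument is the total number of cells to write (default 4,000).
    let mut row_max = match args.get(1) {
        Some(arg) => arg.parse::<u32>().unwrap_or(4_000),
        None => 4_000,
    };

    // Convert the total cell count into a row count.
    row_max /= u32::from(col_max);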

    let worksheet = workbook.add_worksheet();

    for row in 0..row_max {
        for col in 0..col_max {
            worksheet.write_number(row, col, 12345.0)?;
        }
    }

    workbook.save("rust_perf_test.xlsx")?;

    Ok(())
}

The timings were collected with hyperfine:

$ hyperfine --warmup 3 "./app_perf_test4_orig 5000000" "./app_perf_test4_ryu 5000000"
Benchmark 1: ./app_perf_test4_orig 5000000
  Time (mean ± σ):      5.859 s ±  0.061 s    [User: 5.528 s, System: 0.306 s]
  Range (min … max):    5.767 s …  5.946 s    10 runs

Benchmark 2: ./app_perf_test4_ryu 5000000
  Time (mean ± σ):      4.462 s ±  0.035 s    [User: 4.150 s, System: 0.298 s]
  Range (min … max):    4.414 s …  4.523 s    10 runs

Summary
  ./app_perf_test4_ryu 5000000 ran
    1.31 ± 0.02 times faster than ./app_perf_test4_orig 5000000
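
As a quick check, the 131% entry in the table can be reproduced from the two mean times reported by hyperfine. The sketch below only does that arithmetic; the function name `relative_speed` is made up for illustration.

```rust
/// Relative speed of the ryu build: mean time of the standard build
/// divided by the mean time of the ryu build.
fn relative_speed(standard_secs: f64, ryu_secs: f64) -> f64 {
    standard_secs / ryu_secs
}

fn main() {
    // Mean times copied from the hyperfine run for 5,000,000 cells.
    let ratio = relative_speed(5.859, 4.462);

    // Express the ratio both as a multiplier and as a percentage.
    println!("ratio: {:.2}, percentage: {:.0}%", ratio, ratio * 100.0);
}
```

The same calculation applied to the mean times at other cell counts would give the remaining rows of the table.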

The change was only added to the worksheet number cell writing code since that is the only part of the inner loop where it would have a substantive effect.
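
For readers unfamiliar with optional Cargo features, here is a minimal sketch of how a number-to-string step can be switched at compile time. It is not the crate's actual internal code: the helper name `number_to_string` is hypothetical, and the example assumes a Cargo feature called `ryu` that pulls in the ryu crate. Without that feature the standard library path is compiled instead.

```rust
// Hypothetical helper for illustration only; not rust_xlsxwriter's internals.

// Compiled only when the (assumed) `ryu` Cargo feature is enabled.
#[cfg(feature = "ryu")]
fn number_to_string(number: f64) -> String {
    // ryu::Buffer formats the float into a small stack buffer and
    // returns a &str borrowed from it.
    let mut buffer = ryu::Buffer::new();
    buffer.format(number).to_string()
}

// Fallback when the feature is disabled: the standard library formatter.
#[cfg(not(feature = "ryu"))]
fn number_to_string(number: f64) -> String {
    number.to_string()
}

fn main() {
    for value in [0.1, -2.5, 1234.5678] {
        println!("{}", number_to_string(value));
    }
}
```

Gating a single helper like this keeps the ryu dependency out of default builds while leaving the calling code unchanged.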

jmcnamara commented 1 month ago

As with all benchmarks, different scenarios will produce different results. It would be best to performance test your particular scenario with and without ryu to see if it is worth enabling.
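
If hyperfine is not available, a rough in-process timing can give a first impression. The sketch below is an assumption-laden stand-in, not the method used above: `mean_time` is a hypothetical helper, and the workload simply formats numbers with the standard library rather than writing a real workbook. Replace the closure with your own worksheet-writing code and build once with and once without the feature.

```rust
use std::hint::black_box;
use std::time::{Duration, Instant};

/// Run `work` `runs` times and return the mean wall-clock time per run.
fn mean_time<F: FnMut()>(runs: u32, mut work: F) -> Duration {
    assert!(runs > 0, "at least one run is required");
    let start = Instant::now();
    for _ in 0..runs {
        work();
    }
    start.elapsed() / runs
}

fn main() {
    // Stand-in workload: format 100,000 floats to strings.
    let mean = mean_time(5, || {
        for i in 0..100_000u32 {
            // black_box keeps the optimizer from discarding the result.
            black_box(f64::from(i).to_string());
        }
    });
    println!("mean time per run: {:?}", mean);
}
```

A dedicated tool such as hyperfine remains the better choice for final numbers because it handles warmup runs and reports variance.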

If you get positive (or negative) results, please let me know.