sunchao / parquet-rs

Apache Parquet implementation in Rust
Apache License 2.0

Ergonomics idea: closure/RAII-based writer access #214

Open aldanor opened 3 years ago

aldanor commented 3 years ago

Here's a real example:

let mut row_group_writer = writer.next_row_group()?;
for field_id in 0..N_FIELDS {
    if let Some(mut col_writer) = row_group_writer.next_column()? {
        if let ColumnWriter::DoubleColumnWriter(ref mut writer) = &mut col_writer {
            let data = make_column_data(field_id, N_ROWS);
            assert_eq!(writer.write_batch(&data, None, None)?, N_ROWS);
            row_group_writer.close_column(col_writer)?;
            continue;
        }
    }
    bail!("unable to open column writer");
}
writer.close_row_group(row_group_writer)?;
writer.close()?;

What if instead you could do something like this (open to discussion; there are many ways to do it):

writer.write_row_group(|w| {
    for field in 0..N_FIELDS {
        w.write_columns(|i, w| {
            let data = make_column_data(i, N_ROWS);
            w.as_double()?.write_batch(&data, None, None)?;
            Ok(())
        })?;
    }
    Ok(())
})?;

This way you rely on RAII to close things where needed and there's a bit less boilerplate.
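
To make the idea concrete, here is a minimal, self-contained sketch of how such closure-scoped access could work. It is a toy model and not the parquet crate's API: `Kind`, `TypedColumn`, `FileWriter`, the in-memory `log`, and the simplified one-argument `write_batch` are all hypothetical stand-ins made up for illustration. The only point is that the wrapper method, not the caller, performs the "close" step once the closure returns successfully.

```rust
// Toy model: every type here is a hypothetical stand-in, not the parquet crate's API.
type Result<T> = std::result::Result<T, String>;

/// Physical type of a column in the (hypothetical) schema.
#[derive(Clone, Copy)]
pub enum Kind {
    Double,
    Int32,
}

/// Stand-in for a typed column writer; it only buffers values in memory.
pub struct TypedColumn<T> {
    values: Vec<T>,
}

impl<T: Clone> TypedColumn<T> {
    /// Appends a batch and returns the number of values written.
    pub fn write_batch(&mut self, data: &[T]) -> Result<usize> {
        self.values.extend_from_slice(data);
        Ok(data.len())
    }
}

pub enum ColumnWriter {
    Double(TypedColumn<f64>),
    Int32(TypedColumn<i32>),
}

impl ColumnWriter {
    fn new(kind: Kind) -> Self {
        match kind {
            Kind::Double => ColumnWriter::Double(TypedColumn { values: Vec::new() }),
            Kind::Int32 => ColumnWriter::Int32(TypedColumn { values: Vec::new() }),
        }
    }

    /// Returns the typed writer, or an error if the column has another type.
    pub fn as_double(&mut self) -> Result<&mut TypedColumn<f64>> {
        match self {
            ColumnWriter::Double(w) => Ok(w),
            _ => Err("not a double column".to_string()),
        }
    }

    fn len(&self) -> usize {
        match self {
            ColumnWriter::Double(w) => w.values.len(),
            ColumnWriter::Int32(w) => w.values.len(),
        }
    }
}

pub struct RowGroupWriter<'a> {
    schema: &'a [Kind],
    log: &'a mut Vec<String>,
}

impl<'a> RowGroupWriter<'a> {
    /// Calls `f` once per column, then "closes" that column on the caller's behalf.
    pub fn write_columns<F>(&mut self, mut f: F) -> Result<()>
    where
        F: FnMut(usize, &mut ColumnWriter) -> Result<()>,
    {
        for (i, kind) in self.schema.iter().enumerate() {
            let mut col = ColumnWriter::new(*kind);
            f(i, &mut col)?; // an error skips the close and propagates
            self.log.push(format!("close column {} ({} values)", i, col.len()));
        }
        Ok(())
    }
}

pub struct FileWriter {
    schema: Vec<Kind>,
    /// Records the close calls so the ordering can be inspected.
    pub log: Vec<String>,
}

impl FileWriter {
    pub fn new(schema: Vec<Kind>) -> Self {
        FileWriter { schema, log: Vec::new() }
    }

    /// Opens a row group, hands it to `f`, and closes it if `f` succeeded.
    pub fn write_row_group<F>(&mut self, f: F) -> Result<()>
    where
        F: FnOnce(&mut RowGroupWriter<'_>) -> Result<()>,
    {
        let mut rg = RowGroupWriter { schema: &self.schema, log: &mut self.log };
        f(&mut rg)?;
        self.log.push("close row group".to_string());
        Ok(())
    }
}

/// Example usage: one row group with two double columns of three values each.
pub fn demo() -> Result<Vec<String>> {
    let mut writer = FileWriter::new(vec![Kind::Double, Kind::Double]);
    writer.write_row_group(|rg| {
        rg.write_columns(|i, col| {
            let data = vec![i as f64; 3];
            col.as_double()?.write_batch(&data)?;
            Ok(())
        })
    })?;
    Ok(writer.log)
}
```

One reason to close inside the wrapper rather than in a `Drop` impl: `Drop::drop` cannot return a value, so a failure while closing a column or row group could not be propagated with `?`. With the closure-scoped form, the close happens in an ordinary method that returns `Result`.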

nevi-me commented 3 years ago

Hey @aldanor, saw your fast-float crate, then on your GH profile I noticed that you opened an issue on this repo. Development has moved to https://github.com/apache/arrow, where the parquet crate now lives in the rust/parquet directory.

Would you mind opening this issue in our JIRA at https://issues.apache.org/jira/projects/ARROW/issues?