sunchao / parquet-rs

Apache Parquet implementation in Rust
Apache License 2.0
149 stars 20 forks source link
hadoop parquet rust

parquet-rs

Build Status Coverage Status License API docs Master API docs

An Apache Parquet implementation in Rust.

NOTE: this project has merged into Apache Arrow, and development will continue there. To file an issue or pull request, please file a JIRA in the Arrow project.

Usage

Add this to your Cargo.toml:

[dependencies]
parquet = "0.4"

and this to your crate root:

extern crate parquet;

Example usage of reading data:

use std::fs::File;
use std::path::Path;
use parquet::file::reader::{FileReader, SerializedFileReader};

let file = File::open(&Path::new("/path/to/file")).unwrap();
let reader = SerializedFileReader::new(file).unwrap();
let mut iter = reader.get_row_iter(None).unwrap();
while let Some(record) = iter.next() {
  println!("{}", record);
}

See crate documentation on available API.

Supported Parquet Version

To update Parquet format to a newer version, check if parquet-format version is available. Then simply update version of parquet-format crate in Cargo.toml.

Features

Requirements

See Working with nightly Rust to install nightly toolchain and set it as default.

Build

Run cargo build or cargo build --release to build in release mode. Some features take advantage of SSE4.2 instructions, which can be enabled by adding RUSTFLAGS="-C target-feature=+sse4.2" before the cargo build command.

Test

Run cargo test for unit tests.

Binaries

The following binaries are provided (use cargo install to install them):

If you see Library not loaded error, please make sure LD_LIBRARY_PATH is set properly:

export LD_LIBRARY_PATH=$LD_LIBRARY_PATH:$(rustc --print sysroot)/lib

Benchmarks

Run cargo bench for benchmarks.

Docs

To build documentation, run cargo doc --no-deps. To compile and view in the browser, run cargo doc --no-deps --open.

License

Licensed under the Apache License, Version 2.0: http://www.apache.org/licenses/LICENSE-2.0.