jorgecarleitao / arrow2

Transmute-free Rust library to work with the Arrow format
Apache License 2.0

THIS CRATE IS UNMAINTAINED

As of 2024-01-17 this crate is no longer maintained. See discussion #1429 for more details.

Arrow2: Transmute-free Arrow

A Rust crate to work with Apache Arrow. The most feature-complete implementation of the Arrow format after the C++ implementation.

Check out the guide for a general introduction to using this crate, and the API docs for detailed documentation of each of its APIs.

Features

Safety and Security

This crate uses unsafe when strictly necessary:

We have extensive tests over these, all of which run and pass under MIRI. Most uses of unsafe fall into 3 categories:

We actively monitor the RustSec advisory database for vulnerabilities and either patch or mitigate them (see e.g. .cargo/audit.yaml and .github/workflows/security.yaml).

Reading untrusted data may currently panic on the following formats:

We are actively addressing this.
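Until then, a caller that must handle untrusted input can isolate a read so that a panic becomes an ordinary error. The following is a minimal sketch using only the standard library; `read_untrusted` is a hypothetical stand-in for a panicking reader and is not part of this crate's API. It assumes the default `panic = "unwind"` setting, since `catch_unwind` cannot intercept panics when the binary is built with `panic = "abort"`.

```rust
use std::panic;

/// Hypothetical stand-in for a reader that may panic on malformed input.
/// The first byte declares how many payload bytes follow.
fn read_untrusted(bytes: &[u8]) -> usize {
    // Indexing panics if the buffer is empty or shorter than declared.
    let declared_len = bytes[0] as usize;
    bytes[1..1 + declared_len].len()
}

/// Runs the reader and converts a panic into an `Err` instead of unwinding
/// into the caller.
fn read_guarded(bytes: &[u8]) -> Result<usize, String> {
    panic::catch_unwind(|| read_untrusted(bytes))
        .map_err(|_| "reader panicked on malformed input".to_string())
}
```

With this sketch, `read_guarded(&[2, 10, 20])` returns `Ok(2)`, while a buffer that declares more bytes than it contains yields an `Err`. The panic message is still printed to stderr by the default panic hook.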

Integration tests

Our tests include roundtrips against:

Check DEVELOPMENT.md for our development practices.

Versioning

We follow SemVer 2.0, as used by Cargo and the rest of the Rust ecosystem. We also use 0.x.y versioning, since we are still iterating on the API.
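In practice, 0.x.y versioning means that Cargo's default (caret) requirement treats each minor release as potentially breaking: a dependency on 0.17.1 accepts 0.17.4 but not 0.18.0. The sketch below models that rule for plain `major.minor.patch` versions. The names `Version`, `parse`, and `caret_compatible` are illustrative only, and pre-release and build metadata are deliberately ignored.

```rust
/// A plain `major.minor.patch` version (no pre-release or build metadata).
#[derive(Debug, Clone, Copy, PartialEq, Eq, PartialOrd, Ord)]
struct Version {
    major: u64,
    minor: u64,
    patch: u64,
}

/// Parses "x.y.z" into a `Version`; returns `None` on malformed input.
fn parse(s: &str) -> Option<Version> {
    let mut parts = s.split('.').map(|p| p.parse::<u64>().ok());
    let major = parts.next()??;
    let minor = parts.next()??;
    let patch = parts.next()??;
    if parts.next().is_some() {
        return None; // more than three components
    }
    Some(Version { major, minor, patch })
}

/// Returns true if `candidate` satisfies the caret requirement `^req`.
fn caret_compatible(req: Version, candidate: Version) -> bool {
    if candidate < req {
        return false;
    }
    if req.major > 0 {
        // ^1.2.3 accepts any 1.x.y at or above 1.2.3.
        candidate.major == req.major
    } else if req.minor > 0 {
        // ^0.2.3 accepts only 0.2.y at or above 0.2.3.
        candidate.major == 0 && candidate.minor == req.minor
    } else {
        // ^0.0.3 accepts only 0.0.3 itself.
        candidate == req
    }
}
```

For example, `caret_compatible(parse("0.17.1")?, parse("0.18.0")?)` is `false`, which is why a 0.x series leaves room for breaking changes between minor releases.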

Design

This repo and crate's primary goal is to offer a safe Rust implementation of the Arrow specification. As such, it

Design documents about each of the parts of this repo are available on their respective READMEs.
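As a general illustration of what "transmute-free" can mean, the sketch below decodes little-endian 32-bit integers from a byte buffer using only safe standard-library calls, rather than reinterpreting the buffer's memory. It is not taken from this crate's source; the function name `decode_i32_le` is hypothetical.

```rust
/// Decodes little-endian `i32` values from a byte buffer without `unsafe`.
/// Returns `None` if the buffer length is not a multiple of 4.
fn decode_i32_le(bytes: &[u8]) -> Option<Vec<i32>> {
    const WIDTH: usize = std::mem::size_of::<i32>();
    if bytes.len() % WIDTH != 0 {
        return None;
    }
    Some(
        bytes
            .chunks_exact(WIDTH)
            .map(|chunk| {
                // Copy into a fixed-size array so no alignment or layout
                // assumptions about the input buffer are needed.
                let mut buf = [0u8; WIDTH];
                buf.copy_from_slice(chunk);
                i32::from_le_bytes(buf)
            })
            .collect(),
    )
}
```

The trade-off is an explicit copy per value in exchange for not depending on the alignment of the input buffer.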

FAQ

Any plans to merge with the Apache Arrow project?

Maybe. The primary reason for this repo and crate is to prototype and mature a fundamentally different design based on a transmute-free implementation. This requires breaking backward compatibility and dropping features, which is not possible in the Arrow repo.

Furthermore, the arrow project currently has a release mechanism that is unsuitable for this type of work:

This implies that the crate version is independent of the changelog or its API stability, which violates SemVer. This procedure makes the crate incompatible with the Rust ecosystem (and many others), which relies heavily on SemVer to constrain software versions.

Secondly, this implies the arrow crate is versioned as >0.x. This places expectations about API stability that are incompatible with this effort.

License

Licensed under either of

at your option.

Contribution

Unless you explicitly state otherwise, any contribution intentionally submitted for inclusion in the work by you, as defined in the Apache-2.0 license, shall be dual licensed as above, without any additional terms or conditions.