The polars package for R gives users access to a lightning-fast DataFrame library written in Rust. Polars’ embarrassingly parallel execution, cache-efficient algorithms, and expressive API make it well suited to efficient data wrangling, data pipelines, snappy APIs, and much more besides. Polars also supports a “streaming mode” for out-of-core (larger-than-memory) operations, which allows users to analyze datasets many times larger than RAM.
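As a rough illustration of the lazy execution that streaming builds on, the sketch below defines a small query with `pl$LazyFrame()` and only runs it at `$collect()`. The toy data and the column names `x`, `g`, and `x_sum` are made up for illustration, and the option that switches `$collect()` to streaming mode is left out because it depends on the installed version.

```r
library(polars)

# Hypothetical toy data; real larger-than-memory workloads would
# typically start from files scanned lazily rather than in-memory vectors.
lf = pl$LazyFrame(
  x = 1:6,
  g = c("a", "b", "a", "b", "a", "b")
)

# Nothing is computed here: `query` is only a plan.
query = lf$filter(pl$col("x") > 2)$group_by("g")$agg(
  pl$col("x")$sum()$alias("x_sum")
)

# The plan is executed at collect time; sorting makes the row order stable.
result = query$collect()$sort("g")
print(result)
```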
Examples of common operations:
Note that this package is rapidly evolving and there are a number of breaking changes at each version. Be sure to check the changelog when updating polars.
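Since releases can break existing code, it may help to record the installed version before and after an update. A minimal sketch using base R only; the variable name `installed` is illustrative:

```r
# Look up the installed polars version, or NULL if the package is missing.
installed = if (requireNamespace("polars", quietly = TRUE)) {
  utils::packageVersion("polars")
} else {
  NULL
}

if (is.null(installed)) {
  message("polars is not installed")
} else {
  message("Installed polars version: ", as.character(installed))
}
```

Comparing this value against the version headings in the changelog shows which entries apply to a given update.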
The recommended way to install this package is via R-multiverse:
Sys.setenv(NOT_CRAN = "true")
install.packages("polars", repos = "https://community.r-multiverse.org")
The “Install” vignette (vignette("install", "polars")) gives more details on how to install this package and other ways to install it.
To avoid conflicts with other packages and base R function names, polars’s top-level functions are hosted in the pl namespace and are accessible via the pl$ prefix. This means that polars queries written in Python and in R are very similar.
For example, rewriting the Python example from https://github.com/pola-rs/polars in R:
library(polars)
df = pl$DataFrame(
  A = 1:5,
  fruits = c("banana", "banana", "apple", "apple", "banana"),
  B = 5:1,
  cars = c("beetle", "audi", "beetle", "beetle", "beetle")
)
# embarrassingly parallel execution & very expressive query language
df$sort("fruits")$select(
  "fruits",
  "cars",
  pl$lit("fruits")$alias("literal_string_fruits"),
  pl$col("B")$filter(pl$col("cars") == "beetle")$sum(),
  pl$col("A")$filter(pl$col("B") > 2)$sum()$over("cars")$alias("sum_A_by_cars"),
  pl$col("A")$sum()$over("fruits")$alias("sum_A_by_fruits"),
  pl$col("A")$reverse()$over("fruits")$alias("rev_A_by_fruits"),
  pl$col("A")$sort_by("B")$over("fruits")$alias("sort_A_by_B_by_fruits")
)
#> shape: (5, 8)
#> ┌────────┬────────┬───────────────────────┬─────┬───────────────┬─────────────────┬─────────────────┬───────────────────────┐
#> │ fruits ┆ cars ┆ literal_string_fruits ┆ B ┆ sum_A_by_cars ┆ sum_A_by_fruits ┆ rev_A_by_fruits ┆ sort_A_by_B_by_fruits │
#> │ --- ┆ --- ┆ --- ┆ --- ┆ --- ┆ --- ┆ --- ┆ --- │
#> │ str ┆ str ┆ str ┆ i32 ┆ i32 ┆ i32 ┆ i32 ┆ i32 │
#> ╞════════╪════════╪═══════════════════════╪═════╪═══════════════╪═════════════════╪═════════════════╪═══════════════════════╡
#> │ apple ┆ beetle ┆ fruits ┆ 11 ┆ 4 ┆ 7 ┆ 4 ┆ 4 │
#> │ apple ┆ beetle ┆ fruits ┆ 11 ┆ 4 ┆ 7 ┆ 3 ┆ 3 │
#> │ banana ┆ beetle ┆ fruits ┆ 11 ┆ 4 ┆ 8 ┆ 5 ┆ 5 │
#> │ banana ┆ audi ┆ fruits ┆ 11 ┆ 2 ┆ 8 ┆ 2 ┆ 2 │
#> │ banana ┆ beetle ┆ fruits ┆ 11 ┆ 4 ┆ 8 ┆ 1 ┆ 1 │
#> └────────┴────────┴───────────────────────┴─────┴───────────────┴─────────────────┴─────────────────┴───────────────────────┘
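To make the Python/R correspondence concrete, here is a hedged sketch of a grouped aggregation on a subset of the same data, with the approximate Python spelling in a comment. The alias `A_sum` and the variable `by_fruit` are illustrative, and method names may differ in older releases of the package.

```r
library(polars)

df = pl$DataFrame(
  A = 1:5,
  fruits = c("banana", "banana", "apple", "apple", "banana")
)

# Python (approximate):
#   df.group_by("fruits").agg(pl.col("A").sum().alias("A_sum")).sort("fruits")
# R: the same chain, with `$` in place of `.`
by_fruit = df$group_by("fruits")$agg(
  pl$col("A")$sum()$alias("A_sum")
)$sort("fruits")
print(by_fruit)
```

The per-group sums agree with the `sum_A_by_fruits` column in the output above (7 for apple, 8 for banana), but here each group is collapsed to a single row instead of being broadcast back with `$over()`.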
The “Get Started” vignette (vignette("polars")) provides a more detailed introduction to polars.
While one can use polars as-is, other packages build on it to provide different syntaxes:
The online documentation can be found at https://pola-rs.github.io/r-polars/.
If you encounter a bug, please file an issue with a minimal reproducible example on GitHub.
Consider joining our Discord subchannel for additional help and discussion.