-
Certain connectors can't always make pushdown guarantees at the coordinator level. For example, ORC files may contain headers describing the min and max values within a certain row grou…
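As a rough illustration of why such a guarantee is only available once the file is opened, the sketch below prunes row groups using per-group min/max values. The `RowGroupStats` class and the `groups_to_read` helper are hypothetical names invented for this example; they are not part of any ORC reader or connector API, and the statistics are made-up values.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class RowGroupStats:
    """Hypothetical min/max summary for one row group (not a real ORC API)."""
    index: int
    min_value: int
    max_value: int


def may_contain(stats: RowGroupStats, lower: int, upper: int) -> bool:
    """True if the closed range [lower, upper] overlaps the group's [min, max]."""
    return stats.max_value >= lower and stats.min_value <= upper


def groups_to_read(all_stats: List[RowGroupStats], lower: int, upper: int) -> List[int]:
    """Return the indices of row groups that cannot be skipped for the range filter."""
    return [s.index for s in all_stats if may_contain(s, lower, upper)]


# Illustrative statistics only; a real reader would load these from the file.
stats = [
    RowGroupStats(0, 1, 100),
    RowGroupStats(1, 101, 200),
    RowGroupStats(2, 150, 300),
]
selected = groups_to_read(stats, 120, 160)
```

A group that survives pruning may still contain no matching rows, so under this assumption the filter would still have to be re-applied to the rows that are actually read.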
-
[strip_chars](https://docs.pola.rs/py-polars/html/reference/expressions/api/polars.Expr.str.strip_chars.html) (`strip` in pandas) is a common operation used during data cleaning.
```python
import …
-
I've spent some time familiarising myself with Fling and I am positively impressed by your work. Naturally, I am starting to hit some limitations as I explore further. So the goal of this issue is to …
-
### Checks
- [X] I have checked that this issue has not already been reported.
- [X] I have confirmed this bug exists on the [latest version](https://pypi.org/project/polars/) of Polars.
### Reprodu…
-
**Is your feature request related to a problem?**
Apache Iceberg is designed for managing large analytic tables in a scalable and performant way, using features like schema evolution, partitioning,…
-
### Description
I want to do something like
1) Scan lots of parquets, do some processing
2) (Optionally) write rows to an output parquet for debugging
3) Aggregate the rows
E.g.
```
foo = pl.sc…
-
### Is your proposal related to a problem?
We currently send label names/values as bare strings in each Series() call. Even with gRPC compression turned on, I think compression compresses each str…
-
woke — Check for insensitive language in your source code
https://github.com/get-woke/woke
-
**Is your feature request related to a problem?**
In threat hunting it's often the case that you need to "join" on the same table for queries. For example: take a flat index filled with processes and …
-
In a distributed computing context it would be nice to have a vector-variant of `Arrow.Stream` iterator. The idea is to be able to split processing of a single large arrow file with multiple record batc…
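One way to picture the splitting step, independent of any Arrow library, is to hand each worker a contiguous range of record-batch indices. The sketch below is only that index arithmetic, written in Python for illustration; `split_batches` is a hypothetical helper, not an Arrow.jl or pyarrow function, and it assumes the batches can be addressed by position.

```python
from typing import List


def split_batches(num_batches: int, num_workers: int) -> List[range]:
    """Assign contiguous batch-index ranges to workers, as evenly as possible."""
    if num_workers <= 0:
        raise ValueError("num_workers must be positive")
    if num_batches < 0:
        raise ValueError("num_batches must be non-negative")

    # Every worker gets `base` batches; the first `extra` workers get one more.
    base, extra = divmod(num_batches, num_workers)
    chunks = []
    start = 0
    for worker in range(num_workers):
        size = base + (1 if worker < extra else 0)
        chunks.append(range(start, start + size))
        start += size
    return chunks


# Ten batches over three workers; each range lists the batch indices for one worker.
assignment = split_batches(10, 3)
```

When there are more workers than batches, the surplus workers receive empty ranges, which a caller can simply skip.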