Closed ritchie46 closed 3 years ago
Hi, thanks for the suggestion. Please be aware that a similar suggestion was already filed in #107, asking for Rust DataFusion, which also uses Arrow. Does Polars outsource all queries to Arrow, or are some algorithms implemented in Polars itself? There might not be much sense in adding multiple libraries that all use the same "engine" for executing queries.
Arrow is only the backend (like NumPy for pandas). All DataFrame/query logic (joins, groupby, filters, pivots, etc.) is implemented by Polars.
DataFusion is indeed also based on Apache Arrow, but it takes a totally different approach to DataFrames.
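To make the backend vs. query-logic split concrete, here is a toy sketch in plain Python. It does not use the Polars or Arrow APIs: the `array` module only stands in for a columnar memory backend, and `filter_rows` and `groupby_sum` are hypothetical helpers that play the role of the query logic a library builds on top of that backend.

```python
from array import array

# Stand-in for the backend (assumption: NOT Arrow itself).
# Each column is a typed, contiguous buffer, and that is all the backend provides.
columns = {
    "key": array("q", [1, 2, 1, 2, 3]),
    "value": array("d", [10.0, 20.0, 30.0, 40.0, 50.0]),
}


def filter_rows(cols, name, predicate):
    """Query logic (hypothetical helper): keep rows where predicate(cols[name][i]) is true."""
    keep = [i for i, v in enumerate(cols[name]) if predicate(v)]
    # Rebuild every column with the same typecode, keeping only the selected rows.
    return {n: array(c.typecode, (c[i] for i in keep)) for n, c in cols.items()}


def groupby_sum(cols, key, value):
    """Query logic (hypothetical helper): sum `value` per distinct `key`."""
    totals = {}
    for k, v in zip(cols[key], cols[value]):
        totals[k] = totals.get(k, 0.0) + v
    return totals


# Filter first, then aggregate; both steps live outside the "backend".
filtered = filter_rows(columns, "value", lambda v: v > 10.0)
result = groupby_sum(filtered, "key", "value")
```

Two libraries could share the same column buffers and still differ completely in how `filter_rows` and `groupby_sum` are implemented, which is the point being made about Polars and DataFusion.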
Polars is an in-memory DataFrame library written in Rust that uses Apache Arrow as its backend. It is also available from Python.
There are some benchmarks against pandas in the readme, and from Python in this notebook.