petl-developers / petl

Python Extract Transform and Load Tables of Data
MIT License

Support the brand new Pandas Dataframe alternatives #620

Open juarezr opened 2 years ago

juarezr commented 2 years ago

Problem description

It would be nice to support the newer DataFrame libraries in addition to pandas.

Two interesting candidates would be:

Modin Overview

Scale your pandas workflow by changing a single line of code

Modin uses Ray or Dask to provide an effortless way to speed up your pandas notebooks, scripts, and libraries. Unlike other distributed DataFrame libraries, Modin provides seamless integration and compatibility with existing pandas code. Even using the DataFrame constructor is identical.

Polars Overview

Lightning-fast DataFrame library for Rust and Python

Polars is a lightning-fast DataFrame library and in-memory query engine. Its embarrassingly parallel execution, cache-efficient algorithms and expressive API make it perfect for efficient data wrangling, data pipelines, snappy APIs and so much more.

Problem Description

Currently, petl supports pandas through the functions petl.io.pandas.fromdataframe and petl.io.pandas.todataframe.
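To make the gap concrete, here is a minimal, library-agnostic sketch of the conversion such an integration would need. It assumes the usual petl convention that a table is an iterable of rows whose first row is the header, and it relies on the pandas, Modin and Polars DataFrame constructors each accepting a dict of column lists. The helper names `table_to_columns` and `columns_to_table` are hypothetical and are not part of petl.

```python
def table_to_columns(table):
    """Convert a row-oriented table (header row first) into a dict of columns.

    Hypothetical helper, not part of petl. Duplicate header names would
    collapse into a single column, so this sketch assumes unique names.
    """
    rows = iter(table)
    try:
        header = [str(name) for name in next(rows)]
    except StopIteration:
        return {}  # an empty table has no columns
    columns = {name: [] for name in header}
    for row in rows:
        # Pad short rows with None so every column ends up the same length.
        padded = list(row) + [None] * (len(header) - len(row))
        for name, value in zip(header, padded):
            columns[name].append(value)
    return columns


def columns_to_table(columns):
    """Yield a header row followed by data rows from a dict of columns."""
    names = list(columns)
    yield tuple(names)
    for values in zip(*(columns[name] for name in names)):
        yield tuple(values)


table = [("id", "name"), (1, "a"), (2, "b")]
columns = table_to_columns(table)
# `columns` could then be handed to a DataFrame constructor of choice,
# e.g. pandas.DataFrame(columns) or polars.DataFrame(columns).
```

This sketch materializes every column in memory, which gives up petl's lazy row streaming. A real integration would probably want to process rows in chunks or delegate to each library's own readers.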

To evolve this feature, it would be important to research: