aws / aws-sdk-pandas

pandas on AWS - Easy integration with Athena, Glue, Redshift, Timestream, Neptune, OpenSearch, QuickSight, Chime, CloudWatchLogs, DynamoDB, EMR, SecretManager, PostgreSQL, MySQL, SQLServer and S3 (Parquet, CSV, JSON and EXCEL).
https://aws-sdk-pandas.readthedocs.io
Apache License 2.0
3.89k stars, 690 forks

Add support for polars (perhaps via narwhals) #2920

Open david-waterworth opened 1 month ago

david-waterworth commented 1 month ago

Is your feature request related to a problem? Please describe.

I've mostly migrated my Python analytics workflow from pandas to polars. This means that to use many great libraries such as aws-sdk-pandas, I need to convert between polars and pandas.

Describe the solution you'd like

aws-sdk-pandas seems like a great candidate for becoming dataframe-agnostic, for example by using narwhals internally as a dataframe-agnostic API.

An alternative might be to refactor methods that accept dataset=True to instead take a string (i.e. dataset="pandas" or dataset="polars").
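As a rough illustration of the string-argument alternative, here is a minimal sketch of a helper that returns a result in the requested dataframe library. The name `as_backend` and its `backend` parameter are hypothetical and are not part of aws-sdk-pandas. The sketch assumes the library still produces a pandas DataFrame internally and converts it at the boundary with `polars.from_pandas`, which typically needs pyarrow installed.

```python
from typing import Any, Literal

import pandas as pd


def as_backend(df: pd.DataFrame, backend: Literal["pandas", "polars"] = "pandas") -> Any:
    """Return `df` in the requested dataframe library (hypothetical helper)."""
    if backend == "pandas":
        # No conversion needed; hand back the same object.
        return df
    if backend == "polars":
        # Imported lazily so pandas-only users do not need polars installed.
        import polars as pl

        return pl.from_pandas(df)
    raise ValueError(f"Unsupported backend: {backend!r}")


# Stand-in for a frame returned by a reader such as wr.s3.read_parquet(...).
pandas_df = pd.DataFrame({"id": [1, 2, 3], "value": [0.5, 1.5, 2.5]})
result = as_backend(pandas_df, backend="pandas")
```

Converting at the boundary like this only saves the caller the explicit conversion step. Unlike a narwhals-based rewrite, the data would still pass through pandas first.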

jaidisido commented 1 month ago

Thanks @david-waterworth. We have considered other dataframe libraries such as polars in the past, but found their uptake too low compared to pandas to justify the development effort required to support them. We will keep an eye on this, however, and might reconsider in the future.