pola-rs / polars

Dataframes powered by a multithreaded, vectorized query engine, written in Rust
https://docs.pola.rs

Converting Spark to Polars and back #17857

Open ivanchai-1 opened 4 months ago

ivanchai-1 commented 4 months ago

Description

We have a problem converting data from Spark to Polars. The conversion currently has to go through pandas, which takes a very long time on large datasets. I would like Polars to accept a Spark DataFrame directly so that our jobs run faster.

pandas_df = spark_df.toPandas()
polars_df = pl.DataFrame(pandas_df)
↧↧↧↧
polars_df = pl.DataFrame(spark_df)

deanm0000 commented 4 months ago

https://stackoverflow.com/questions/73203318/how-to-transform-spark-dataframe-to-polars-dataframe

ion-elgreco commented 4 months ago

@deanm0000 https://github.com/apache/spark/pull/45481, toArrow() will be available in spark 4.0 afaik