Open richiesgr opened 3 weeks ago
Mage version
0.9.72
Describe the bug
According to the documentation, the pipeline type in metadata.yaml can be set to databricks, but in practice it always reverts to python.
As a result, Spark DataFrames are not converted to pandas DataFrames. Returning a Spark DataFrame from a block raises a pickle error:
df_spark = spark.sql("select * table")
To make it work, you need to convert to pandas explicitly:
df_spark = spark.sql("select * table").toPandas()
To reproduce
df_spark = spark.sql("select * from dev.test.student")
return df_spark
Expected behavior
Support the databricks pipeline type as documented. Handle a Spark DataFrame as a valid return value.
Operating system
Docker on macOS
Additional context
No response
Databricks has updated their library, so the guide is outdated. We'll update it when we get time.
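Until the guide is updated, the workaround described in the issue (calling .toPandas() before returning) could be wrapped in a small helper. This is only a sketch: ensure_pandas is a hypothetical name, not part of Mage or PySpark, and it assumes the block already has a working SparkSession available as spark.

```python
def ensure_pandas(df):
    """Return df.toPandas() when df exposes that method, else df unchanged.

    ensure_pandas is a hypothetical helper, not part of Mage or PySpark.
    """
    to_pandas = getattr(df, "toPandas", None)
    if callable(to_pandas):
        # Spark DataFrames provide toPandas(), which collects the rows
        # to the driver and builds a pandas DataFrame.
        return to_pandas()
    # Anything else (for example an existing pandas DataFrame) passes through.
    return df


# Hypothetical usage inside a block, assuming `spark` is an existing SparkSession:
#     df_spark = spark.sql("select * from dev.test.student")
#     return ensure_pandas(df_spark)
```

Note that toPandas() pulls the whole result onto the driver, so this only suits tables that fit in memory there.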