ydataai / ydata-profiling

1 Line of code data quality profiling & exploratory data analysis for Pandas and Spark DataFrames.
https://docs.profiling.ydata.ai
MIT License
12.49k stars 1.68k forks

Supporting Spark Connect Dataframe #1634

Open chenbojian opened 2 months ago

chenbojian commented 2 months ago

Missing functionality

Since Databricks Runtime 14, the DataFrame type in notebooks has changed. It was pyspark.sql.dataframe.DataFrame, but it is now pyspark.sql.connect.dataframe.DataFrame. This fails with ydata-profiling, because ydata-profiling expects either pandas.DataFrame or pyspark.sql.dataframe.DataFrame.
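
To check which kind of DataFrame a notebook is handing over, the class's defining module can be inspected without importing pyspark at all. This is only a minimal sketch: the helper name `dataframe_flavor` and its return labels are hypothetical and not part of ydata-profiling or PySpark, and the module prefixes are taken from the type names quoted above.

```python
def dataframe_flavor(df) -> str:
    """Classify a DataFrame-like object by the module that defines its class.

    Hypothetical helper for illustration; the returned labels are made up here.
    """
    module = type(df).__module__
    # Check the Spark Connect prefix first, since it also starts with "pyspark.sql."
    if module.startswith("pyspark.sql.connect."):
        return "spark-connect"
    if module.startswith("pyspark.sql."):
        return "spark-classic"
    if module.startswith("pandas."):
        return "pandas"
    return "unknown"
```

Calling `dataframe_flavor(df)` in a notebook cell should show whether `df` is the Spark Connect type described above.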

Proposed feature

Support pyspark.sql.connect.dataframe.DataFrame for profiling
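
Until that is supported, one possible stopgap is to convert a Spark Connect DataFrame to pandas before profiling. This is a rough sketch, not how ydata-profiling handles it: the helper name `to_pandas_for_profiling` and the default row cap are assumptions for illustration. It relies on the `limit` and `toPandas` methods of the Spark DataFrame API, and it pulls the rows into local memory, so Spark-side profiling is lost.

```python
def to_pandas_for_profiling(df, max_rows=100_000):
    """Return a pandas copy of a Spark Connect DataFrame; pass anything else through.

    Hypothetical helper; max_rows is an arbitrary cap to keep the collected
    data small enough for local memory.
    """
    module = type(df).__module__
    if module.startswith("pyspark.sql.connect."):
        # limit() and toPandas() are part of the Spark DataFrame API
        return df.limit(max_rows).toPandas()
    return df


# Usage in a notebook (assumes ydata-profiling is installed and spark_df exists):
#   from ydata_profiling import ProfileReport
#   report = ProfileReport(to_pandas_for_profiling(spark_df), title="Profile")
```

The trade-off is that only the first `max_rows` rows are profiled, so the report describes a truncated subset rather than the full table.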

Alternatives considered

No response

Additional context

image

charleslondon commented 2 weeks ago

Bumping this as I also have this issue