julioasotodv / spark-df-profiling

Create HTML profiling reports from Apache Spark DataFrames
MIT License

Running in DataBricks #17

Open lordoetl opened 6 years ago

lordoetl commented 6 years ago

I have loaded a DataFrame, and when I run the command profile = spark_df_profiling.ProfileReport(df)

I get the following error:

__pycache__ not bottom-level directory

I have confirmed the df is loaded and looks good (it is very large, if that matters). I'm not sure where to go next with this. Any suggestions?

It occurred to me that I was running on a serverless cluster, so I tried your example code on a Standard cluster just to make sure that wasn't the cause:

Ran:

    import spark_df_profiling
    df = sqlContext.createDataFrame([["2",True,None,"8"], ["2",False,None,"8"], ["2",True,"5","7"]], ["a","b","c","d"])
    rep = spark_df_profiling.ProfileReport(df)
    displayHTML(rep.html)

Error: __pycache__ not bottom-level directory in ....
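
For anyone trying to narrow this down: the wording of this error matches the ValueError text that Python's standard-library function importlib.util.source_from_cache raises, "__pycache__ not bottom-level directory in ..." (the double underscores are easily lost when the text is rendered as Markdown). That function maps a compiled .pyc path back to its source file and fails when the file's parent directory is not named __pycache__. Whether this is the exact call path being hit on Databricks is an assumption, not something confirmed in this thread. A minimal sketch that reproduces the message outside Spark, using a hypothetical helper source_for and made-up paths:

```python
import importlib.util


def source_for(pyc_path):
    """Return the source path for a .pyc path, or the ValueError text.

    Hypothetical helper for illustration; not part of spark_df_profiling.
    """
    try:
        return importlib.util.source_from_cache(pyc_path)
    except ValueError as exc:
        return "ValueError: {}".format(exc)


# Made-up paths, for illustration only; the files do not need to exist.
good = "/tmp/pkg/__pycache__/module.cpython-36.pyc"  # parent dir is __pycache__
bad = "/tmp/pkg/module.cpython-36.pyc"               # parent dir is not __pycache__

print(source_for(good))  # resolves to the matching module.py path
print(source_for(bad))   # reports that __pycache__ is not the bottom-level directory
```

If the second call reproduces the same message, one thing to check is whether something in the environment is handing Python a .pyc path that sits outside a __pycache__ directory.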

mparkhe commented 5 years ago

Hey Folks,

I created PR #22 to display renderable HTML in Databricks notebooks.

mani