dvgodoy / handyspark

HandySpark - bringing pandas-like capabilities to Spark dataframes
MIT License
185 stars, 23 forks

'HandyGrouped' object has no attribute 'session' #29

Closed: cyotap closed this issue 2 years ago

cyotap commented 2 years ago

[Two screenshots of the errors were attached.]

I tried to use HandySpark's plotting and keep getting errors like the ones in the screenshots.

dvgodoy commented 2 years ago

Hi @Kimiko00,

Are you using Apache Spark 3.3? It looks like the sql_ctx attribute was finally removed from the GroupedData class when 3.3 was released, and replaced by a session attribute.

It used to be like this:

    def __init__(self, jgd, df):
        self._jgd = jgd
        self._df = df
        self.sql_ctx = df.sql_ctx

And now it is like this:

    def __init__(self, jgd: JavaObject, df: DataFrame):
        self._jgd = jgd
        self._df = df
        self.session: SparkSession = df.sparkSession
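
For code that has to read this attribute under either layout, one option is to check which name exists on the object. The following is only a minimal sketch: `session_or_context` is a hypothetical helper name, and `OldGrouped` / `NewGrouped` are stand-in classes that mimic the two `__init__` versions above so the example runs without Spark installed. It does not patch HandySpark itself.

```python
def session_or_context(grouped):
    """Return whichever of the two attributes the object carries.

    Hypothetical helper: prefers `session` (the newer layout shown above)
    and falls back to `sql_ctx` (the older layout).
    """
    if hasattr(grouped, "session"):
        return grouped.session
    return grouped.sql_ctx


# Stand-ins for illustration only; these are NOT the real pyspark classes.
class OldGrouped:
    def __init__(self):
        self.sql_ctx = "old-style sql_ctx"


class NewGrouped:
    def __init__(self):
        self.session = "new-style session"


old_value = session_or_context(OldGrouped())
new_value = session_or_context(NewGrouped())
```

Note that the two attributes hold different kinds of object (an SQL context versus a Spark session), so a caller would still need to handle each case appropriately.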

HandySpark was built for Apache Spark 2.x, and unfortunately some features may not work well (or at all!) under Apache Spark 3.x. This particular error, though, should go away if you downgrade to Apache Spark 3.2 (the last version still using sql_ctx).
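
A quick way to tell which side of that boundary an installation is on is to compare its version string against 3.3. This is a rough sketch, assuming the 3.3 cutoff described above; `grouped_data_has_sql_ctx` is a hypothetical helper name, not part of HandySpark or PySpark.

```python
def grouped_data_has_sql_ctx(spark_version: str) -> bool:
    """Guess whether GroupedData still exposes `sql_ctx` for this version.

    Assumption (from the discussion above): the attribute is present
    up to and including 3.2 and gone from 3.3 onward.
    """
    # Only the major and minor components matter for this check.
    major, minor = (int(part) for part in spark_version.split(".")[:2])
    return (major, minor) < (3, 3)


checks = {
    version: grouped_data_has_sql_ctx(version)
    for version in ("2.4.8", "3.2.1", "3.3.0")
}
```

In a real environment the string would typically come from `pyspark.__version__`.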

Best, Daniel

cyotap commented 2 years ago

Thank you so much for the help. I'll downgrade my Apache Spark version.