Open evangeliazve opened 3 years ago
@evangeliazve - thanks for reporting this. I'm not sure how to fix the issue. Can you please send me your exact code and the full error stack trace, so I can try to replicate the issue on my machine? Thanks!
Hello @MrPowers, thanks for your reply.
When I execute the following code, everything runs fine:
!pip install ceja
import ceja
actual_df = df_txts.withColumn("list_of_words_stem", ceja.porter_stem(col("list_of_words")))
However, even though the object's class is DataFrame, when I call .show() to display the result table I get the following error message:
PythonException Traceback (most recent call last)
Hello,
I am facing issues when trying to apply stemming to text data on AWS with PySpark. Here is the error message I'm getting: PythonException: An exception was thrown from a UDF: 'pyspark.serializers.SerializationError: Caused by Traceback (most recent call last):
How can I resolve this?
Thank you for your support.
Best, Evangelia
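A note on the symptom described in this thread, where withColumn succeeds and the exception only appears at .show(): Spark evaluates transformations lazily, so the UDF behind ceja.porter_stem does not run until an action such as .show() is called, and at that point it runs in the Python worker processes rather than in the notebook. One possible cause, which the truncated traceback does not confirm, is that a package installed with pip in the notebook is not available to those workers. The sketch below is a small, hypothetical helper (missing_modules is not part of ceja or PySpark) that reports which top-level modules cannot be found in the interpreter it runs in.

```python
import importlib.util


def missing_modules(names):
    """Return the top-level module names that this interpreter cannot find.

    Hypothetical diagnostic helper, not part of ceja or PySpark.
    Pass top-level names only (for example "ceja"), not dotted paths.
    """
    return [name for name in names if importlib.util.find_spec(name) is None]


if __name__ == "__main__":
    # Run on the driver, this only describes the driver's environment.
    print(missing_modules(["ceja"]))
```

Calling this in the notebook only checks the driver. To learn anything about the workers, the same function would have to execute inside a Spark task (for instance, wrapped in a UDF and applied to a one-row DataFrame); an empty list on the driver and a non-empty list on the workers would support the missing-package explanation.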