abronte / PysparkProxy

Seamlessly execute pyspark code on remote clusters

Additional UDF support #20

Closed: abronte closed this issue 5 years ago

abronte commented 6 years ago

Add support for creating UDFs via decorator or the udf function.

Using the udf function:

from pyspark.sql.functions import udf
from pyspark.sql.types import LongType

def squared(x):
    return x * x

squared_udf = udf(squared, LongType())
df = sqlContext.table("test")
df.select("id", squared_udf("id").alias("id_squared"))

Using the decorator:

from pyspark.sql.functions import udf

# With no return type given, udf defaults to StringType.
@udf
def squared(x):
    return x * x

df.select("id", squared("id").alias("id_squared"))
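
For illustration, here is a minimal pure-Python sketch of how a single entry point could accept both call styles requested above. It is not PysparkProxy's or PySpark's implementation: `sketch_udf`, `SketchUDF`, and the string return-type names are hypothetical stand-ins, and the wrapper simply calls the function locally instead of building a column expression for a remote cluster.

```python
import functools


class SketchUDF:
    """Hypothetical stand-in for a UDF wrapper (assumption, not a real API)."""

    def __init__(self, func, return_type):
        self.func = func
        self.return_type = return_type

    def __call__(self, *args):
        # A real proxy would serialize the function and return a column
        # expression; this sketch just evaluates the function locally.
        return self.func(*args)


def sketch_udf(f=None, return_type="string"):
    """Usable as sketch_udf(func, type), @sketch_udf, or @sketch_udf(type)."""
    if f is None or isinstance(f, str):
        # Decorator-with-arguments form: the first argument, if present,
        # is the return type, and the function arrives in a second call.
        chosen = f if isinstance(f, str) else return_type
        return functools.partial(SketchUDF, return_type=chosen)
    # Plain function-call form or bare @sketch_udf decorator.
    return SketchUDF(f, return_type)


def squared(x):
    return x * x


# Function-call style with an explicit return type.
squared_udf = sketch_udf(squared, "long")


# Bare decorator style; falls back to the default return type.
@sketch_udf
def cubed(x):
    return x * x * x


# Decorator style with an explicit return type.
@sketch_udf("long")
def doubled(x):
    return x + x
```

The dispatch hinges on whether the first argument is the function itself or a return type, which is why all three spellings can share one name.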