Kotlin / kotlin-spark-api

This project provides Kotlin bindings and several extensions for Apache Spark. We are looking to have this as a part of Apache Spark 3.x.
Apache License 2.0
455 stars, 34 forks

`kotlin.Any! is unsupported` error when trying to create a DataFrame #210

Closed. barrettc closed this 10 months ago

barrettc commented 10 months ago

Apologies if this is a newbie question.

I have successfully created a Spark job that performs a page rank algorithm using Spark and GraphX. I'm able to print out the page ranks like:

        val ranks: VertexRDD<Any> = graphOps.pageRank(0.0001, 0.15).vertices()
        println("Page Rank")
        ranks
            .toJavaRDD()
            .sortBy({ it._2 }, false, vertices.numPartitions)
            .collect()
            .forEach { println(it) }

I would now like to create a new DataFrame for further processing. When I do:

        ranks.toDF("id", "pageRank")

I get the error:

Exception in thread "main" java.lang.IllegalArgumentException: kotlin.Any! is unsupported
    at org.jetbrains.kotlinx.spark.api.EncodingKt.schema(Encoding.kt:359)
    at org.jetbrains.kotlinx.spark.api.EncodingKt.schema(Encoding.kt:309)
    at org.jetbrains.kotlinx.spark.api.EncodingKt.schema$default(Encoding.kt:186)
    at org.jetbrains.kotlinx.spark.api.EncodingKt$memoizedSchema$1.invoke(Encoding.kt:368)
    at org.jetbrains.kotlinx.spark.api.EncodingKt$memoizedSchema$1.invoke(Encoding.kt:367)
    at org.jetbrains.kotlinx.spark.api.Memoize1.invoke(Encoding.kt:377)
    at org.jetbrains.kotlinx.spark.api.EncodingKt.generateEncoder(Encoding.kt:146)

I'm not sure why this would not be allowed or what I should do to work around it. Any hints are appreciated.

barrettc commented 10 months ago

Sorry for the noise, this was user error. I just needed to cast my DF columns.
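
For anyone landing here with the same error, here is a minimal sketch of what that cast can look like. The encoder is asked for a schema for `Tuple2<Any, Any>` because `ranks` is a `VertexRDD<Any>`, and `Any` has no Spark type. Narrowing each value to a concrete type before building the Dataset avoids that. `PageRankRow`, `toPageRankRow` and `pageRanksToDS` are hypothetical names made up for this example. It also assumes the vertex ids are `Long` and the ranks are `Double`, which is what GraphX PageRank normally produces, and that the `KSparkSession` scope of the kotlin-spark-api release in use offers `toDS()` on a `JavaRDD`.

```kotlin
import org.apache.spark.graphx.VertexRDD
import org.apache.spark.sql.Dataset
import org.jetbrains.kotlinx.spark.api.KSparkSession

// Hypothetical row type: concrete field types give the encoder something it can map.
data class PageRankRow(val id: Long, val pageRank: Double)

// Hypothetical helper: narrows the untyped GraphX values to concrete types.
// Assumption: ids are Long and ranks are Double; anything else throws ClassCastException.
fun toPageRankRow(id: Any, rank: Any): PageRankRow =
    PageRankRow(id as Long, rank as Double)

// Hypothetical helper, meant to be called inside a `withSpark { ... }` block,
// where `this` is a KSparkSession and its `toDS()` extension is in scope.
fun KSparkSession.pageRanksToDS(ranks: VertexRDD<Any>): Dataset<PageRankRow> =
    ranks
        .toJavaRDD()                                // JavaRDD<Tuple2<Any, Any>>
        .map { toPageRankRow(it._1(), it._2()) }    // JavaRDD<PageRankRow>
        .toDS()                                     // schema now derived from PageRankRow
```

Inside `withSpark { ... }` this would be used as `pageRanksToDS(ranks).show()`. The resulting columns take their names from the data class fields, so they come out as `id` and `pageRank` without passing names explicitly.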