Kotlin / dataframe

Structured data processing in Kotlin
https://kotlin.github.io/dataframe/overview.html
Apache License 2.0

[Compiler plugin] Propagate nullability in toDataFrame tree conversion #942

Closed koperagen closed 2 weeks ago

koperagen commented 2 weeks ago

There are three cases where the plugin assumed a non-null column type when the column should actually be nullable. With this fix, the compile-time schema matches the runtime schema.
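
To illustrate the kind of propagation the title refers to, here is a minimal stdlib-only sketch. It does not use the DataFrame library and is not necessarily one of the three cases fixed here; `flattenP` is a hypothetical helper that unfolds the path `a.p` by hand. The point is only that a property declared as non-null `Int` yields a nullable column once its parent object can be null.

```kotlin
// Hypothetical model classes for illustration only.
class Subtree(val p: Int)
class Root(val a: Subtree)

// Hand-written stand-in for unfolding the path a.p into a flat column.
// p is declared as Int, but a null Root leaves nothing to read,
// so the resulting element type has to be Int?.
fun flattenP(roots: List<Root?>): List<Int?> = roots.map { it?.a?.p }

fun main() {
    val roots = listOf(Root(Subtree(123)), null)
    println(flattenP(roots)) // prints [123, null]
}
```

A schema that declared this column as `Int` would disagree with the values actually produced at runtime.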

Jolanrensen commented 2 weeks ago

I do see one potential bug: how does the compiler plugin receive the type T from toDataFrame? Does it get the full type information, as it would when the function is actually called, or does it assume that the type can be taken from the first generic type argument of the receiver?

In other words, how does it handle something like this?

import org.jetbrains.kotlinx.dataframe.*
import org.jetbrains.kotlinx.dataframe.annotations.*
import org.jetbrains.kotlinx.dataframe.api.*
import org.jetbrains.kotlinx.dataframe.io.*

@DataSchema
data class D(
    val s: String
)

class Subtree(
    val p: Int,
    val l: List<Int>,
    val ld: List<D>,
)

class Root(val a: Subtree)

class MyList(val l: List<Root?>): List<Root?> by l

fun box(): String {
    val l = listOf(
        Root(Subtree(123, listOf(1), listOf(D("ff")))),
        null
    )
    val df = MyList(l).toDataFrame(maxDepth = 2)
    df.compareSchemas(strict = true)
    return "OK"
}
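
The distinction can be reproduced with the standard library alone. The sketch below is a runtime analogue, not the plugin's implementation: `elementType` is a hypothetical helper that mirrors the shape `inline fun <reified T> Iterable<T>.…`, which is assumed here to be how `toDataFrame` is declared. The receiver type `MyList` has no type arguments of its own, so there is no "first index" to read, yet call resolution still infers `T = Root?` from the `List<Root?>` supertype.

```kotlin
import kotlin.reflect.KType
import kotlin.reflect.typeOf

class Root(val a: Int)

// Same pattern as in the example above: a non-generic class that is a List<Root?>.
class MyList(val l: List<Root?>) : List<Root?> by l

// Hypothetical helper: captures the T that the compiler infers for the call.
inline fun <reified T> Iterable<T>.elementType(): KType = typeOf<T>()

fun main() {
    val list = MyList(listOf(Root(1), null))

    // The receiver's own type carries no generic arguments.
    println(typeOf<MyList>().arguments.isEmpty()) // prints true

    // The inferred T comes from the supertype List<Root?> and is nullable.
    println(list.elementType().isMarkedNullable) // prints true
}
```

If type extraction only looked at the receiver's own type arguments, it would find nothing for `MyList`; resolving `T` through the supertype is what preserves the nullability of `Root?`.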

koperagen commented 2 weeks ago

This is an interesting question. I updated the PR with a fix.