Kotlin / dataframe

Structured data processing in Kotlin
https://kotlin.github.io/dataframe/overview.html
Apache License 2.0

Pivot with `default()` of different type does not re-infer type #734

Closed Jolanrensen closed 2 weeks ago

Jolanrensen commented 2 weeks ago

Relates to https://github.com/Kotlin/dataframe/issues/713

Given

val df = dataFrameOf("firstName", "lastName", "age", "city", "weight", "isHappy")(
    "Alice", "Cooper", 15, "London", 54, true,
    "Bob", "Dylan", 45, "Dubai", 87, true,
    "Charlie", "Daniels", 20, "Moscow", null, false,
    "Charlie", "Chaplin", 40, "Milan", null, true,
    "Bob", "Marley", 30, "Tokyo", 68, true,
    "Alice", "Wolf", 20, null, 55, false,
    "Charlie", "Byrd", 30, "Moscow", 90, true
).group("firstName", "lastName").into("name")
---
df.pivot { city }.groupBy { name }.default(0).min()

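With the result stored in `pivoted`, we get `pivoted.city.London.isHappy.type() == typeOf<Boolean>()`, while the column values are `[true, 0, 0, 0, 0, 0, 0]`. The column type is not re-inferred after the `Int` default is filled in.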

I narrowed the issue down to `concatImpl()`. It collects column types only when `col != null`. When `defaultValue` is used instead, its type is not added to the set of types, so `guessType` is skipped.
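To illustrate the idea (this is not the library's code), here is a minimal standalone sketch. `Chunk`, `typesIgnoringDefault` and `typesIncludingDefault` are hypothetical names made up for this example. It only assumes that a missing column is represented as `null` and later filled with the default value.

```kotlin
import kotlin.reflect.KClass

// Hypothetical stand-in for a typed column; not a class from the dataframe library.
data class Chunk(val type: KClass<*>, val values: List<Any?>)

// Collects types only from columns that are present, mirroring the behavior described above:
// the default value that fills the gaps never contributes its type.
fun typesIgnoringDefault(chunks: List<Chunk?>): Set<KClass<*>> =
    chunks.filterNotNull().map { it.type }.toSet()

// Also records the default value's type whenever it stands in for a missing column,
// so a later type-guessing step can see that the types are mixed.
fun typesIncludingDefault(chunks: List<Chunk?>, defaultValue: Any?): Set<KClass<*>> {
    val types = chunks.filterNotNull().map { it.type }.toMutableSet()
    if (defaultValue != null && chunks.any { it == null }) types += defaultValue::class
    return types
}

fun main() {
    // One Boolean column is present, two are missing and would be filled with the default 0.
    val chunks = listOf(Chunk(Boolean::class, listOf(true)), null, null)
    println(typesIgnoringDefault(chunks))      // only the Boolean type is collected
    println(typesIncludingDefault(chunks, 0))  // Boolean and Int are both collected
}
```

With only one type in the set there is nothing to trigger re-inference, which matches the `Boolean` column holding `0` values. Once the default's type is in the set, the mismatch becomes visible.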

Fixing this could fix 3 tests in https://github.com/Kotlin/dataframe/issues/713