Kotlin / dataframe

Structured data processing in Kotlin
https://kotlin.github.io/dataframe/overview.html
Apache License 2.0

DataFrame.parse should allow users to pass their custom parsers #741

Open koperagen opened 1 week ago

koperagen commented 1 week ago

Suggested API:

class MyWrapper(val value: Int)

// either register the parser globally
DataFrame.parsers.add(
    stringParser { s -> if (s.endsWith("%")) s.dropLast(1).toIntOrNull()?.let { MyWrapper(it) } else null }
)

// or pass a parser instance here
val df = dataFrameOf("a", "b")("55%", "12%").parse(/*ParserOptions(...)*/)
df["a"].type() shouldBe typeOf<MyWrapper>()
df["b"].type() shouldBe typeOf<MyWrapper>()

This would save the effort of manually selecting columns for conversion, or of writing a complex column selector that has to work out which columns can be parsed, as in the current workaround:

// toPercentageOrNull is a user-defined (String) -> YourType? extension
dataFrameOf("a", "b")("55%", "12%").convert {
    colsAtAnyDepth()
        .colsOf<String>()
        .filter { col ->
            col.values().all { it.toPercentageOrNull() != null }
        }
}.with { it.toPercentageOrNull()!! }

Instead, all you'd need to define is a (String) -> YourType? function.
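
For illustration, here is a minimal sketch of such a function in plain Kotlin, without the DataFrame library. `Percentage` stands in for `MyWrapper`, and `tryParseColumn` is a hypothetical helper (not part of any existing API) that only mimics the behavior one would presumably expect from `parse`: a column is converted only if every value is accepted by the parser, otherwise it is left alone.

```kotlin
// Hypothetical wrapper type, standing in for MyWrapper from the proposal.
data class Percentage(val value: Int)

// The only function a user would have to write: (String) -> Percentage?
fun String.toPercentageOrNull(): Percentage? =
    if (endsWith("%")) dropLast(1).toIntOrNull()?.let { Percentage(it) } else null

// Hypothetical helper, NOT part of the DataFrame API: applies the parser to
// every value and returns null as soon as one value cannot be parsed, so the
// caller can keep the original String column in that case.
fun <T : Any> tryParseColumn(values: List<String>, parser: (String) -> T?): List<T>? =
    values.map { parser(it) ?: return null }

fun main() {
    // Every value parses, so the whole column is converted.
    println(tryParseColumn(listOf("55%", "12%"), String::toPercentageOrNull))
    // prints [Percentage(value=55), Percentage(value=12)]

    // One value does not parse, so the column stays untouched.
    println(tryParseColumn(listOf("55%", "abc"), String::toPercentageOrNull))
    // prints null
}
```

With a custom-parser hook, this all-or-nothing check would live inside `parse` rather than in a hand-written column selector.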