DataColumn nullability in JDBC

I'd argue that KType nullability should always be derived from the actual column values. https://github.com/Kotlin/dataframe/blob/master/dataframe-jdbc/src/main/kotlin/org/jetbrains/kotlinx/dataframe/io/readJdbc.kt#L597 This is done by infer = Infer.Nulls.

My reasoning mostly relates to notebooks.

Pros: you don't have to handle nullable values if the given snapshot doesn't contain any. This is very convenient if you just want to work with a specific fragment of the data.

Cons: imagine you want to rerun the same notebook, but this time the data has nulls. Now you have to modify your code to handle them, or you get a compilation error.

So the desirable behavior depends on your use case: exploring data once vs. reusing a notebook. My suggestion: to support reusability of notebooks, the JDBC integration should have a method to import the data schema from the DB schema, the same way the OpenAPI support does.

Things to consider here: it's already possible to write (or generate and edit) a data schema so that notebooks can be rerun without problems. There are other operations that work like this: add, convert, and other functions create a nullable KType only if there are nulls, and so do other data sources (see the discussion in the context of Arrow, with an additional argument about KType nullability: https://github.com/Kotlin/dataframe/issues/428).

public inline fun <reified R, T> DataFrame<T>.add(
    name: String,
    noinline expression: AddExpression<T, R>
): DataFrame<T> = add(name, Infer.Nulls, expression)
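
To make the trade-off concrete, here is a minimal sketch of what value-based nullability inference looks like from the caller's side. It assumes the Kotlin DataFrame library is on the classpath; the column names ("id", "label") and the sample values are made up for illustration. Both expressions have the static type String?, but per the behavior described above, the resulting column type should be nullable only when a null is actually produced.

```kotlin
import org.jetbrains.kotlinx.dataframe.api.*

fun main() {
    // Hypothetical sample data: a single Int column named "id".
    val df = dataFrameOf("id")(1, 2, 3)

    // The expression type is String?, but no row yields null here,
    // so with Infer.Nulls the column is expected to be non-nullable.
    val noNulls = df.add("label") { if ("id"<Int>() > 0) "positive" else null }
    println(noNulls["label"].type())

    // Same expression type, but the first row yields null,
    // so the column is expected to be nullable.
    val withNulls = df.add("label") { if ("id"<Int>() > 1) "big" else null }
    println(withNulls["label"].type())
}
```

The same two-line difference in the data is what turns a working notebook cell into a compilation error on rerun, which is the reusability concern raised in the issue.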

DataColumn nullability in JDBC #541