Kotlin / dataframe

Structured data processing in Kotlin
https://kotlin.github.io/dataframe/overview.html
Apache License 2.0
806 stars, 55 forks

Add a User Guide "How to handle large CSV?" #765

Open zaleslaw opened 2 months ago

zaleslaw commented 2 months ago

Users often ask about the limitations of KDF (Kotlin DataFrame) when handling large dataframes.

The User Guide should contain recommendations and code snippets to improve the user path here.

Related to #141.
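
As an illustration of the kind of snippet such a guide could include, here is a minimal sketch of chunked processing, where only one slice of the file is held in memory at a time. It uses only the Kotlin standard library. `forEachCsvChunk` is a hypothetical helper, not part of the Kotlin DataFrame API, and the comma split assumes there are no quoted fields containing commas or newlines.

```kotlin
import java.io.File

// Hypothetical helper (not a Kotlin DataFrame function): streams a CSV file and
// passes rows to `handle` in fixed-size chunks, so only one chunk is in memory.
fun forEachCsvChunk(
    file: File,
    chunkSize: Int = 100_000,
    handle: (header: List<String>, rows: List<List<String>>) -> Unit,
) {
    file.useLines { lines ->
        val iterator = lines.iterator()
        if (!iterator.hasNext()) return
        // Naive split: assumes no quoted fields with embedded commas or newlines
        val header = iterator.next().split(',')
        iterator.asSequence()
            .map { it.split(',') }
            .chunked(chunkSize) // lazy: the next chunk is read only when requested
            .forEach { chunk -> handle(header, chunk) }
    }
}

fun main() {
    // Tiny stand-in for a large file so the sketch runs as-is
    val file = File.createTempFile("sample", ".csv").apply {
        writeText("id,name\n1,a\n2,b\n3,c\n")
        deleteOnExit()
    }
    var total = 0
    forEachCsvChunk(file, chunkSize = 2) { header, rows ->
        println("${header.size} columns, ${rows.size} rows in this chunk")
        total += rows.size
    }
    println("rows: $total")
}
```

Each chunk could then be aggregated or converted into a smaller dataframe before the next one is read. The right chunk size depends on row width and available heap.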

Jolanrensen commented 3 weeks ago

I tried reading the 800+ MB CSV file from here and kept running into OOM (out-of-memory) errors. It might be a good test case for loading a large file into a DataFrame and working with it.

It contains about 34,959,672 rows of data.

(Edit: It only runs out of memory in JUnit tests. It works fine, given enough memory, from a main() function or a notebook.)
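
One plausible explanation for the difference is that Gradle runs tests in a separate worker JVM with its own heap limit, which is typically smaller than what a main() function or a notebook kernel gets. A minimal sketch of raising that limit, assuming the project is built with the Gradle Kotlin DSL; the `4g` value is a placeholder, not a measured requirement for this file:

```kotlin
// build.gradle.kts
tasks.withType<Test>().configureEach {
    // Applies only to the forked test JVM; "4g" is an assumed value, tune it for your data
    maxHeapSize = "4g"
}
```

This only changes the memory available to tests. It does not reduce how much memory reading the file needs.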