JuliaData / CSV.jl

Utility library for working with CSV and other delimited files in the Julia programming language
https://csv.juliadata.org/

CSV.write should conditionally convert type unstable iterators #1087

Open robsmith11 opened 1 year ago

robsmith11 commented 1 year ago

After trying to save a 6-column DataFrame to a 5GB CSV file, I had to kill the Julia session after a few minutes because it was swapping heavily on my laptop, which has 16GB of memory.

As pointed out here [1], the memory usage can be significantly reduced by making the iteration type-stable with the Tables.columntable function. I was able to write my file in a few seconds after making that change.
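
For reference, the change amounts to something like the following. This is a minimal sketch on a small made-up DataFrame, not my real data; the column names and the temporary path are placeholders I picked for illustration.

```julia
using CSV, DataFrames, Tables

# Small stand-in for a narrow but long DataFrame (placeholder data).
df = DataFrame(id = 1:1_000, value = rand(1_000), label = fill("a", 1_000))

# Tables.columntable returns a NamedTuple of column vectors, so the column
# names and element types are part of the table's type when CSV.write
# iterates over the rows.
path = joinpath(mktempdir(), "out.csv")
CSV.write(path, Tables.columntable(df))
```

The only difference from the slow version is wrapping `df` in `Tables.columntable` before passing it to `CSV.write`.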

Since it's quite common to work with narrow but long DataFrames, shouldn't CSV.write just check the table's dimensions and decide when to convert it to a type-stable table?
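
A rough sketch of what I have in mind, written as a user-side wrapper. `write_maybe_stable` and `MAX_STABLE_COLS` are hypothetical names that do not exist in CSV.jl, and the threshold of 100 columns is an arbitrary guess rather than a measured value; I would expect very wide tables to be better left unconverted because of compile time, but I have not benchmarked that.

```julia
using CSV, DataFrames, Tables

# Hypothetical threshold, NOT part of CSV.jl; chosen only for illustration.
const MAX_STABLE_COLS = 100

# Hypothetical helper: convert narrow tables to a NamedTuple of columns
# before writing, and pass wider tables through unchanged.
function write_maybe_stable(path, table; kwargs...)
    ncols = length(Tables.columnnames(Tables.columns(table)))
    src = ncols <= MAX_STABLE_COLS ? Tables.columntable(table) : table
    return CSV.write(path, src; kwargs...)
end

# Usage on a small placeholder DataFrame.
df = DataFrame(a = 1:3, b = ["x", "y", "z"])
out = write_maybe_stable(joinpath(mktempdir(), "narrow.csv"), df)
```

A check inside CSV.write itself would presumably also want to look at the row count, which this sketch ignores.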

[1] https://stackoverflow.com/questions/65584387/julia-csv-write-very-memory-inefficient