go-gota / gota

Gota: DataFrames and data wrangling in Go (Golang)

ReadAll has a bad failure condition #58

Open 17twenty opened 6 years ago

17twenty commented 6 years ago

I was using the project and noticed a weird situation on low-memory machines: rows from the bottom of CSV files ended up missing, and which rows were missing differed between runs.

I'm pretty sure this is the culprit. There's a nice tangential article on why ReadAll-style functions are considered bad.

Looking at the csv.ReadAll function, it will allocate up to max memory and then just drop records on the floor. Because of the interface gota provides, there's no way to pass a Reader()-style interface that would let us work around it.

Any thoughts on fixing it?