rocketlaunchr / dataframe-go

DataFrames for Go: For statistics, machine-learning, and data manipulation/exploration

Add support for CSV without headers row #47

Closed: pasdam closed this 3 years ago

pasdam commented 3 years ago

This adds support for importing CSV files that have no header row.

If the ColumnNames option is specified, it is used to set the series names instead of reading them from the first row.

It also moves the if row == 0 check outside the for loop, so that it is not evaluated for every row read.

pjebs commented 3 years ago

Thanks for the contribution. I'm still looking at it.

The reason for the delay is that I'm wondering whether we should instead artificially manipulate r (the io.ReadSeeker) so that it supplies the headers as the first row.

pasdam commented 3 years ago

Hey, no worries, take your time. Out of curiosity, what would be the advantage of "artificially manipulating" the io.ReadSeeker? It sounds a bit hacky.

pjebs commented 3 years ago

To answer your question: You can see how LimitReader works: https://golang.org/pkg/io/#LimitReader

When you look at the code, it does the opposite of what we are trying to achieve. It prematurely limits how many bytes can be read, whereas we want to artificially prepend bytes and so increase the number of readable bytes.
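To make the contrast concrete, here is a small sketch using only the standard library. The helper names limited and prepended are made up for illustration, and io.ReadAll assumes Go 1.16 or later. io.LimitReader shortens what can be read from a stream; io.MultiReader can put extra bytes in front of it.

```go
package main

import (
	"fmt"
	"io"
	"strings"
)

// limited returns at most n bytes of s by wrapping it in io.LimitReader.
func limited(s string, n int64) (string, error) {
	b, err := io.ReadAll(io.LimitReader(strings.NewReader(s), n))
	return string(b), err
}

// prepended returns s with prefix in front of it by chaining two readers
// with io.MultiReader.
func prepended(prefix, s string) (string, error) {
	b, err := io.ReadAll(io.MultiReader(strings.NewReader(prefix), strings.NewReader(s)))
	return string(b), err
}

func main() {
	data := "1,2\n3,4\n"

	short, err := limited(data, 4)
	if err != nil {
		panic(err)
	}
	long, err := prepended("a,b\n", data)
	if err != nil {
		panic(err)
	}
	fmt.Printf("%q\n%q\n", short, long)
}
```

Note that io.MultiReader returns a plain io.Reader, so the result cannot be rewound.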

The standard library already provides MultiReader: https://golang.org/pkg/io/#MultiReader. What I really wanted was a MultiReaderSeeker, because the LargeDataset option needs to reset the reader back to the 0th position. The standard library doesn't offer one, so I found a third-party implementation.

pasdam commented 3 years ago

I understood the usage of the MultiSeeker. I was more interested in the advantage of simulating the header read versus simply reading the first line as the header when needed.

pjebs commented 3 years ago

It was mostly to keep the rest of the code minimally touched, because it is well tested and widely used, and artificially adding headers was the least intrusive change.

But I also felt it was the best approach based on the 'philosophy' of the io package.

pjebs commented 3 years ago

@pasdam If you want to work on other features let me know. There is a todo list.

pasdam commented 3 years ago

Sure, I will

pjebs commented 3 years ago

You said earlier: "Hey, no worries, take your time." Are you Australian?