go-gota / gota

Gota: DataFrames and data wrangling in Go (Golang)

Records method but for float64 #44

Closed MaxHalford closed 6 years ago

MaxHalford commented 6 years ago

Hello,

First of all I would like to show my appreciation for this library, it does a lot of redundant heavy-lifting.

For a machine learning project I'm using gota to load a CSV file and feed the data into an algorithm. The thing is, I need to convert a DataFrame to a [][]float64 slice of slices. I noticed there is a DataFrame.Records method that returns the DataFrame as a slice of slices of strings. Would it be possible to have the same thing for float64? I think this would be really practical because it is a common use case in machine learning applications.

Regards.

kniren commented 6 years ago

Hey @MaxHalford, thank you very much for your interest and your nice words. In the README you can find a use case for transforming a DataFrame into something that can be used with Gonum.

Now, regarding your feature request: you want to transform a DataFrame into a [][]float64. I guess the first question is, do you need it to be row-wise or column-wise? If it's the latter, you could loop through the columns of the DataFrame. Each column is a Series, so you can call the Series.Float method on it to get a slice of floats, and collecting those gives you a slice of slices of floats.

What are your thoughts on this?

MaxHalford commented 6 years ago

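Yes, in the end I used a loop.

// dataFrameToFloat64 returns the DataFrame column-wise: X[i] holds the
// values of the i-th column as float64.
func dataFrameToFloat64(df dataframe.DataFrame) [][]float64 {
    X := make([][]float64, df.Ncol())
    for i, col := range df.Names() {
        X[i] = df.Col(col).Float()
    }
    return X
}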

I guess there isn't much point putting this in the gota API because it's so easy to implement. Thanks a lot for your answer!
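
For anyone who needs the row-wise layout instead, the column-wise result can be transposed afterwards. Below is a minimal sketch in plain Go. Note that columnsToRows is a hypothetical helper and not part of gota, and it assumes every column has the same length (which should hold for columns taken from a single DataFrame).

```go
package main

import "fmt"

// columnsToRows is a hypothetical helper (not part of gota). It transposes a
// column-major matrix, such as the one returned by dataFrameToFloat64, into
// a row-major one. It assumes every column has the same length.
func columnsToRows(cols [][]float64) [][]float64 {
	if len(cols) == 0 {
		return nil
	}
	nrow := len(cols[0])
	rows := make([][]float64, nrow)
	for r := range rows {
		rows[r] = make([]float64, len(cols))
		for c, col := range cols {
			rows[r][c] = col[r]
		}
	}
	return rows
}

func main() {
	// Two columns of three values each, standing in for the output of
	// dataFrameToFloat64.
	cols := [][]float64{
		{1, 2, 3},
		{10, 20, 30},
	}
	fmt.Println(columnsToRows(cols)) // prints [[1 10] [2 20] [3 30]]
}
```

Each inner slice of the result is one observation, which is the shape many learning algorithms expect as input.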