go-gota / gota

Gota: DataFrames and data wrangling in Go (Golang)

Trim the spaces when getting the list of column names? #173

Closed streamdp closed 2 years ago

streamdp commented 2 years ago

Why don't we trim the spaces when getting the list of column names? https://github.com/go-gota/gota/blob/f70540952827cfc8abfa1257391fd33284300b24/dataframe/dataframe.go#L1555

When working with a data frame, it's really helpful not to think about extra spaces in column names. Maybe we can improve this elsewhere (for example here: https://github.com/go-gota/gota/blob/f70540952827cfc8abfa1257391fd33284300b24/dataframe/dataframe.go#L2164 ), but I want to ask you if this is a special feature :)

chrmang commented 2 years ago

Hi @streamdp ,

to be honest, I have never had an issue with leading or trailing spaces in column names. How did extra spaces get in your way?

If we want to trim the column names, we should do it as early as possible. Trimming names in Names() is IMHO too late. But it's possible to set Series.Name - in this case we shouldn't mutate the Name.

Do you have an issue with leading / trailing spaces in column names? Where did the names come from? Can you give an example?

streamdp commented 2 years ago

Hi @chrmang, I started learning ML in Go and found an example that uses your library together with golearn. The example, Golang for Machine Learning, solves the iris classification problem using the dataset iris_headers.csv. The column names in this dataset have extra spaces. I think this is a dataset issue, but from a tool usability point of view, it would be a good idea to take care of it. You are right, this should be done earlier. For example: https://github.com/go-gota/gota/blob/f70540952827cfc8abfa1257391fd33284300b24/dataframe/dataframe.go#L1212 Right? We couldn't do this any earlier, in ReadAll(), because at that point we don't know whether the file has a header.

chrmang commented 2 years ago

Yes, this line is the best one to trim header names.

I will fix this in the next few days. If you want to help, you are welcome to open a pull request.
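
For anyone hitting this before a fix lands, here is a minimal sketch of the idea discussed above: trim only the header row, and only once we know the input has a header. It uses just the standard library and is not gota's actual code. The names `trimHeader` and `readRecords` and the `hasHeader` flag are hypothetical, chosen for illustration.

```go
package main

import (
	"encoding/csv"
	"fmt"
	"strings"
)

// trimHeader returns a copy of the header row with leading and trailing
// whitespace removed from every column name. The input slice is not modified.
// (Hypothetical helper, not part of gota.)
func trimHeader(header []string) []string {
	trimmed := make([]string, len(header))
	for i, name := range header {
		trimmed[i] = strings.TrimSpace(name)
	}
	return trimmed
}

// readRecords parses CSV text and, if hasHeader is true, trims the names in
// the first row only. Data rows are left exactly as parsed.
// (Hypothetical helper, not part of gota.)
func readRecords(data string, hasHeader bool) ([][]string, error) {
	records, err := csv.NewReader(strings.NewReader(data)).ReadAll()
	if err != nil {
		return nil, err
	}
	if hasHeader && len(records) > 0 {
		records[0] = trimHeader(records[0])
	}
	return records, nil
}

func main() {
	// A header with stray spaces, similar in spirit to the dataset above.
	data := "sepal_length, sepal_width , species\n5.1,3.5,setosa\n"
	records, err := readRecords(data, true)
	if err != nil {
		panic(err)
	}
	fmt.Printf("%q\n", records[0])
}
```

The cleaned records could then be handed to the data frame loader as usual. Trimming only the first row keeps the data values untouched, which matches the point that names set explicitly through Series.Name should not be mutated.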