Hi @streamdp ,
To be honest, I have never had an issue with leading or trailing spaces in column names. How did the extra spaces get in your way?
If we want to trim the column names, we should do it as early as possible. Trimming names in Names() is IMHO too late. It is also possible to set Series.Name directly, and in that case we shouldn't mutate the Name.
Do you have an issue with leading / trailing spaces in column names? Where did the names come from? Can you give an example?
Hi @chrmang, I started learning ML in Go and found an example that uses your library together with golearn. The example, Golang for Machine Learning, solves the iris classification problem using the iris_headers.csv dataset, and the column names in that dataset contain extra spaces. I think this is a dataset issue, but from a usability point of view it would be a good idea for the tool to take care of it. You are right, this should be done earlier. For example: https://github.com/go-gota/gota/blob/f70540952827cfc8abfa1257391fd33284300b24/dataframe/dataframe.go#L1212 Right? We couldn't do this any earlier, in ReadAll(), because at that point we don't yet know whether the file has a header.
Yes, this line is the best one to trim header names.
I will fix this in the next few days. If you want to help, you are welcome to open a pull request.
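To illustrate the idea of trimming at the point where the header row is split off from the data, here is a minimal sketch that uses only the standard library. It is not gota code: `trimHeader` and `splitRecords` are hypothetical helper names, and the sample CSV is made up for the example.

```go
package main

import (
	"encoding/csv"
	"fmt"
	"strings"
)

// trimHeader returns a copy of the header row with leading and trailing
// whitespace removed from every column name.
// Hypothetical helper, not part of gota.
func trimHeader(header []string) []string {
	trimmed := make([]string, len(header))
	for i, name := range header {
		trimmed[i] = strings.TrimSpace(name)
	}
	return trimmed
}

// splitRecords separates the header row from the data rows. Names are only
// trimmed when the caller says the first record is a header, because before
// that point we cannot tell header cells from data cells.
// Hypothetical helper, not part of gota.
func splitRecords(records [][]string, hasHeader bool) (header []string, rows [][]string) {
	if !hasHeader || len(records) == 0 {
		return nil, records
	}
	return trimHeader(records[0]), records[1:]
}

func main() {
	// Made-up sample with stray spaces around two of the column names.
	input := "sepal_length, sepal_width , species\n5.1,3.5,setosa\n"

	records, err := csv.NewReader(strings.NewReader(input)).ReadAll()
	if err != nil {
		panic(err)
	}

	header, rows := splitRecords(records, true)
	fmt.Printf("%q\n", header) // ["sepal_length" "sepal_width" "species"]
	fmt.Println(len(rows))     // 1
}
```

Only the header cells are trimmed here; data cells are left untouched, and nothing happens at all when `hasHeader` is false.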
Why don't we trim the spaces when getting the list of column names? https://github.com/go-gota/gota/blob/f70540952827cfc8abfa1257391fd33284300b24/dataframe/dataframe.go#L1555
When working with a data frame, it's really helpful not to have to think about extra spaces in column names. Maybe we can improve this elsewhere (for example here: https://github.com/go-gota/gota/blob/f70540952827cfc8abfa1257391fd33284300b24/dataframe/dataframe.go#L2164 ), but first I want to ask whether keeping the spaces is an intentional feature :)