go-gota / gota

Gota: DataFrames and data wrangling in Go (Golang)

Column Order Question #219

Open Amnesiac9 opened 10 months ago

Amnesiac9 commented 10 months ago

Hello,

When using GroupBy() followed by Aggregation(), the column order of the output is not what I'd expect coming from Pandas.

Is there a reason the order is not preserved, or why the new aggregated columns are not simply appended in the order they are entered?

Gota seems to append columns in alphabetical order. Is there a reason for that extra sorting step, which the user doesn't explicitly call?

The code below does not output the columns in the order they are listed in the []string:

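// Requires: import "github.com/go-gota/gota/dataframe"

// cleanDataframe groups rows by CustomerId and ShortSku, then aggregates the
// listed columns with the aggregation type at the same position.
func cleanDataframe(df *dataframe.DataFrame) (*dataframe.DataFrame, error) {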
    group := *df.GroupBy("CustomerId", "ShortSku")
    if group.Err != nil {
        return nil, group.Err
    }

    agg_df := group.Aggregation([]dataframe.AggregationType{
        dataframe.Aggregation_SUM,
        dataframe.Aggregation_FIRST,
        dataframe.Aggregation_FIRST,
        dataframe.Aggregation_FIRST,
        dataframe.Aggregation_FIRST,
        dataframe.Aggregation_FIRST,
        dataframe.Aggregation_FIRST,
        dataframe.Aggregation_FIRST,
        dataframe.Aggregation_FIRST,
        dataframe.Aggregation_FIRST,
        dataframe.Aggregation_FIRST,
        dataframe.Aggregation_FIRST,
        dataframe.Aggregation_SUM,
        dataframe.Aggregation_COUNT},
        []string{
            "Quantity",
            "CustomerName",
            "Address1",
            "Address2",
            "City",
            "State",
            "Zip",
            "Country",
            "Phone",
            "Email",
            "ClubEnrollment",
            "AccountType",
            "Spend",
            "OrderCount"})
    if agg_df.Err != nil {
        return nil, agg_df.Err
    }

    return &agg_df, nil
}
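
One possible explanation for the alphabetical order, offered as an assumption rather than a confirmed description of Gota's internals: if each aggregated row is built as a Go map, the insertion order is lost, because Go maps do not keep one, and sorting the keys is the usual way to get a deterministic column order. The sketch below uses only the standard library to show that effect and one way to restore the requested order afterwards. The helper names `sortedKeys` and `reorder`, the row values, and the `Column_TYPE` output names are hypothetical, chosen only for illustration.

```go
package main

import (
	"fmt"
	"sort"
)

// sortedKeys returns the keys of a map-based row in alphabetical order.
// Go maps have no insertion order, so sorting is a common way to make the
// resulting column order deterministic.
func sortedKeys(row map[string]interface{}) []string {
	keys := make([]string, 0, len(row))
	for k := range row {
		keys = append(keys, k)
	}
	sort.Strings(keys)
	return keys
}

// reorder returns the names from want that exist in have, in the order given
// by want, followed by any remaining names from have in their current order.
func reorder(have, want []string) []string {
	present := make(map[string]bool, len(have))
	for _, name := range have {
		present[name] = true
	}
	used := make(map[string]bool, len(have))
	out := make([]string, 0, len(have))
	for _, name := range want {
		if present[name] && !used[name] {
			out = append(out, name)
			used[name] = true
		}
	}
	for _, name := range have {
		if !used[name] {
			out = append(out, name)
			used[name] = true
		}
	}
	return out
}

func main() {
	// Hypothetical aggregated row; the values and output names are illustrative.
	row := map[string]interface{}{
		"CustomerId":   "C1",
		"ShortSku":     "S1",
		"Quantity_SUM": 3.0,
		"City_FIRST":   "Napa",
		"Spend_SUM":    42.0,
	}

	have := sortedKeys(row)
	fmt.Println(have) // alphabetical, not the order the columns were requested in

	want := []string{"CustomerId", "ShortSku", "Quantity_SUM", "City_FIRST", "Spend_SUM"}
	fmt.Println(reorder(have, want)) // requested order restored
}
```

With Gota itself, a similar result should be reachable by passing the desired column names, in order, to `Select` on the aggregated DataFrame. This costs one extra call after `Aggregation()`, and the exact output column names need to be checked against `Names()` first.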