Open risenW opened 4 years ago
Any updates on this? Seems like it would be a compelling feature to make this the default pandas go-to in the javascript ecosystem.
We will work on this; it's just not on the roadmap at the moment, unless someone decides to pick it up. Would you be interested?
Point me to the relevant code & I'll take a look!
tbh this is currently a bit of a blocker for me even doing a spike with danfo, but if we do it, then I'd like to know how hard it'd be to implement a pivot.
Here's a gist I wrote that pivots an array of melted objects with lodash, though I doubt it's this easy: https://gist.github.com/nite/6ffda3d61278dccfb2152f8565492009
@nite I don't think pivot_table will be that hard to implement. For the full feature, I think we first need a way to access and display a multi-index table, but we can get started without that.
From my limited knowledge of pivot tables, here is how the main functionality could be built, leaving out some of the more complicated options that pandas includes. Going by the pandas API, pivot_table(data, values=None, index=None, columns=None, aggfunc='mean'), the main functionality can be implemented as follows:
If index is given, it will be a list of column names, and we need to group the DataFrame by each of those columns. So we can keep an object that maps each column to its groupby result, e.g. {col1: df.groupby(['col1']), col2: df.groupby(['col2'])}.
If values is not given, then df.groupby([col]) for each column in index simply groups the whole DataFrame by col. If values is given, then we group only the DataFrame columns listed in values by each col from index, e.g. {col1: df.groupby(['col1']).col(values), ...}.
If columns is given, we want to group the DataFrame by more than one column, e.g. {col1: df.groupby(['col1', ...columns]), ...}. But instead of doing it all at once like groupby(['col1', ...columns]), I think we will need to loop through columns like this:
for (const column of columns) {
  // one grouping per (index column, pivot column) pair
  pivotTableGraph['col1'][column] = df.groupby(['col1', column]);
}
If aggfunc is given as a single name rather than a per-column mapping, the operation will look like df.groupby(['col']).mean(), assuming aggfunc is 'mean'. But if aggfunc is given as a mapping like {col1: 'mean', col2: 'sum'}, then we will use df.groupby(['col']).agg(aggfunc).
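To make the two aggfunc cases concrete, here is a minimal sketch in plain JavaScript over an array of row objects. It does not use danfo.js; groupRows, aggregateGroups and AGGREGATORS are hypothetical names made up for illustration, and the set of supported aggregations is an assumption.

```javascript
// Hypothetical helpers for illustration only; none of this is danfo.js API.
const AGGREGATORS = new Map([
  ["sum", (xs) => xs.reduce((a, b) => a + b, 0)],
  ["mean", (xs) => xs.reduce((a, b) => a + b, 0) / xs.length],
  ["count", (xs) => xs.length],
  ["min", (xs) => Math.min(...xs)],
  ["max", (xs) => Math.max(...xs)],
]);

// Group an array of row objects by the given key columns.
// Returns a Map from a joined key label to the rows in that group.
function groupRows(rows, keys) {
  const groups = new Map();
  for (const row of rows) {
    const label = keys.map((k) => String(row[k])).join("|");
    if (!groups.has(label)) groups.set(label, []);
    groups.get(label).push(row);
  }
  return groups;
}

// aggfunc is either one name applied to every value column ('mean'),
// or a per-column mapping such as { sales: 'sum', qty: 'max' }.
function aggregateGroups(groups, valueCols, aggfunc) {
  const out = {};
  for (const [label, members] of groups) {
    out[label] = {};
    for (const col of valueCols) {
      const name = typeof aggfunc === "string" ? aggfunc : aggfunc[col];
      if (name === undefined) continue; // column not listed in the mapping
      const fn = AGGREGATORS.get(name);
      if (!fn) throw new Error(`Unknown aggfunc: ${name}`);
      out[label][col] = fn(members.map((r) => r[col]));
    }
  }
  return out;
}
```

With a string, every value column gets the same aggregation; with a mapping, each column gets its own, and columns missing from the mapping are skipped.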
At the end we would have one large nested object holding the results of these operations; this object can be thought of as a graph.
For a concrete view of the steps above, you can check out the pivot_table
examples here: https://www.analyticsvidhya.com/blog/2020/03/pivot-table-pandas-python/ and compare them with these implementation details.
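As a rough illustration of that nested object, the sketch below builds it in plain JavaScript from an array of row objects instead of a danfo.js DataFrame. groupMean and buildPivotGraph are hypothetical names; the sketch assumes values is always given and that aggfunc is fixed to the mean.

```javascript
// Plain-JS stand-in for df.groupby(keys).col(values).mean(); not danfo.js API.
function groupMean(rows, keys, valueCols) {
  const buckets = new Map();
  for (const row of rows) {
    const label = keys.map((k) => String(row[k])).join(" / ");
    if (!buckets.has(label)) buckets.set(label, []);
    buckets.get(label).push(row);
  }
  const result = {};
  for (const [label, members] of buckets) {
    result[label] = {};
    for (const col of valueCols) {
      const total = members.reduce((acc, r) => acc + r[col], 0);
      result[label][col] = total / members.length;
    }
  }
  return result;
}

// Build the nested "graph": graph[indexCol] when no columns are given,
// otherwise graph[indexCol][column], one grouping per pair.
function buildPivotGraph(rows, { index, values, columns = [] }) {
  const graph = {};
  for (const indexCol of index) {
    if (columns.length === 0) {
      graph[indexCol] = groupMean(rows, [indexCol], values);
      continue;
    }
    graph[indexCol] = {};
    for (const column of columns) {
      graph[indexCol][column] = groupMean(rows, [indexCol, column], values);
    }
  }
  return graph;
}
```

For example, buildPivotGraph(rows, { index: ['region'], values: ['sales'], columns: ['product'] }) yields an object keyed first by 'region', then by 'product', then by the joined group label.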
@nite I think this is all we need to implement the main functionality of pivot_table
Cc: @risenW
Also to add, @nite: you would implement this in the DataFrame class here. Your output is going to be a DataFrame, so something with this signature:
/**
 * Some doc here
 * @return {DataFrame}
 */
pivot() {
  const data = this.values; // get the inner array representing the DataFrame
  // your pivot code to manipulate the data
  ...
  ...
  // return a new DataFrame with the pivoted values
  // (pivoted_data and indx are placeholders for what the pivot code above produces)
  const df = new DataFrame(pivoted_data, { columns: this.column_names, index: indx });
  return df;
}
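As one possible shape for the "pivot code" step, here is a small sketch in plain JavaScript that turns long-form row objects into a 2D array plus column and index labels. pivotRows is a hypothetical helper, not part of danfo.js, and it assumes a single index column, a single pivot column, numeric values, and the mean as the aggregation.

```javascript
// Hypothetical helper: long-form rows -> { data, columns, index }.
// Cells with no matching rows are filled with NaN.
function pivotRows(rows, indexCol, columnCol, valueCol) {
  // Unique labels, in order of first appearance.
  const index = [...new Set(rows.map((r) => r[indexCol]))];
  const columns = [...new Set(rows.map((r) => r[columnCol]))];
  const rowPos = new Map(index.map((v, i) => [v, i]));
  const colPos = new Map(columns.map((v, j) => [v, j]));

  // Running sum and count per cell so the mean can be taken at the end.
  const sums = index.map(() => columns.map(() => 0));
  const counts = index.map(() => columns.map(() => 0));
  for (const r of rows) {
    const i = rowPos.get(r[indexCol]);
    const j = colPos.get(r[columnCol]);
    sums[i][j] += r[valueCol];
    counts[i][j] += 1;
  }

  const data = sums.map((row, i) =>
    row.map((s, j) => (counts[i][j] === 0 ? NaN : s / counts[i][j]))
  );
  return { data, columns, index };
}
```

The three returned arrays are meant to line up with the data, columns and index arguments in the skeleton above, assuming the constructor accepts them in that form.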
Hi everyone, are there any updates on adding pivot functionality to a dataframe? I looked at the documentation and could not find anything on this matter. It would be fantastic to have pivoting in danfo.js.
That would also be very useful for me.
Pivot table support for DataFrame