dodger487 / dplython

dplyr for python
MIT License

Speed up grouped mutates #63

Closed: dodger487 closed this 8 years ago

dodger487 commented 8 years ago

This rewrites parts of the Later and DplyFrame code to greatly speed up mutates on grouped dataframes. In some cases I am seeing roughly a 10x improvement (for example, with 5000 groups on the 54k-row diamonds dataframe). I think we can improve performance further, hopefully by another 10x, by being smarter with the code I've added here.
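For readers unfamiliar with the term, here is a rough illustration of what a grouped mutate computes and why its cost can depend heavily on the number of groups. This is a plain-Python sketch, not the code in this PR and not dplython's implementation; the function names `grouped_mutate_naive` and `grouped_mutate_indexed` and the tiny sample rows are hypothetical. It assumes the slow path rescans every row for each group, while the faster path builds the group index once.

```python
from collections import defaultdict


def grouped_mutate_naive(rows, key, col, new_col):
    """Hypothetical slow path: rescan all rows once per group.

    Cost grows with (number of groups) * (number of rows).
    """
    out = [dict(r) for r in rows]  # copy so the input is not modified
    for group_key in {r[key] for r in rows}:
        members = [r for r in out if r[key] == group_key]
        mean = sum(r[col] for r in members) / len(members)
        for r in members:
            r[new_col] = r[col] - mean
    return out


def grouped_mutate_indexed(rows, key, col, new_col):
    """Hypothetical faster path: build the group index in one pass.

    Cost grows with the number of rows, regardless of group count.
    """
    out = [dict(r) for r in rows]
    index = defaultdict(list)  # group key -> list of row positions
    for i, r in enumerate(out):
        index[r[key]].append(i)
    for positions in index.values():
        mean = sum(out[i][col] for i in positions) / len(positions)
        for i in positions:
            out[i][new_col] = out[i][col] - mean
    return out


# Made-up sample data, loosely shaped like the diamonds columns.
rows = [
    {"cut": "Ideal", "price": 300},
    {"cut": "Ideal", "price": 500},
    {"cut": "Fair", "price": 200},
]

naive = grouped_mutate_naive(rows, "cut", "price", "price_centered")
indexed = grouped_mutate_indexed(rows, "cut", "price", "price_centered")
```

Both functions add a column holding each row's price minus its group mean, so they should return identical results; only the amount of work per group differs. Timing them on data with thousands of groups would show the gap, though the actual numbers depend on the machine and are not comparable to the 10x figure above.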