h2oai / db-benchmark

reproducible benchmark of database-like ops
https://h2oai.github.io/db-benchmark
Mozilla Public License 2.0

pandas groupby sort=False for better performance #173

Status: Closed (jangorecki closed this issue 3 years ago)

jangorecki commented 3 years ago

https://pandas.pydata.org/pandas-docs/stable/reference/api/pandas.DataFrame.groupby.html

sort : bool, default True. Sort group keys. Get better performance by turning this off. Note this does not influence the order of observations within each group. Groupby preserves the order of rows within each group.
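A minimal sketch of what the parameter changes, using a small made-up frame (the column names `id` and `v` are illustrative and are not taken from the benchmark scripts). With `sort=False` the group keys come back in order of first appearance instead of sorted order; the aggregated values are the same either way.

```python
import pandas as pd

# Hypothetical toy data; the benchmark's real datasets are much larger.
df = pd.DataFrame({
    "id": ["b", "a", "b", "c", "a"],
    "v": [1, 2, 3, 4, 5],
})

# Default behaviour: group keys are sorted in the result.
sorted_keys = df.groupby("id", sort=True)["v"].sum()

# sort=False skips sorting the keys; groups appear in order of first occurrence.
unsorted_keys = df.groupby("id", sort=False)["v"].sum()

print(list(sorted_keys.index))    # ['a', 'b', 'c']
print(list(unsorted_keys.index))  # ['b', 'a', 'c']

# Same aggregates, only the key order differs.
same_values = sorted_keys.equals(unsorted_keys.sort_index())
print(same_values)                # True
```

Whether this gives a measurable speedup in the benchmark depends on the number of groups and the key types, so it would need to be timed on the actual benchmark data.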