Closed rluce closed 1 year ago
Some options:
series.agg(quicksum)
as a faster alternativeseries.grb.quicksum()
series.grb.sum()
If we plan to implement an extension type soon (which would allow us to override series.sum()
default behaviour) then perhaps it makes sense to just do (1) for now.
This needs to be done via documentation, as I don't think we can hook our own methods onto SeriesGroupBy
unless we create an extension data type.
Ultimately, I think we should implement an MVar-backed extension array for performance reasons, at which point supporting .agg(quicksum)
may become a headache. So we should document this something along the lines of ".sum() works correctly in all cases, you may see a noteworthy performance improveent using .agg(quicksum)
for large models. However unless you are using Gurobi<10, or have disabled the extension types (once the exist), you should always use pandas' built-in sum".
Ultimately, I think we should implement an MVar-backed extension array for performance reasons, at which point supporting
.agg(quicksum)
may become a headache. So we should document this something along the lines of ".sum() works correctly in all cases, you may see a noteworthy performance improveent using.agg(quicksum)
for large models. However unless you are using Gurobi<10, or have disabled the extension types (once the exist), you should always use pandas' built-in sum".
Our experiments indicate you should always use .agg(quicksum)
. So I don't think you should tell people to "always" use pandas' build-in sum
.
Agreed, we will suggest in the documentation to use agg(quicksum) as a faster alternative to the builtin sum.
Done on the performance page of the docs, and I also mentioned it in the initial API overview
People are used to that, and we should provide a best practice pattern.