Gurobi / gurobipy-pandas

Convenience wrapper for building optimization models from pandas data
https://gurobipy-pandas.readthedocs.io/en/stable/
Apache License 2.0
83 stars 15 forks source link

Give users of quicksum a handle #32

Closed rluce closed 1 year ago

rluce commented 1 year ago

People are used to that, and we should provide a best practice pattern.

simonbowly commented 1 year ago

Some options:

  1. Document series.agg(quicksum) as a faster alternative
  2. Implement using accessors as series.grb.quicksum()
  3. Implement using accessors series.grb.sum()

If we plan to implement an extension type soon (which would allow us to override series.sum() default behaviour) then perhaps it makes sense to just do (1) for now.

simonbowly commented 1 year ago

This needs to be done via documentation, as I don't think we can hook our own methods onto SeriesGroupBy unless we create an extension data type.

Ultimately, I think we should implement an MVar-backed extension array for performance reasons, at which point supporting .agg(quicksum) may become a headache. So we should document this something along the lines of ".sum() works correctly in all cases, you may see a noteworthy performance improveent using .agg(quicksum) for large models. However unless you are using Gurobi<10, or have disabled the extension types (once the exist), you should always use pandas' built-in sum".

Dr-Irv commented 1 year ago

Ultimately, I think we should implement an MVar-backed extension array for performance reasons, at which point supporting .agg(quicksum) may become a headache. So we should document this something along the lines of ".sum() works correctly in all cases, you may see a noteworthy performance improveent using .agg(quicksum) for large models. However unless you are using Gurobi<10, or have disabled the extension types (once the exist), you should always use pandas' built-in sum".

Our experiments indicate you should always use .agg(quicksum). So I don't think you should tell people to "always" use pandas' build-in sum.

rluce commented 1 year ago

Agreed, we will suggest in the documentation to use agg(quicksum) as a faster alternative to the builtin sum.

simonbowly commented 1 year ago

Done on the performance page of the docs, and I also mentioned it in the initial API overview