Function naming consistency

rrandall1471 commented 1 year ago

I think the pd_ leading in function names is a bit clunky since you're using pandas accessors so you probably realize that you're using a pandas function already.

Secondly, the naming should probably be consistent with the existing gurobipy function naming. So while I'm a fan of the pythonic snake case of add_vars and add_constrs, that does not match the existing functions addVars and addConstrs.

simonbowly commented 1 year ago

Yes, this was a tricky one. We wound up at pd_add_vars for two reasons:

pd_add_vars or add_vars, while inconsistent with gurobipy, is more consistent with pandas style (sort_values, to_frame, etc)
@Dr-Irv made the point to us earlier that using addVars may cause confusion since the function signatures here are subtly different (the model must be passed as a first argument). Hence the pd prefix to help disambiguate.

On balance, what do you think?

Dr-Irv commented 1 year ago

It depends on where you're coming from. If you're used to the Gurobi API, then you want to use the same symbols. If you are more used to pandas, then the camel case style makes sense.

If you get type checking to work (see my comment https://github.com/Gurobi/gurobipy-pandas/issues/12#issuecomment-1242065039), then I'd prefer the Gurobi naming.

On the other hand, if you are searching the docs, and you see an example, maybe using the pandas style (without the pd_ prefix) is better, because searching for add_vars would bring up the pandas-style docs.

Curious to hear @rrandall1471 's opinion

rrandall1471 commented 1 year ago

I definitely like the pythonic/pandonic naming (add_vars and add_constrs).

If people are used to using gurobipy they already expect to see the addVars or addConstrs type naming so it could be easier for them to convert to this library. However, as @Dr-Irv says above, being able to search on the new names might make it easier to find the right function.

Also, I think adding typing would be a great feature.

simonbowly commented 1 year ago

Now that we have the global functions we plan to fix the naming conventions as follows to keep things consistent.

Standard imports in examples:

import pandas as pd
import gurobipy as gp
from gurobipy import GRB
import gurobipy_pandas as gppd

gppd.add_vars(...)
gppd.add_constrs(...)

gppd seems the most suitable abbreviation. We cannot use gp for gurobipy_pandas (and, say, grb for gurobipy) since gp is our standard abbreviation for gurobipy everywhere else. gpd is another alternative, but seems too much of a typo risk. Concatenating the two well known abbreviations should be easy to remember.

For consistency, use gppd in the accessors as well:

m = gp.Model()

supply_df = (
    region_data
    .gppd.add_vars(m, name="supply", lb="min_supply", ub="max_supply")
    .gppd.add_vars(m, name="price", lb="min_price", ub="max_price")
    .gppd.add_vars(m, name="demand")
)

solution = seriesvar.gppd.X

For all methods in this library, we'll follow python naming conventions i.e. snake case. Unfortunately, we have to be inconsistent with either pandas/python/PEP8 or gurobipy, and it makes more sense to go the modern route.

There may be some confusion for gurobipy natives, but we can consider adding some informative errors to help.

simonbowly commented 1 year ago

Done as above

Dr-Irv commented 1 year ago

Just my opinion here. We have been using import gurobipy as grb in all of our work. Not a fan of the gp usage. One reason is that the "p" in gp stands for "python", but you know you're in python anyway!!

With that abbreviation, then import gurobiby_pandas as grb_pd then makes sense, and it is clear that you are joining Gurobi with pandas.

You could leave the accessor in the pandas objects as grb.

simonbowly commented 1 year ago

Yes, I've noticed that. Users can of course name things however they like, but for us gp is the standard in all our examples and we don't want to create confusion with a rename just in this library.

simonbowly commented 1 year ago

@rluce thoughts on using .grb for series-wise attribute access but keeping .gppd for the construction accessors? It's unusual to have the two name stubs in one library, but they do serve very different purposes.

import gurobipy_pandas as gppd

gppd.add_vars(...)
gppd.add_constrs(...)

# gppd accessors create gurobipy (gp) objects from pandas (pd) objects
df.gppd.add_vars(...)
df.gppd.add_constrs(...)

# grb accessors access gurobi attributes
x.grb.Obj = ...
solution = x.grb.X

simonbowly commented 1 year ago

Still some review needed for when we call down to gurobipy objects. setAttr or set_attr? getValue or get_value?

Current thinking, use snake case, but make sure sensible errors come back by implementing #47 for cases where we are inconsistent

simonbowly commented 1 year ago

Concerned that .grb will ultimately become confusing. It sort of makes sense if the series accessor only refers to gurobi variable and constraint attributes, but we may add other functionality (e.g. get_value & others) which doesn't fit that category. In that case, keeping the accessor .gppd to signify it's a gurobipy_pandas component makes more sense.

Dr-Irv commented 1 year ago

Maybe it should be .grb_pd instead of .gppd

Gurobi / gurobipy-pandas

Function naming consistency #38